Compare CNNs, RNNs, and LSTMs rigorously
Company: Microsoft
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Onsite
Compare CNNs, RNNs, and LSTMs rigorously for sequence modeling.
Answer all parts:
1) Inductive biases and use-cases: When would you prefer a 1D dilated CNN over an RNN/LSTM for time series? When does an LSTM strictly dominate a vanilla RNN?
2) Vanishing/exploding gradients: Write the recurrence for a vanilla RNN hidden state and explain why gradients vanish or explode. Then write the LSTM gate equations (input, forget, output, cell) and explain how they mitigate the issue via additive paths and gating. (Reference equations are sketched after this list.)
3) Parameter/computation comparison: For input of shape (batch=32, time=100, features=64), compute parameter counts for: (a) a 1D CNN with 128 filters, kernel size 3, stride 1, with bias terms and no factorization or sharing tricks; (b) a single-layer unidirectional GRU with 128 hidden units; (c) a single-layer unidirectional LSTM with 128 hidden units. Show formulas and totals, and comment on parallelism and latency implications. (A worked count is sketched after the Quick Answer.)
4) Experimental design: You have only 50k labeled sequences and strict latency (<5 ms per sample). Propose an ablation plan to choose among the above models, including regularization, data augmentation, and early stopping criteria. Define primary metrics and stopping rules. (A sketch of one stopping rule follows the worked count below.)
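For part 2, a standard reference formulation in conventional notation (the weight and bias names are generic, not taken from a specific source):

```latex
% Vanilla RNN recurrence: backprop through time repeatedly multiplies by
% W_{hh}, so gradients shrink or grow geometrically over long horizons.
h_t = \tanh\left(W_{xh} x_t + W_{hh} h_{t-1} + b_h\right)
\qquad
\frac{\partial h_T}{\partial h_t}
  = \prod_{k=t+1}^{T} \operatorname{diag}\!\left(1 - h_k^2\right) W_{hh}

% LSTM gates ([h_{t-1}, x_t] denotes concatenation, \odot elementwise product):
\begin{aligned}
f_t &= \sigma\!\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget)}\\
i_t &= \sigma\!\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input)}\\
o_t &= \sigma\!\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output)}\\
\tilde{c}_t &= \tanh\!\left(W_c [h_{t-1}, x_t] + b_c\right) && \text{(cell candidate)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t\\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}

% Holding the gates fixed, \partial c_t / \partial c_{t-1} = diag(f_t):
% the additive cell path lets gradients pass ungated through time when
% f_t is near 1, instead of being forced through a squashing Jacobian.
```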
Quick Answer: This question evaluates core sequence-modeling competencies: comparative understanding of CNN, RNN, and LSTM inductive biases; gradient dynamics and gating mechanisms; parameter and computational trade-offs; and constrained experimental design, all within the deep-learning-for-time-series area of Machine Learning.
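As a companion to part 3, a minimal worked count in Python. It assumes one bias vector per gate; frameworks such as PyTorch keep separate b_ih and b_hh vectors, which adds h extra parameters per gate. Note that batch=32 and time=100 affect activations and FLOPs, not parameter counts:

```python
x = 64   # input features per timestep
h = 128  # hidden units / number of conv filters
k = 3    # 1D conv kernel size

# (a) 1D CNN: each of the 128 filters spans k timesteps x 64 input
# channels, plus one bias per filter.
cnn = (k * x + 1) * h            # (3*64 + 1) * 128 = 24,704

# (b) GRU: 3 gates (update, reset, candidate), each with an
# input-to-hidden matrix, a hidden-to-hidden matrix, and a bias.
gru = 3 * (h * x + h * h + h)    # 3 * 24,704 = 74,112

# (c) LSTM: 4 gates (input, forget, output, cell candidate),
# same per-gate structure as the GRU.
lstm = 4 * (h * x + h * h + h)   # 4 * 24,704 = 98,816

print(f"CNN:  {cnn:,}")          # 24,704
print(f"GRU:  {gru:,}")          # 74,112
print(f"LSTM: {lstm:,}")         # 98,816
```

On parallelism: the CNN computes all 100 timesteps at once, while the GRU/LSTM recurrences are sequential in time, which usually matters more for latency than the raw parameter counts above.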
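For part 4, one concrete way to pin down "stopping rules" is a patience-based criterion on validation loss. This is a minimal sketch; the function name and the patience/min_delta values are illustrative assumptions, not prescribed by the question:

```python
def should_stop(val_losses, patience=10, min_delta=1e-3):
    """Stop when the best loss in the last `patience` evaluations
    fails to beat the best loss before that window by `min_delta`."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return recent_best > best_before - min_delta

# Example: a validation curve that plateaus triggers the rule.
history = [0.90, 0.70, 0.60, 0.55, 0.55] + [0.55] * 10
print(should_stop(history))  # True
```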