ML Fundamentals Technical Screen — Multi‑part Question
Context: You are given a set of core machine learning topics to address rigorously. For each part, state assumptions, give equations, reason about trade‑offs, and compute requested quantities.
-
Gradient methods
-
Given an empirical risk L(w) = (1/n) ∑_{i=1..n} ℓ_i(w), derive the update rules for:
a) Full‑batch gradient descent (GD)
b) Stochastic gradient descent (SGD) and mini‑batch SGD
-
Compare convergence properties, gradient variance, and wall‑clock efficiency. Explain when SGD outperforms GD.
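The update rules above can be sketched concretely. This is a minimal NumPy illustration on a synthetic least-squares problem (the data, learning rates, and step counts are illustrative assumptions, not part of the question): GD uses the full-batch gradient each step; mini-batch SGD shuffles once per epoch and steps on slices, with b = 1 recovering plain SGD.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def grad(w, idx):
    # Gradient of the mean squared error over the rows in idx.
    Xb, yb = X[idx], y[idx]
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)

def gd(lr=0.1, steps=200):
    w = np.zeros(d)
    for _ in range(steps):
        w -= lr * grad(w, np.arange(n))      # full-batch gradient each update
    return w

def sgd(lr=0.05, epochs=5, b=50):
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)            # reshuffle once per epoch
        for start in range(0, n, b):
            idx = perm[start:start + b]      # mini-batch; b = 1 gives plain SGD
            w -= lr * grad(w, idx)
    return w
```

Both reach the neighborhood of w_true here; the contrast to discuss is that SGD takes many cheap, noisy steps per epoch while GD takes one exact, expensive step.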
-
Batch size and steps
-
Define batch size.
-
With n = 50,000 samples, epochs = 5:
a) For batch size b = 200, compute updates per epoch and total updates.
b) For b = 2,000, compute the new updates per epoch and total updates, and propose a learning‑rate adjustment via the linear scaling rule. Explain when this rule fails or needs modification.
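The expected arithmetic can be checked in a few lines (the base learning rate 0.1 is an illustrative assumption; the linear scaling rule multiplies it by the batch-size ratio and is usually paired with warmup):

```python
def total_updates(n, epochs, b):
    steps_per_epoch = n // b           # assumes b divides n; otherwise use ceil
    return steps_per_epoch, steps_per_epoch * epochs

print(total_updates(50_000, 5, 200))    # (250, 1250)
print(total_updates(50_000, 5, 2_000))  # (25, 125)

# Linear scaling rule: batch grows by k = 2000/200 = 10, so scale lr by 10.
base_lr, k = 0.1, 2_000 / 200
scaled_lr = base_lr * k                 # 1.0 under the assumed base lr
```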
-
Supervised vs. unsupervised
-
Classify each algorithm and give one use case:
logistic regression, SVM, k‑NN, k‑means, PCA, t‑SNE, Isolation Forest.
-
Reinforcement learning and policy gradients
-
Relate RL to supervised and unsupervised learning.
-
Write the REINFORCE gradient ∇θ J = E[∑_t ∇θ log πθ(a_t|s_t) G_t]. Show how subtracting a baseline b_t that does not depend on the action a_t keeps the estimator unbiased while reducing variance.
-
For a length‑3 trajectory with returns G = [3, 1, −1] and score‑function terms g1, g2, g3, use a constant baseline b = mean(G) to express the sample gradient.
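With b = mean(G) = 1, the sample gradient is (3−1)g₁ + (1−1)g₂ + (−1−1)g₃ = 2g₁ − 2g₃. A quick check, using hypothetical 2‑dimensional score-function vectors (the g values below are invented for illustration only):

```python
import numpy as np

G = np.array([3.0, 1.0, -1.0])          # per-step returns
b = G.mean()                             # constant baseline, b = 1
# Hypothetical score-function terms g_t = ∇θ log πθ(a_t|s_t) for a 2-D θ.
g = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

grad = ((G - b)[:, None] * g).sum(axis=0)   # Σ_t (G_t − b) g_t = 2 g1 − 2 g3
```

Note the middle term vanishes entirely: advantages of zero contribute nothing, which is exactly the variance-reduction effect.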
-
Deep RL integrations
-
Explain how neural networks are used in RL (e.g., DQN, policy gradient, actor‑critic).
-
For DQN, describe why target networks and experience replay stabilize training, and name a failure mode without them.
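The two stabilizers can be sketched in a few dozen lines. This is a simplified illustration, not a full DQN: the Q-function is a hypothetical linear model rather than a deep network, and hyperparameters are assumptions. The replay buffer decorrelates updates by sampling uniformly over past transitions; the frozen target network supplies the bootstrap value so the regression target does not chase the online weights.

```python
import random
from collections import deque
import numpy as np

class ReplayBuffer:
    """Uniform experience replay: breaks temporal correlation between updates."""
    def __init__(self, capacity=10_000):
        self.buf = deque(maxlen=capacity)

    def push(self, transition):              # transition = (s, a, r, s_next, done)
        self.buf.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buf, batch_size)

class LinearQ:
    """Hypothetical linear Q-function, Q(s, ·) = W @ s, standing in for a network."""
    def __init__(self, n_states, n_actions, rng):
        self.W = rng.normal(scale=0.1, size=(n_actions, n_states))

    def q(self, s):
        return self.W @ s

def dqn_update(online, target, batch, lr=0.01, gamma=0.99):
    # Semi-gradient step toward the frozen target network's bootstrap value.
    for s, a, r, s_next, done in batch:
        bootstrap = 0.0 if done else gamma * target.q(s_next).max()
        td_error = r + bootstrap - online.q(s)[a]
        online.W[a] += lr * td_error * s     # ∂Q/∂W for the linear model

def sync(online, target):
    target.W = online.W.copy()               # periodic hard update of the target
```

Removing either piece reproduces the classic failure modes: without replay, consecutive correlated transitions drive oscillation; without a target network, the bootstrap target moves with every update and values can diverge.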
-
Transformers vs. RNNs
-
Contrast parallelism, handling of long‑range dependencies, and complexity.
-
For sequence length n = 1024 and model dimension d = 512, estimate asymptotic time and memory costs of self‑attention. Name two techniques that mitigate quadratic scaling.
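The expected estimate follows directly from the shapes: both QKᵀ and the attention-weights-times-V product cost O(n²d), and the score matrix costs O(n²) memory per head. Plugging in the given n and d (counting a multiply-add as two FLOPs, and assuming fp16 storage as an illustration):

```python
n, d = 1024, 512

qkT_flops = 2 * n * n * d              # QKᵀ: ~1.07e9 FLOPs per layer
av_flops = 2 * n * n * d               # attention-weights × V: same cost
score_entries = n * n                  # 1,048,576 scores per head
score_bytes_fp16 = score_entries * 2   # ~2 MiB per head in fp16
```

Mitigations to name include sparse/local attention, low-rank or kernelized approximations (e.g. Linformer, Performer), and IO-aware exact attention (FlashAttention), which avoids materializing the n×n matrix.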
-
Embeddings and polysemy
-
Define embeddings and polysemy.
-
Propose a method to distinguish the word ‘King’ in chess vs. monarchy contexts using contextual encoders or multi‑sense embeddings.
-
Outline one intrinsic evaluation (e.g., word sense disambiguation) and one extrinsic evaluation (e.g., downstream task accuracy).
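One acceptable answer shape: take the contextual vector of the token ‘King’ from each occurrence and assign it to the nearest sense prototype by cosine similarity. The sketch below uses tiny hand-made vectors as stand-ins; in practice they would come from a contextual encoder such as BERT (e.g. the hidden state at the token's position), and the 3-dimensional values here are purely illustrative.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical contextual vectors for 'King'; in practice, encoder outputs.
king_chess = np.array([0.9, 0.1, 0.0])   # prototype from chess sentences
king_crown = np.array([0.1, 0.9, 0.1])   # prototype from monarchy sentences
king_query = np.array([0.8, 0.2, 0.1])   # 'the king is in check'

# Nearest sense prototype disambiguates the query occurrence.
sense = max([("chess", cosine(king_query, king_chess)),
             ("monarchy", cosine(king_query, king_crown))],
            key=lambda t: t[1])[0]
```

The intrinsic evaluation then measures sense-assignment accuracy against labeled WSD data; the extrinsic one swaps the embeddings into a downstream task and compares accuracy.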
-
Low‑compute fine‑tuning plan (7B model, single 24‑GB GPU)
-
Design a low‑compute fine‑tuning approach (e.g., QLoRA or adapters, 4‑bit quantization, gradient checkpointing, mixed precision).
-
Choose a LoRA rank r and specify batch size, sequence length, optimizer, and learning‑rate schedule.
-
Provide a back‑of‑the‑envelope estimate of trainable parameters using hidden size ≈ 4096 and ~32 layers. State assumptions about which projections you adapt.
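A reference back-of-the-envelope, under the stated assumptions plus two of our own (rank r = 16, adapting only the q and v projections, each a square d×d matrix): every adapted matrix adds an A (d×r) and a B (r×d) factor, i.e. 2dr parameters.

```python
d, layers, r = 4096, 32, 16
adapted_per_layer = 2                  # assumption: q_proj and v_proj only
params_per_matrix = 2 * d * r          # A: d×r plus B: r×d
trainable = layers * adapted_per_layer * params_per_matrix
print(trainable)                       # 8,388,608 ≈ 8.4M, ~0.12% of 7B
```

Adapting more projections (k, o, MLP) or raising r scales this count linearly, which is the knob to discuss against the 24-GB memory budget.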