Implement K-means and handle train-inference mismatch
Company: Waymo
Role: Data Scientist
Category: Machine Learning
Difficulty: easy
Interview Round: Technical Screen
## Part A — K-means (implementation + concepts)
You are given a dataset \(X \in \mathbb{R}^{n \times d}\) and an integer \(k\).
1. **Explain K-means**: what objective it optimizes and the alternating optimization procedure.
2. **Implement K-means (Lloyd’s algorithm)** (see the sketch after this list):
- Initialize \(k\) centroids.
- Repeat until convergence / max iterations:
- Assign each point to its nearest centroid.
- Recompute each centroid as the mean of points assigned to it.
- Return final centroids and assignments.
3. **Improve initialization**: describe and implement a better initialization strategy than random init (i.e., **K-means++**).
Also clarify:
- How you would handle empty clusters
- What stopping criteria you would use
- The time complexity of the algorithm
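As one way the pieces of items 2 and 3 could fit together, here is a minimal NumPy sketch; the function names, tolerance, and empty-cluster policy are illustrative choices rather than part of the prompt:

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """K-means++ seeding: pick each new centroid with probability proportional
    to the squared distance from the nearest centroid chosen so far."""
    n = X.shape[0]
    centroids = [X[rng.integers(n)]]
    for _ in range(k - 1):
        d2 = ((X[:, None, :] - np.asarray(centroids)[None, :, :]) ** 2).sum(-1).min(axis=1)
        centroids.append(X[rng.choice(n, p=d2 / d2.sum())])
    return np.asarray(centroids)

def kmeans(X, k, max_iters=100, tol=1e-6, seed=0):
    """Lloyd's algorithm with K-means++ initialization.

    Stopping criteria: centroid shift below `tol`, or `max_iters` reached.
    Empty clusters: re-seeded at the point farthest from its assigned centroid.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = kmeans_pp_init(X, k, rng)
    for _ in range(max_iters):
        # Assignment step: O(n * k * d) distance computations per iteration.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (n, k)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
            else:
                # Empty cluster: re-seed at the point farthest from its centroid.
                new_centroids[j] = X[dists.min(axis=1).argmax()]
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    # Final assignments with the converged centroids.
    labels = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1).argmin(axis=1)
    return centroids, labels
```

The assignment step dominates at roughly \(O(nkd)\) per iteration; K-means++ seeding costs about one extra assignment pass up front but typically reduces the number of iterations and the risk of converging to a poor local minimum.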
---
## Part B — Multi-agent trajectory prediction (Waymo-like)
You are building a model to predict the positions of a **target agent** (e.g., another car near the ego vehicle) over the **next 2 time steps**. For each training example you have:
- Past trajectory of the target agent over \(T\) steps: \((x_t, y_t)\) for \(t = 1, \dots, T\)
- Past trajectories for nearby agents (variable number \(M\))
- Map / environment context (e.g., lane polylines, traffic signals), optionally rasterized or vectorized
- Ground-truth future trajectory for the target agent for the next 2 steps
### Questions
1. **Modeling**: Propose an ML approach to predict the next 2 positions. Specify:
- Input representation (agent features, relative coordinates, map encoding)
- Architecture (e.g., RNN/Transformer, GNN over agents, encoder-decoder)
- Output parameterization (deterministic points vs probabilistic distribution; multimodal vs unimodal)
- Loss function(s) and evaluation metrics (e.g., ADE/FDE, NLL; see the metric sketch after these questions)
2. **Multi-head attention (MHA)**: Explain what MHA is doing in this setting and why it helps.
3. **Autoregressive training vs. inference mismatch**: Suppose you train an autoregressive decoder with **teacher forcing** and a **causal mask** (each step attends only to previous steps), so the model conditions on ground-truth previous positions during training. At inference it must condition on its **own previous predictions**, which causes compounding error.
How would you modify training and/or the learning objective to better match inference-time behavior? Discuss at least two approaches and their tradeoffs.
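For the evaluation side of question 1, ADE and FDE are straightforward to compute once predicted and ground-truth futures are available. A minimal sketch, assuming arrays of shape `(batch, horizon, 2)` with a horizon of 2 steps here:

```python
import numpy as np

def ade_fde(pred, gt):
    """Average and Final Displacement Error.

    pred, gt: (batch, horizon, 2) arrays of predicted / ground-truth (x, y)
    positions; for this problem the horizon is 2 steps.
    """
    disp = np.linalg.norm(pred - gt, axis=-1)  # per-step Euclidean error, (batch, horizon)
    ade = disp.mean()                          # averaged over examples and steps
    fde = disp[:, -1].mean()                   # error at the final predicted step
    return ade, fde
```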
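For question 3, one standard answer is scheduled sampling: during training, feed the decoder its own previous prediction (instead of the ground truth) with some probability that is annealed upward over training. A minimal PyTorch sketch, where `decoder`, `context`, and `last_observed` are hypothetical stand-ins for your decoder step, encoded scene features, and the last observed position:

```python
import torch

def rollout_with_scheduled_sampling(decoder, context, last_observed, gt_future, sampling_prob):
    """Two-step autoregressive rollout with scheduled sampling.

    decoder(context, prev) -> predicted (x, y) for the next step  [hypothetical interface]
    gt_future: (batch, 2, 2) ground-truth future positions.
    sampling_prob: probability of feeding back the model's own prediction;
    anneal from 0 (pure teacher forcing) toward 1 over training.
    """
    prev = last_observed
    preds = []
    for t in range(gt_future.shape[1]):
        pred_t = decoder(context, prev)
        preds.append(pred_t)
        # With probability sampling_prob, condition the next step on the model's
        # own (detached) prediction so training looks more like inference.
        if torch.rand(()) < sampling_prob:
            prev = pred_t.detach()
        else:
            prev = gt_future[:, t]
    return torch.stack(preds, dim=1)  # (batch, 2, 2); train with ADE/NLL-style losses
```

The trade-off: scheduled sampling reduces exposure bias but trains on an input distribution that is neither fully ground truth nor fully model-generated, and the annealing schedule matters. An alternative, natural for a 2-step horizon, is to drop autoregression and predict both steps jointly in one shot (or train on full rollouts with a multi-step loss), which removes the mismatch at the cost of modeling step-to-step dependencies less explicitly. Whether to detach the fed-back prediction or backpropagate through it is a further design choice.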
Quick Answer: This question evaluates both conceptual understanding and hands-on competency in two areas. Part A covers unsupervised clustering: the K-means objective and its alternating optimization, initialization strategies (K-means++), empty-cluster handling, stopping criteria, and computational complexity. Part B covers sequence modeling for multi-agent trajectory prediction: input representation, attention/GNN architectures, deterministic vs. probabilistic outputs, evaluation metrics, and the exposure bias introduced by autoregressive training with teacher forcing. Together, the two parts test optimization and scalability trade-offs as well as model-design and train/inference-mismatch considerations, at both a conceptual and a practical level.