FashionMNIST: Row-wise Reveal Evaluation, Reward-Optimal Masking, Augmentation, and Early Exit
Context
You have a trained CNN classifier for FashionMNIST. Each image is grayscale, shaped 1×28×28, and normalized (e.g., to [0, 1] or standard z-score). You also have an evaluation notebook where you can run batched inference across the test set.
Assume a row-wise reveal protocol: at step k (0 ≤ k ≤ 28), the top k rows are visible and the remaining 28 − k rows are replaced by a constant mask fill value m. You will (1) analyze how much information the early rows carry for accurate classification, (2) choose a global mask value m that maximizes a defined reward, (3) propose a training-time augmentation policy to improve that reward, and (4) design an early-exit policy for when pixels are revealed sequentially at test time.
Tasks
- Row-wise reveal evaluation
  - For each k in {0, 1, …, 28}, construct masked images where rows [k, 27] are filled with the scalar m and rows [0, k−1] are kept from the original image.
  - Record the model’s predicted class at each k; compute accuracy versus k across the test set.
  - Plot accuracy vs k and explain what the curve indicates about information sufficiency (how many early rows suffice) and robustness to masking.
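The evaluation above can be sketched as follows. This is a minimal NumPy version, assuming images are arrays of shape (N, 1, 28, 28) and `predict` is a hypothetical callable wrapping your model's batched inference (returning a class id per image); adapt to your framework's tensors as needed.

```python
import numpy as np

def mask_rows(images, k, m):
    """Keep the top k rows; fill rows [k, 27] with the scalar m.

    images: array of shape (N, 1, 28, 28). Returns a masked copy.
    """
    out = images.copy()
    out[:, :, k:, :] = m
    return out

def accuracy_vs_k(images, labels, predict, m):
    """Sweep reveal steps k = 0..28 and record test accuracy at each.

    predict: callable mapping an (N, 1, 28, 28) batch to (N,) class ids.
    Returns a list of 29 accuracies, one per k.
    """
    accs = []
    for k in range(29):
        preds = predict(mask_rows(images, k, m))
        accs.append(float((preds == labels).mean()))
    return accs
```

The resulting list can be plotted directly (accuracy on the y-axis, k on the x-axis); k = 0 gives the fully masked baseline and k = 28 the unmasked test accuracy.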
- Reward-optimal global mask value m
  - Define reward R for partially revealed images: if you stop revealing at step k and the model’s final prediction is correct, R equals the number of pixels still masked, i.e., R = 28 · (28 − k); otherwise R = 0.
  - Using the accuracy-vs-k results, propose and implement a method to pick a single global mask fill value m that maximizes expected reward over the dataset. For example, sweep candidate m values, estimate expected reward for each (under a simple fixed-k stopping policy), and select the best.
  - Discuss trade-offs, including class imbalance and the distribution shift introduced by masking.
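One way to implement the sweep, sketched under the assumption that you have already measured an accuracy table `acc_table[m][k]` (test accuracy at fill value m and reveal step k, e.g. from repeated runs of the row-wise evaluation above). Under a fixed-k stopping policy, the expected reward at step k is simply P(correct at k) times the number of still-masked pixels:

```python
ROWS, COLS = 28, 28

def expected_reward(acc_at_k, k):
    """Expected reward for stopping at fixed step k:
    P(correct) * number of still-masked pixels."""
    return acc_at_k * (ROWS - k) * COLS

def best_mask_value(candidate_ms, acc_table):
    """Pick the fill value m maximizing expected reward.

    acc_table[m][k]: measured accuracy with fill value m at reveal step k.
    For each m, take the best fixed-k stopping step, then return the m
    with the highest resulting expected reward (and that reward).
    """
    best_m, best_r = None, -1.0
    for m in candidate_ms:
        r = max(expected_reward(acc_table[m][k], k) for k in range(ROWS + 1))
        if r > best_r:
            best_m, best_r = m, r
    return best_m, best_r
```

Note the built-in tension this objective exposes: larger rewards require stopping early (small k), where accuracy is lowest, so the optimal m is the one whose accuracy curve rises earliest, not necessarily the one with the best full-image accuracy.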
- Training-time augmentation to improve expected reward
  - Propose an augmentation that masks contiguous rows/blocks during training so the model learns to be accurate with limited visible pixels.
  - Specify the policy: application probability, region-size range, and fill value; constraints to avoid degenerate cases (e.g., masking almost all pixels); and how you would tune the policy.
  - If limited to only two retraining runs, state the exact two configurations you would try and which metrics you would compare (accuracy-vs-k and expected reward).
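A possible shape for such an augmentation, as a per-image NumPy transform. The parameter values (`p`, `min_rows`, `max_rows`) are illustrative defaults, not tuned choices; the `max_rows` cap is the anti-degeneracy constraint, keeping at least half of the image visible:

```python
import numpy as np

def random_row_mask(image, rng, p=0.5, min_rows=2, max_rows=14, fill=0.0):
    """With probability p, fill a random contiguous block of rows with `fill`.

    Capping the block at max_rows (here half the image) avoids the
    degenerate case where almost all pixels are masked.
    image: (1, 28, 28) array. Returns a (possibly) augmented copy.
    """
    out = image.copy()
    if rng.random() < p:
        n = int(rng.integers(min_rows, max_rows + 1))
        start = int(rng.integers(0, 28 - n + 1))
        out[:, start:start + n, :] = fill
    return out
```

In a training loop this would be applied after normalization, with `fill` set to the same mask value m used at evaluation time so train- and test-time masking statistics match.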
- Early-exit policy for sequential pixel reveal
  - With the trained model fixed and pixels revealed sequentially at test time (1 pixel, 2 pixels, …, 784 pixels), propose an early-exit policy that decides when to output a prediction so as to maximize expected reward R.
  - Provide a concrete strategy, such as requiring the argmax class to be stable within a sliding window of the last W steps and/or exceed a calibrated confidence threshold.
  - Describe how to set W and thresholds via offline calibration, and how to handle ties or oscillations.
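The stability-plus-confidence strategy can be sketched as below. It assumes `prob_stream` yields one per-step probability vector (e.g., softmax output) per revealed pixel; the defaults `window=20` and `threshold=0.9` are placeholders to be replaced by values found via offline calibration on a held-out set:

```python
from collections import deque
import numpy as np

def early_exit(prob_stream, window=20, threshold=0.9):
    """Decide when to stop given a stream of per-step class probabilities.

    prob_stream: iterable of (num_classes,) probability vectors, one per
    revealed pixel. Exit when the argmax class has been identical over the
    last `window` steps AND its current probability exceeds `threshold`;
    an oscillating argmax keeps the window impure, so we keep waiting.
    Returns (step_index, predicted_class); falls back to the final
    step's prediction if the criterion is never met.
    """
    recent = deque(maxlen=window)
    last = None
    for t, probs in enumerate(prob_stream):
        c = int(np.argmax(probs))
        recent.append(c)
        last = (t, c)
        if len(recent) == window and len(set(recent)) == 1 and probs[c] > threshold:
            return t, c
    return last
```

For calibration, one would grid-search (W, threshold) pairs offline, replaying recorded probability streams from a validation split and picking the pair that maximizes mean reward R; ties in the argmax can be broken by the lowest class index (as `np.argmax` does) or by deferring the exit one more step.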