PracHub

Design sequential reveal classification and policy

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of sequential partial-observation evaluation, mask-value selection and reward optimization, augmentation strategies for robustness, and early-exit policy design for classifiers under progressively revealed inputs. It tests competencies in model calibration, evaluation metrics, distribution-shift reasoning, and trade-off analysis in the ML System Design domain, at both the conceptual and the practical application level. It is commonly asked because it probes system-level thinking about information sufficiency, metric-driven trade-offs between accuracy and masked information, robustness to masking and augmentation, and the ability to design and calibrate stopping and confidence policies for streaming or cost-sensitive inference pipelines.


Design sequential reveal classification and policy

Company: Jane Street

Role: Software Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Technical Screen

You are given a trained CNN for FashionMNIST and an evaluation notebook.

1) Implement a row-wise reveal evaluation: at step k, the top k rows are visible and the remaining rows are replaced by a fixed mask value m; record the model’s prediction for each k and compute accuracy versus k across the test set. Plot accuracy vs k and explain what the curve tells you about information sufficiency and robustness.

2) Define a reward R for partially revealed images: if the model’s final prediction is correct when you stop, R equals the number of pixels still masked (not yet revealed); otherwise R = 0. Using the accuracy–k results, propose and implement a method to pick a single global mask fill value m that maximizes expected reward over the dataset (e.g., sweep candidate m values, estimate expected reward for each, and select the best). Discuss trade-offs such as class imbalance and distribution shift from masking.

3) Improve the expected reward via training-time augmentation that masks contiguous rows/blocks so the model learns to be accurate with limited visible pixels. Specify the augmentation policy (probability, region size range, fill value), how you would constrain randomness to avoid degenerate cases (e.g., masking almost all pixels), and how you would tune the policy. If limited to only two retraining runs, state the exact two configurations you would try and the metrics you would compare (accuracy-vs-k and expected reward).

4) With the trained model fixed and pixels revealed sequentially at test time (1 pixel, 2 pixels, …, full image), design an early-exit policy that decides when to output to maximize expected reward R. Propose a concrete strategy, such as requiring the argmax class to be stable within a sliding window of the last W steps and/or to exceed a confidence threshold; describe how to set W and the thresholds via offline calibration, and how to handle ties or oscillations.


FashionMNIST: Row-wise Reveal Evaluation, Reward-Optimal Masking, Augmentation, and Early Exit

Context

You have a trained CNN classifier for FashionMNIST. Each image is grayscale, shaped 1×28×28, and normalized (e.g., to [0, 1] or standard z-score). You also have an evaluation notebook where you can run batched inference across the test set.

Assume a row-wise reveal protocol: at step k (0 ≤ k ≤ 28), the top k rows are visible and the remaining rows are replaced by a constant mask fill value m. You will analyze how much information in the early rows suffices for accurate classification, choose a global mask value m to maximize a defined reward, propose a training augmentation policy to improve that reward, and design an early-exit policy when pixels are revealed sequentially at test time.

Tasks

  1. Row-wise reveal evaluation
    • For each k in {0, 1, …, 28}, construct masked images where rows [k, 27] are filled with a scalar m and rows [0, k−1] are kept from the original image.
    • Record the model’s predicted class at each k; compute accuracy versus k across the test set.
    • Plot accuracy vs k and explain what the curve indicates about information sufficiency (how early rows suffice) and robustness to masking.
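The evaluation loop above can be sketched in NumPy. This is a minimal illustration, not the notebook's actual code: `predict` is a stand-in for batched argmax inference with the trained CNN, and all names are assumptions.

```python
import numpy as np

def mask_rows(images, k, m):
    """Replace rows [k, 27] of each image with the scalar fill value m.
    images: float array of shape (N, 28, 28); the input is not modified."""
    out = images.copy()
    out[:, k:, :] = m
    return out

def accuracy_vs_k(images, labels, predict, m=0.0):
    """predict maps a (N, 28, 28) batch to (N,) predicted class ids.
    Returns the test accuracy for each k in 0..28."""
    accs = []
    for k in range(29):
        preds = predict(mask_rows(images, k, m))
        accs.append(float(np.mean(preds == labels)))
    return accs
```

The resulting 29-point curve is what gets plotted: a curve that saturates at small k suggests the early rows already carry enough class information, while a curve that stays near chance until large k indicates sensitivity to the masked region (or to the mask value itself).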
  2. Reward-optimal global mask value m
    • Define reward R for partially revealed images: if you stop revealing at some k and the model’s final prediction is correct, R equals the number of pixels still masked; otherwise R = 0.
    • Using the accuracy–k results, propose and implement a method to pick a single global mask fill value m that maximizes expected reward over the dataset. For example, sweep candidate m values, estimate expected reward for each (under a simple fixed-k stopping policy), and select the best.
    • Discuss trade-offs, including class imbalance and the distribution shift introduced by masking.
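The sweep can be sketched as follows, assuming the accuracy-vs-k curves from task 1 have already been computed for each candidate m. Under a simple fixed-k stopping policy, a correct stop at step k earns the (28 − k) × 28 still-masked pixels; function and variable names here are illustrative.

```python
import numpy as np

def expected_reward_curve(accs):
    """accs[k] = test accuracy when stopping after k revealed rows.
    A correct stop at k earns the (28 - k) * 28 still-masked pixels,
    so expected reward at k is accuracy times that pixel count."""
    return [accs[k] * (28 - k) * 28 for k in range(len(accs))]

def best_mask_value(acc_by_m):
    """acc_by_m: dict mapping candidate fill value m -> accuracy-vs-k list.
    Picks the (m, k) pair maximizing expected reward under a fixed-k
    stopping policy. Returns (best_m, best_k, best_reward)."""
    best_m, best_k, best_r = None, None, -1.0
    for m, accs in acc_by_m.items():
        rewards = expected_reward_curve(accs)
        k = int(np.argmax(rewards))
        if rewards[k] > best_r:
            best_m, best_k, best_r = m, k, rewards[k]
    return best_m, best_k, best_r
```

Note the built-in tension: smaller k means more masked pixels (higher potential reward) but lower accuracy, so the argmax lands wherever the accuracy curve rises fastest relative to the shrinking pixel budget.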
  3. Training-time augmentation to improve expected reward
    • Propose an augmentation that masks contiguous rows/blocks during training so the model learns to be accurate with limited visible pixels.
    • Specify the policy: probability of applying, region size range, fill value; constraints to avoid degenerate cases (e.g., masking almost all pixels), and how you would tune the policy.
    • If limited to only two retraining runs, state the exact two configurations you would try and which metrics you would compare (accuracy-vs-k and expected reward).
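One possible augmentation transform, sketched in NumPy. The probability, size bounds, and fill value shown are placeholder defaults to be tuned, not prescribed settings; in practice the fill would match the m chosen in task 2.

```python
import numpy as np

def row_block_mask(image, p=0.5, min_rows=4, max_rows=20, fill=0.0, rng=None):
    """With probability p, overwrite a contiguous block of rows with `fill`.
    Block height is uniform on [min_rows, max_rows]; capping max_rows well
    below 28 rules out the degenerate near-fully-masked case.
    image: (28, 28) array; returned unmodified when no mask is applied."""
    rng = rng if rng is not None else np.random.default_rng()
    if rng.random() >= p:
        return image
    h = int(rng.integers(min_rows, max_rows + 1))
    top = int(rng.integers(0, 28 - h + 1))
    out = image.copy()
    out[top:top + h, :] = fill
    return out
```

Given only two retraining runs, a reasonable pair of configurations to compare would differ in masking aggressiveness (e.g., p=0.5 with moderate blocks vs. p=0.9 with larger blocks), scored on both the accuracy-vs-k curve and the resulting expected reward.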
  4. Early-exit policy for sequential pixel reveal
    • With the trained model fixed and pixels revealed sequentially at test time (1 pixel, 2 pixels, …, 784 pixels), propose an early-exit policy that decides when to output to maximize expected reward R.
    • Provide a concrete strategy, such as requiring the argmax class to be stable within a sliding window of the last W steps and/or exceed a calibrated confidence threshold.
    • Describe how to set W and thresholds via offline calibration, and how to handle ties or oscillations.
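A sketch of such a stopping rule, assuming a per-step softmax probability vector is available after each reveal (names and defaults are illustrative). W and tau would be chosen offline by sweeping both on a held-out split and keeping the pair that maximizes average reward R.

```python
import numpy as np

def early_exit(prob_history, W=5, tau=0.9):
    """Decide whether to stop, given the per-step class-probability
    vectors observed so far. Exit only if the argmax class is identical
    over the last W steps AND the latest confidence is at least tau;
    otherwise return None to keep revealing. np.argmax breaks exact
    ties toward the lowest class index, so tied steps do not register
    as oscillation."""
    if len(prob_history) < W:
        return None
    recent = prob_history[-W:]
    labels = [int(np.argmax(p)) for p in recent]
    if len(set(labels)) == 1 and float(recent[-1][labels[0]]) >= tau:
        return labels[0]
    return None
```

Oscillating predictions fail the stability check and simply delay the exit; if the budget is exhausted without ever exiting, the policy falls back to outputting the final full-image prediction (reward 0 if wrong, and 0 regardless, since no pixels remain masked).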
