You’re given a 6‑hour take‑home to “predict a product’s target users,” but you estimate a thorough solution needs 24+ hours. Produce a written plan (≤1 page) that you would send to the hiring manager and business partner before starting. Include: (1) crisp problem statement, success criteria, and explicit non‑goals; (2) a prioritized work breakdown with specific deliverables you will complete in 6 hours and what you will defer; (3) the minimum data you need and the clarifying questions you would ask up front; (4) risks (e.g., label ambiguity, leakage, class imbalance) and your mitigations; (5) the exact storyboard of your presentation (titles of slides and key figures); and (6) how you will defend trade‑offs in Q&A if challenged on not exploring an attractive but out‑of‑scope idea (e.g., feature learning or causal inference). Be specific about how you will demonstrate impact with limited time, and how you will measure whether your chosen scope was correct after the fact.

# 6‑Hour Plan: Predict Product Target Users (for pre‑alignment)

## 1) Problem statement, success criteria, non‑goals

- Problem: Predict which accounts are likely to become “target users” of Product P in the next 90 days to prioritize GTM outreach. Label = account activates Product P by meeting a clear threshold (e.g., feature flag ON + 3 successful events) within 90 days of a reference date.
- Objective: Produce a baseline, interpretable model and a top‑K ranked list with measurable lift over a simple heuristic.
- Success criteria (6‑hr scope):
  - Offline: AUC ≥ 0.70 and Lift@Top10% ≥ 2.0 over a naive baseline (e.g., recent activity rank). Also report PR‑AUC and calibration.
  - Business proxy: For Top 1,000 accounts, Precision@K ≥ 3× random; estimated incremental conversions and revenue using historical base rates.
- Deliverables: Notebook, 8–10 slide deck, top‑K list with rationale, risk checks, backlog.
- Non‑goals (explicitly out of scope in 6 hours): Deep feature learning, causal inference/uplift modeling, extensive hyperparameter tuning, productionization, AB test launch, complex segmentation research.

## 2) Prioritized work breakdown (6 hours) and deferments

- 0:00–0:30 Align + data access check
  - Deliverable: 5‑line written alignment on label, horizon, action owner, top‑K size, baseline comparator.
- 0:30–1:30 Data audit + label construction
  - Deliverable: Reproducible labeling logic; temporal split (train: older periods; test: most recent window).
- 1:30–3:00 Baseline model(s)
  - Approach: Logistic regression and gradient‑boosted trees (shallow). Features: recency/frequency, account age, size, prior product engagements, vertical, simple aggregates over 7/30/90 days. Class weighting for imbalance.
  - Deliverable: Trained models; cross‑validated metrics; guardrail checks.
- 3:00–4:00 Evaluation + interpretability
  - Deliverable: ROC/PR curves, Lift@K, calibration plot, top feature importances/coefficients, partial dependence on top drivers.
- 4:00–5:00 Business translation
  - Deliverable: Top‑K list with expected positives = p̂ × K; revenue back‑of‑envelope; comparison to heuristic; sensitivity by segment.
- 5:00–6:00 Slides + handoff
  - Deliverable: Slide deck, risks/mitigations, next‑step backlog with ROI; reproducibility notes.
- Defer to 24+ hours: Rich feature engineering (embeddings, sequences), hyperparameter sweeps, leakage‑robust feature store, uplift/causal analysis, production scoring pipeline, monitoring, fairness deep‑dive, multi‑model ensembling, segmentation discovery.

## 3) Minimum data + clarifying questions

- Minimum data
  - Accounts: id, create_date, segment/vertical, size proxies, region.
  - Product P events/flags: activation flag with timestamp, usage counts by day, key milestones.
  - Activity: core platform events aggregated by day (7/30/90‑day windows).
  - Outcomes: target label within 90 days from reference date.
  - Exclusions: indicators of future knowledge to avoid leakage (post‑label fields), marketing/sales touches with timestamps to optionally exclude.
  - Period coverage: ≥ 12 months to allow temporal split and seasonality.
- Clarifying questions
  - Define “target user”: exact activation threshold; edge cases (trial, partial enablement).
  - Horizon and cadence: 90‑day OK? How often will scores be used? Who acts on them and how many accounts can they handle weekly?
  - Baselines: What heuristic do you use today? What is current conversion rate and revenue per conversion?
  - Constraints: Any legal/privacy limits? Any must‑include or must‑exclude fields?
  - Success: Preferred decision threshold or list size? Any critical segments to highlight (e.g., enterprise vs SMB)?

## 4) Key risks and mitigations

- Label ambiguity: Co‑define activation threshold; run sensitivity on thresholds; document final choice.
- Leakage: Strict temporal split; drop features updated after reference date; exclude post‑contact marketing/sales touches.
- Class imbalance: Class weights or focal loss; evaluate PR‑AUC and Lift@K; calibrate probabilities (Platt/isotonic).
- Selection bias: Ensure negatives include both active and dormant accounts; compare segment‑wise metrics.
- Drift/seasonality: Use most recent window as test; report by month; plan for periodic retrain.
- ID issues: Deduplicate accounts; consistent ID mapping if multiple identifiers exist.
- Missingness: Simple imputation with missingness indicators; sanity checks on extreme values.

## 5) Presentation storyboard (titles + key figures)

- Slide 1: Problem, scope, and decision we enable (one‑pager)
- Slide 2: Definitions, label, and timeline (label schematic)
- Slide 3: Data and guardrails (temporal split diagram; leakage controls)
- Slide 4: Modeling approach (feature groups; simple model stack)
- Slide 5: Performance summary (ROC/PR; Lift@Top10%)
- Slide 6: Business impact (Top‑K precision; expected wins; revenue estimate)
- Slide 7: What drives predictions (top features; partial dependence)
- Slide 8: Segment sensitivity and
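
The success criteria in the plan lean on Precision@K, Lift@K, and an expected-positives estimate for the top‑K list. As a minimal sketch of how those numbers could be computed, assuming NumPy is available: the function names, the toy labels, and both score vectors below are hypothetical illustrations, not part of the original plan. The expected-positives figure here sums the predicted probabilities over the top K, which is only meaningful if the probabilities are calibrated.

```python
import numpy as np


def precision_at_k(y_true, scores, k):
    """Share of actual positives among the k highest-scored accounts."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    top_k = np.argsort(-scores, kind="stable")[:k]
    return float(y_true[top_k].mean())


def lift_at_k(y_true, scores, k):
    """Precision@K divided by the overall base rate (lift over random)."""
    base_rate = float(np.mean(y_true))
    return precision_at_k(y_true, scores, k) / base_rate


def lift_over_baseline(y_true, model_scores, baseline_scores, k):
    """Precision@K of the model relative to a heuristic ranking."""
    baseline = precision_at_k(y_true, baseline_scores, k)
    if baseline == 0.0:
        return float("inf")  # heuristic found no positives in its top k
    return precision_at_k(y_true, model_scores, k) / baseline


def expected_positives(scores, k):
    """Sum of predicted probabilities over the top k (assumes calibration)."""
    scores = np.asarray(scores, dtype=float)
    return float(np.sort(scores)[::-1][:k].sum())


# Toy data (made up): 10 accounts, 3 of which activated.
y_true = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
p_hat = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.05, 0.6, 0.15, 0.1]
# Hypothetical stand-in for a "recent activity rank" heuristic.
heuristic = [0.9, 0.8, 0.1, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.0]
k = 4

print("Precision@K:", precision_at_k(y_true, p_hat, k))
print("Lift@K vs random:", lift_at_k(y_true, p_hat, k))
print("Lift@K vs heuristic:", lift_over_baseline(y_true, p_hat, heuristic, k))
print("Expected positives in top K:", expected_positives(p_hat, k))
```

Reporting lift against both random and the existing heuristic keeps the comparison honest: a model can beat random comfortably while adding little over what the team already does.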

What difficulty level is this interview question?

This is a medium difficulty Behavioral & Leadership question, commonly asked during Technical Screen rounds at Stripe.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Stripe during technical interviews.

Scope an open‑ended take‑home under constraints

This question evaluates a candidate's competency in scoping and prioritizing a data‑science take‑home, stakeholder communication, risk identification, and trade‑off defense under strict time constraints.

Take‑Home Planning Prompt: Predict Target Users in 6 Hours

Context

You have a 6‑hour take‑home assignment to plan how you would predict a product’s target users. You believe a thorough solution would take 24+ hours, so you must scope for maximum impact within 6 hours and communicate what you will deliver and what you will defer.

Task

Produce a written plan (≤1 page) that you would send to the hiring manager and business partner before starting. Include:

1. A crisp problem statement, success criteria, and explicit non‑goals.
2. A prioritized work breakdown with specific deliverables you will complete in 6 hours and what you will defer.
3. The minimum data you need and the clarifying questions you would ask up front.
4. Risks (e.g., label ambiguity, leakage, class imbalance) and your mitigations.
5. The exact storyboard of your presentation (titles of slides and key figures).
6. How you will defend trade‑offs in Q&A if challenged on not exploring an attractive but out‑of‑scope idea (e.g., feature learning or causal inference).

Be specific about how you will demonstrate impact with limited time, and how you will measure whether your chosen scope was correct after the fact.
