Describe ownership in ambiguous, messy data work
Company: Citi
Role: Data Scientist
Category: Behavioral & Leadership
Difficulty: medium
Interview Round: Technical Screen
Behavioral/ownership interview questions for a remote Product/Risk Data Scientist role:
1. Tell me about a time **data changed a decision**. What was the decision, what analysis did you run, and how did you influence stakeholders?
2. Tell me about a time when **data quality was bad** (missing, inconsistent, or untrustworthy). What did you do, and how did you prevent recurrence?
3. How do you **prioritize ambiguous requests** from PM/Eng/Legal when there is no clear owner and timelines are tight?
Answer with concrete examples, emphasizing independence, judgment, and cross-functional collaboration.
Quick Answer: This question evaluates ownership, decision-making with imperfect data, data-quality triage, and cross-functional prioritization skills within the Behavioral & Leadership category for Data Science roles.
Solution
## 1) “A time data changed a decision” (what good looks like)
Use a STAR-style narrative (Situation, Task, Action, Result) with clear business impact.
- **Situation/Task:** Define the decision context (e.g., launch a risk rule, change onboarding, adjust limits).
- **Action (analysis):**
  - Clarify the metric definitions and the counterfactual.
  - Use the right tool: an experiment (preferred), a quasi-experiment such as difference-in-differences (DiD) or propensity score matching (PSM), or an observational analysis with strong caveats.
  - Show how you handled confounding (seasonality, channel mix, geo policy changes).
  - Quantify uncertainty: confidence intervals, sensitivity checks.
- **Result:** Decision changed (ship/no-ship/targeted rollout). Quantify impact (fraud down X%, activation up Y%, reviews down Z%).
- **Influence:** Explain how you aligned PM/Eng/Legal: a pre-read doc, a clear recommendation, explicit tradeoffs, and an execution plan.
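If the interviewer asks how a quasi-experiment works, it helps to be able to sketch one. The snippet below is a minimal difference-in-differences estimate on group means. The function name `did_estimate` and all of the weekly rates are hypothetical, and it assumes the treated and control groups would have followed parallel trends without the change.

```python
from statistics import mean


def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Difference-in-differences on group means.

    Subtracts the control group's before/after change from the treated
    group's change. Only meaningful under the parallel-trends assumption.
    """
    treat_change = mean(treat_post) - mean(treat_pre)
    ctrl_change = mean(ctrl_post) - mean(ctrl_pre)
    return treat_change - ctrl_change


# Hypothetical weekly conversion rates in percent (illustrative only).
treated_pre = [4.0, 4.2, 4.1]
treated_post = [5.0, 5.1, 4.9]
control_pre = [3.9, 4.0, 4.1]
control_post = [4.2, 4.1, 4.3]

effect_pp = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"Estimated effect: {effect_pp:.2f} percentage points")
```

In a real answer, pair the point estimate with a pre-period trend check and an interval, since the raw difference says nothing about uncertainty.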
Example structure:
- “We planned to relax a restriction to increase conversions. My analysis showed conversions would rise by 0.8pp, but chargebacks would also rise by 0.3pp, concentrated in one channel. We launched only for low-risk cohorts and added a step-up check for that channel, preserving most of the upside while containing risk.”
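A percentage-point shift like the one in this example is more convincing with an interval around it. The following is a rough sketch of a normal-approximation (Wald) confidence interval for a difference in two proportions, using only the standard library. The helper name `diff_in_proportions_ci` and the counts are made up for illustration, and the approximation assumes large samples and independent users.

```python
from math import sqrt
from statistics import NormalDist


def diff_in_proportions_ci(x_a, n_a, x_b, n_b, level=0.95):
    """Wald interval for p_b - p_a given successes x and sample sizes n."""
    p_a = x_a / n_a
    p_b = x_b / n_b
    diff = p_b - p_a
    # Standard error of the difference of two independent proportions.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical counts: 5.0% conversion in control, 5.8% in treatment.
diff, lower, upper = diff_in_proportions_ci(1000, 20000, 1160, 20000)
print(f"Lift: {diff * 100:.2f}pp (95% CI {lower * 100:.2f} to {upper * 100:.2f}pp)")
```

Running the same calculation per channel is one way to surface a concentrated effect like the chargeback increase described above.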
## 2) “When data quality was bad”
Interviewers want judgment + process, not heroics.
What to cover:
1. **Triage:** assess severity and identify the failure type (logging gap, ETL bug, or definition mismatch).
2. **Mitigation:**
   - Use alternative sources (raw event logs, ledger tables) and triangulate.
   - Communicate uncertainty; avoid false precision.
3. **Root cause & prevention:**
   - Add data tests (schema checks, freshness, volume/anomaly detection).
   - Define a single source of truth for each metric (a metric spec).
   - Partner with Eng to fix instrumentation; add dashboards/alerts.
4. **Postmortem:** document what happened and change the process (runbooks, ownership).
A strong answer explicitly states when you *stopped* analysis because data couldn’t support it and what you did next.
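To make the prevention step concrete, here is a minimal sketch of the three kinds of data tests mentioned above (schema, freshness, volume) in plain Python. The column names, thresholds, and function names are all assumptions for illustration; in practice these checks would usually live in whatever data-testing tool the team already uses.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical schema for an events table.
EXPECTED_COLUMNS = {"event_id", "user_id", "amount", "event_ts"}


def check_schema(rows, expected=EXPECTED_COLUMNS):
    """Return indices of rows whose keys differ from the expected columns."""
    return [i for i, row in enumerate(rows) if set(row) != expected]


def check_freshness(rows, now, max_lag=timedelta(hours=6)):
    """True if the newest event timestamp is within max_lag of now."""
    timestamps = [row["event_ts"] for row in rows if "event_ts" in row]
    if not timestamps:
        return False
    return now - max(timestamps) <= max_lag


def check_volume(today_count, history, tolerance=0.5):
    """True if today's row count is within +/- tolerance of the historical mean."""
    baseline = sum(history) / len(history)
    return abs(today_count - baseline) <= tolerance * baseline


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
rows = [
    {"event_id": 1, "user_id": "u1", "amount": 10.0, "event_ts": now - timedelta(hours=1)},
    {"event_id": 2, "user_id": "u2", "event_ts": now - timedelta(hours=2)},  # missing amount
]

schema_failures = check_schema(rows)
is_fresh = check_freshness(rows, now)
volume_ok = check_volume(len(rows), history=[100, 110, 90])
print(schema_failures, is_fresh, volume_ok)
```

The fixed tolerance band is a deliberately crude stand-in for anomaly detection; the point in an interview is that each check maps to an alert with a named owner.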
## 3) “How do you prioritize ambiguous requests?”
Provide a repeatable framework:
1. **Clarify objective and decision:** “What decision will this change, by when, and what happens if we do nothing?”
2. **Estimate impact vs effort vs risk:**
   - Impact: revenue, activation, fraud loss, compliance exposure
   - Effort: analyst/engineer time, dependencies
   - Risk: user harm, regulatory exposure, opportunity cost
3. **Define a thin-slice MVP:** the smallest analysis that reduces uncertainty for the decision.
4. **Align stakeholders:** write a short prioritization note that states explicitly which lower-priority asks are deferred and why.
5. **Set SLAs and ownership (remote-friendly):** who approves definitions, where results live, how updates are communicated.
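One way to show the framework is repeatable is a simple scoring pass over the open requests. The sketch below is illustrative only: the 1-5 scales, the formula `(impact + risk) / effort`, and the example requests are assumptions, with "risk" read as the cost of not acting. The score should start the conversation with stakeholders, not replace it.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    impact: int  # 1-5: revenue, activation, fraud loss, compliance exposure
    effort: int  # 1-5: analyst/engineer time, dependencies
    risk: int    # 1-5: harm or exposure if the request is NOT addressed


def priority_score(request):
    """Higher is more urgent: value and risk of inaction per unit of effort."""
    return (request.impact + request.risk) / request.effort


# Hypothetical backlog of ambiguous asks.
backlog = [
    Request("PM: funnel deep-dive", impact=4, effort=4, risk=2),
    Request("Legal: audit data pull", impact=3, effort=2, risk=5),
    Request("Eng: logging validation", impact=2, effort=1, risk=3),
]

ranked = sorted(backlog, key=priority_score, reverse=True)
for request in ranked:
    print(f"{priority_score(request):.1f}  {request.name}")
```

Publishing the scored list in the prioritization note makes the tradeoffs visible and gives stakeholders something specific to challenge.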
Strong remote-culture signal:
- Proactively document assumptions, decisions, and next steps in an async doc; run a short sync only to unblock disagreements.