# Diagnose metric anomalies and evaluate a new algorithm
Company: Airtable
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: easy
Interview Round: Technical Screen
## Context
You work on a consumer product that includes an **AI calling** feature (users trigger calls; the system places AI-assisted calls). The team monitors operational and product metrics daily.
Assume you have access to:
- Event logs (requests, successes/failures, latency)
- User/device/app version, geo, acquisition channel
- Experiment assignments / feature flags
- Recent change log (deploys, config changes, marketing campaigns)
- Basic dashboards and the ability to query raw data
## Part A — Sudden spike
Today you notice **AI call count is much higher than normal** (e.g., +60% day-over-day).
1) What is your **step-by-step investigation plan** for identifying the cause?
2) How do you determine whether it’s a **real product change** vs a **data/measurement issue**? (An illustrative first-pass breakdown is sketched after this list.)
3) What **follow-up actions** would you recommend depending on the root cause (e.g., rollback, rate limits, alerting changes, comms)?
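To ground Part A, here is a minimal pandas sketch of the kind of first-pass breakdown a candidate might run; the file name, column names, and dates are all hypothetical. It compares today's volume to a trailing baseline per segment and checks for duplicate `call_id`s, which helps separate a real usage change from a double-fired logging event.

```python
import pandas as pd

# Hypothetical event-log extract: one row per AI call request.
# Assumed columns: event_date (datetime), call_id, user_id, app_version, geo, channel.
events = pd.read_parquet("ai_call_events.parquet")  # placeholder data source

TODAY = pd.Timestamp("2024-06-18")   # assumed "spike" day
BASELINE_DAYS = 14                   # trailing window treated as "normal"

window = events[events["event_date"].between(TODAY - pd.Timedelta(days=BASELINE_DAYS), TODAY)]

# Data-quality check first: a double-fired logging event inflates call counts
# while leaving distinct users roughly flat.
dup_share = window["call_id"].duplicated().mean()
print(f"duplicate call_id share: {dup_share:.2%}")

def spike_by_segment(df: pd.DataFrame, by: str) -> pd.DataFrame:
    """Compare today's call volume to the trailing-baseline mean, per segment value."""
    counts = (
        df.groupby([df["event_date"].dt.date.rename("day"), by])["call_id"]
        .nunique()
        .rename("calls")
        .reset_index()
    )
    today = counts[counts["day"] == TODAY.date()].set_index(by)["calls"]
    baseline = counts[counts["day"] < TODAY.date()].groupby(by)["calls"].mean()
    out = pd.DataFrame({"today": today, "baseline_mean": baseline}).fillna(0)
    out["pct_change"] = (out["today"] - out["baseline_mean"]) / out["baseline_mean"].clip(lower=1)
    return out.sort_values("pct_change", ascending=False)

# A spike concentrated in one app version, geo, or channel points at a specific
# deploy, campaign, or instrumentation change; a uniform lift suggests something global.
for dim in ["app_version", "geo", "channel"]:
    print(dim)
    print(spike_by_segment(window, dim).head(10))
```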
## Part B — Sudden drop
Another day you notice a key metric has **dropped sharply** (pick a concrete example such as conversion rate, call success rate, revenue per user, or retention).
1) How do you **triage** the issue (what do you check first and why)?
2) How do you localize the problem by **segment** (geo, app version, device, cohort, channel, experiment cell)?
3) What are common **confounders** and **false alarms** you’d guard against (seasonality, reporting lag, instrumentation changes, Simpson’s paradox)? (A toy mix-shift illustration follows this list.)
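One false alarm worth making concrete: a top-line rate can fall purely because the traffic mix shifted toward a weaker segment, even though every segment's own rate is flat (a Simpson's-paradox-style effect). A toy illustration with made-up numbers:

```python
def blended_rate(segments: dict[str, tuple[int, float]]) -> float:
    """Traffic-weighted success rate across segments, given (calls, rate) per segment."""
    total_calls = sum(n for n, _ in segments.values())
    total_successes = sum(n * rate for n, rate in segments.values())
    return total_successes / total_calls

# Made-up numbers: both platforms hold their success rate constant,
# but traffic shifts from iOS (stronger) toward Android (weaker).
yesterday = {"ios": (9000, 0.95), "android": (1000, 0.80)}
today = {"ios": (5000, 0.95), "android": (5000, 0.80)}

print(f"yesterday blended rate: {blended_rate(yesterday):.3f}")  # 0.935
print(f"today blended rate:     {blended_rate(today):.3f}")      # 0.875
# A 6-point top-line drop with every segment flat: the right follow-up is
# "why did the traffic mix shift?", not "why did call quality regress?"
```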
## Part C — Launching a new algorithm
You plan to launch a **new algorithm** (e.g., ranking, routing, spam detection, call-quality model).
1) How do you decide whether the new algorithm is “better”?
2) Propose **offline metrics** and **online metrics**, including a **primary metric**, **diagnostic metrics**, and **guardrail metrics**.
3) Describe an online evaluation plan (e.g., A/B test or phased rollout), including:
- Eligibility and randomization unit
- Success criteria and stopping rules
- Handling delayed outcomes and interference/network effects
- What you would do if metrics move in opposite directions (trade-offs)
Provide your reasoning, assumptions, and concrete checks you would run (one such check, a sample-size calculation, is sketched below).
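As one concrete pre-launch check (the baseline rate, minimum detectable effect, and test settings below are assumptions, not measured figures), a candidate could estimate how many randomization units per arm the test needs before committing to a duration, using statsmodels' two-proportion power utilities:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.90   # assumed current call success rate
mde = 0.005            # minimum detectable effect: +0.5 pp absolute
alpha = 0.05           # two-sided false-positive rate
power = 0.80           # probability of detecting the MDE if it is real

effect = proportion_effectsize(baseline_rate + mde, baseline_rate)  # Cohen's h
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect, alpha=alpha, power=power, ratio=1.0, alternative="two-sided"
)
print(f"~{n_per_arm:,.0f} randomization units per arm")

# If users (not calls) are the randomization unit and each user places several
# calls, this per-observation estimate understates the required number of users:
# within-user correlation should be handled with a clustered / delta-method
# variance, or by analyzing one aggregate value per user.
```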
Quick Answer: This question evaluates competency in analytics-driven incident investigation, experiment design, metric instrumentation, and product-metric interpretation for data science roles.