Facebook And Instagram Cross-App Analytics
Asked of: Data Scientist

What's being tested
Meta is testing whether you can reason about cross-app product analytics when two products serve similar user needs but have different audiences, social graphs, and surfaces. The interviewer is probing your ability to define the right metrics, separate correlation from causation, detect cannibalization between Instagram and Facebook, and make a launch or investment recommendation under ecosystem-level tradeoffs. Strong answers do not just say “compare engagement”; they specify whose engagement, over what time horizon, normalized by what opportunity, and with what causal design. Meta cares because product decisions in one app can shift creators, viewers, ad inventory, and time spent across the family of apps.
Core knowledge
- Unit of analysis is the first decision: compare at the user_id, session, story, creator, viewer, or account-family level. For cross-app questions, use a Meta account-center or identity-level user when possible, because app-level DAU double-counts people active on both Instagram and Facebook.
- Primary metrics should distinguish creator-side and viewer-side health. Creator metrics include story_creators, stories_created_per_creator, and creation retention. Viewer metrics include story_viewers, views_per_viewer, completion_rate, replies, reactions, hides, and exits. A single blended “usage” metric hides the two-sided marketplace dynamics.
- Opportunity-normalized metrics are essential because Instagram and Facebook have different surface placement and audience composition. Prefer rates such as story views per eligible DAU or story creations per eligible user over raw view counts.
- Cannibalization is measured at the ecosystem level, not just within one app. Track whether incremental IG Stories activity reduces FB Stories, News Feed, Reels, messaging, or total Meta time. A useful metric is net family-level incrementality (the treatment-minus-control change in total Meta activity per person) rather than app-local lift.
- Causal inference matters because users self-select into apps. If IG Stories usage exceeds FB Stories, that may reflect demographics, graph density, creator mix, UI prominence, or notification policy. Use randomized experiments where possible; otherwise consider difference-in-differences, propensity score matching, or cohort fixed effects, while clearly stating assumptions.
- Experiment randomization should usually happen at the person/account level for cross-app outcomes. Randomizing only within Instagram risks spillovers if treated users shift behavior on Facebook. If social interactions create interference, segment by ego-network exposure or use cluster-level designs, accepting lower power.
- Metric guardrails prevent optimizing short-term consumption at the cost of quality. Track hide_rate, report_rate, unfollow_rate, session_satisfaction, creator churn, notification opt-outs, and downstream retention. A launch that increases story views but also increases hides or reduces creator retention may be net negative.
- Cohort segmentation is not optional. Compare age bands, geography, app tenure, creator/viewer status, friend/follower graph size, and cross-app overlap groups: IG_only, FB_only, and IG_FB_overlap. Simpson’s paradox is common when Instagram skews younger and creator-heavy while Facebook has different social graph norms.
- Power and MDE should be discussed for launch decisions. For a binary metric, approximate required sample per arm as $n \approx 2\,(z_{1-\alpha/2} + z_{1-\beta})^2\, p(1-p)/\delta^2$, which is roughly $16\,p(1-p)/\delta^2$ at $\alpha = 0.05$ and 80% power, where $p$ is baseline rate and $\delta$ is absolute MDE. With very large DAU, tiny effects are detectable, so practical significance matters more than p-values.
- Variance reduction methods such as CUPED can improve sensitivity by adjusting for pre-period behavior: $Y_{\text{cuped}} = Y - \theta\,(X - \bar{X})$ with $\theta = \mathrm{Cov}(X, Y)/\mathrm{Var}(X)$, where $X$ is pre-experiment usage. This is especially useful for story viewing because engagement is highly skewed and persistent across users.
- Time horizon changes the answer. Short-term tests may capture novelty, promotion, or onboarding effects; longer windows reveal retention and creator habit formation. For Stories, report day-1 adoption, week-4 retention, and repeat creation/viewing rather than relying only on launch-week spikes.
- Ecosystem objective choice is the core tradeoff. If the business goal is maximizing total meaningful social interaction, then IG Stories winning over FB Stories is acceptable if family-level value rises. If the goal is maintaining Facebook creator liquidity, then cannibalization and supply-side concentration matter more.
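To make the CUPED idea concrete, here is a minimal sketch on synthetic data. The data-generating numbers, the variable names, and the helper `cuped_adjust` are illustrative assumptions, not anything taken from a real Meta pipeline. In a real experiment, theta would be estimated on pooled treatment and control data and the adjusted outcome compared across arms.

```python
import random
from statistics import mean, variance


def cuped_adjust(y, x):
    """Return CUPED-adjusted outcomes: y - theta * (x - mean(x)).

    y: in-experiment metric per user; x: the same user's pre-experiment metric.
    theta is the OLS slope Cov(x, y) / Var(x).
    """
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / variance(x)
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]


# Synthetic, skewed "story views" where in-experiment usage tracks pre-period usage.
rng = random.Random(0)
pre_views = [rng.expovariate(1 / 10) for _ in range(5000)]
exp_views = [0.8 * p + rng.gauss(0, 2) for p in pre_views]

adjusted = cuped_adjust(exp_views, pre_views)

# The adjustment leaves the mean unchanged and shrinks the variance
# by roughly (1 - correlation^2).
variance_ratio = variance(adjusted) / variance(exp_views)
print(f"variance after CUPED / variance before: {variance_ratio:.3f}")
```

The more persistent the behavior (the higher the correlation between pre-period and in-experiment usage), the larger the variance reduction, which is why the technique suits story viewing.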
Worked example
For “Evaluating and launching Instagram Stories”, I would start by clarifying whether we are evaluating a new launch, a major redesign, or expansion to a new market, and whether the decision criterion is Instagram growth alone or Meta-family value. I would state that my unit of randomization should be the person or account family, because the key risk is cross-app cannibalization rather than just local IG Stories lift. My answer would have four pillars: define creator and viewer metrics; design an experiment; measure cross-app substitution; and make a launch recommendation using primary, secondary, and guardrail metrics.
The primary metric might be IG_story_viewer_days or IG_story_creators among eligible users, but I would pair it with Family_time_spent, FB_stories_views, News_Feed_time, and quality guardrails such as hide_rate and report_rate. I would explicitly segment results by existing Instagram usage, Facebook overlap, age, region, and creator status, because launch value may come from either new supply or migration from another Meta surface. The key tradeoff I would flag is whether to optimize for app-local growth or ecosystem-level incrementality: a large Instagram lift is not enough if it mostly pulls time and creators from Facebook with worse quality outcomes. I would also discuss novelty effects by looking at repeated creation and viewing over multiple weeks, not just initial adoption. If I had more time, I would add creator-network effects, such as whether more friends posting stories increases viewer retention and reply behavior.
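As a rough illustration of the app-local versus ecosystem-level distinction, the sketch below computes per-surface deltas and a family-level net delta from per-person minutes. The rows, field names (`ig_stories`, `fb_stories`, `other_meta`), and numbers are all hypothetical; a real readout would also need confidence intervals.

```python
from statistics import mean

# Hypothetical per-person daily minutes by surface, from a person-level experiment.
SURFACES = ["ig_stories", "fb_stories", "other_meta"]

rows = [
    {"arm": "control", "ig_stories": 4.0, "fb_stories": 3.0, "other_meta": 30.0},
    {"arm": "control", "ig_stories": 6.0, "fb_stories": 5.0, "other_meta": 32.0},
    {"arm": "treatment", "ig_stories": 8.0, "fb_stories": 2.0, "other_meta": 29.0},
    {"arm": "treatment", "ig_stories": 10.0, "fb_stories": 4.0, "other_meta": 31.0},
]


def surface_deltas(rows):
    """Treatment-minus-control mean per surface, plus the family-level total."""
    deltas = {}
    for surface in SURFACES:
        treat = mean(r[surface] for r in rows if r["arm"] == "treatment")
        ctrl = mean(r[surface] for r in rows if r["arm"] == "control")
        deltas[surface] = treat - ctrl
    deltas["family_total"] = sum(deltas[s] for s in SURFACES)
    return deltas


deltas = surface_deltas(rows)

# Share of the IG Stories lift that is net new at the family level;
# the remainder was substituted away from other Meta surfaces.
incremental_share = deltas["family_total"] / deltas["ig_stories"]
print(deltas, incremental_share)
```

In this toy data, half of the IG Stories lift is offset by declines elsewhere, which is the kind of result that changes a launch recommendation.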
A second angle
For “Explain why IG Story usage exceeds Facebook”, the framing shifts from launch evaluation to diagnosis. I would avoid saying “younger users like Instagram more” as a final answer and instead break the hypothesis space into audience composition, product surface prominence, graph structure, creator incentives, media norms, and notification/distribution differences. The analysis would compare normalized rates among matched cohorts, such as users active on both apps, controlling for age, geography, tenure, and friend/follower count. Because this is explanatory, I might combine descriptive decomposition with causal tests: for example, test whether increasing FB Stories entry-point prominence raises creation or only shifts clicks. The same cross-app logic applies, but the output is a ranked set of validated drivers rather than a launch/no-launch decision.
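One way to sketch the descriptive decomposition step is direct standardization: reweight Facebook's cohort-level rates to Instagram's cohort mix, then split the overall gap into a within-cohort rate effect and an audience-mix effect. The cohorts, shares, and rates below are made-up numbers for illustration, and the split is descriptive, not causal.

```python
# Hypothetical cohort shares and story-creation rates for each app.
cohorts = {
    "18-24": {"ig_share": 0.5, "ig_rate": 0.40, "fb_share": 0.2, "fb_rate": 0.35},
    "25-34": {"ig_share": 0.3, "ig_rate": 0.30, "fb_share": 0.3, "fb_rate": 0.28},
    "35+": {"ig_share": 0.2, "ig_rate": 0.20, "fb_share": 0.5, "fb_rate": 0.20},
}

# Overall rates are share-weighted averages of the cohort rates.
ig_rate = sum(c["ig_share"] * c["ig_rate"] for c in cohorts.values())
fb_rate = sum(c["fb_share"] * c["fb_rate"] for c in cohorts.values())

# Counterfactual: Facebook's cohort rates applied to Instagram's audience mix.
fb_rate_at_ig_mix = sum(c["ig_share"] * c["fb_rate"] for c in cohorts.values())

rate_effect = ig_rate - fb_rate_at_ig_mix  # gap that remains within matched cohorts
mix_effect = fb_rate_at_ig_mix - fb_rate   # gap explained by audience composition

print(f"total gap={ig_rate - fb_rate:.3f} rate={rate_effect:.3f} mix={mix_effect:.3f}")
```

If most of the gap lands in the mix effect, audience composition is the leading hypothesis; if it lands in the rate effect, product surface, graph, or distribution differences deserve the causal tests.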
Common pitfalls
Pitfall: Treating raw views or time_spent as the answer.
Raw totals confound audience size, app activity, placement, and demographics. A stronger answer normalizes by eligible DAU, separates creators from viewers, and reports ecosystem-level deltas so the interviewer can see whether the difference is true product strength or just distribution.
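A small sketch of why normalization matters, using a hypothetical daily rollup (the structure, field names, and numbers are assumptions for illustration). It shows raw totals and opportunity-normalized rates pointing in opposite directions, and summed app-level DAU overstating the de-duplicated family count.

```python
# Hypothetical daily rollup; all numbers are illustrative.
daily = {
    "IG": {"eligible_users": 2, "story_views": 30, "active_ids": {"p1", "p2"}},
    "FB": {"eligible_users": 5, "story_views": 40, "active_ids": {"p1", "p3", "p4"}},
}

# Raw totals favor FB (40 vs 30), but the per-eligible-user rate favors IG.
views_per_eligible = {
    app: d["story_views"] / d["eligible_users"] for app, d in daily.items()
}

# Summing app-level DAU counts p1 twice; the set union counts each person once.
summed_app_dau = sum(len(d["active_ids"]) for d in daily.values())
family_dau = len(set().union(*(d["active_ids"] for d in daily.values())))

print(views_per_eligible, summed_app_dau, family_dau)
```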
Pitfall: Ignoring interference and cannibalization.
A tempting answer is “run an A/B test on Instagram users and measure IG Stories lift.” That misses the main Meta-specific issue: treated users may reduce FB Stories, News Feed, or messaging. Better is to randomize at the person/account level and include family-level metrics and cross-app guardrails.
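Person-level randomization can be sketched as a deterministic hash of a shared identity key, so the same person lands in the same arm on both apps. The identity mapping, experiment name, and `assign_arm` helper are hypothetical; production systems would rely on an existing experimentation platform rather than this hand-rolled version.

```python
import hashlib


def assign_arm(person_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a person-level id to an arm via a salted hash."""
    digest = hashlib.sha256(f"{experiment}:{person_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform in [0, 1)
    return "treatment" if bucket < treatment_share else "control"


# Hypothetical mapping from app-level accounts to one person-level identity.
account_to_person = {
    ("ig", "ig_123"): "person_1",
    ("fb", "fb_987"): "person_1",
    ("ig", "ig_456"): "person_2",
}

ig_arm = assign_arm(account_to_person[("ig", "ig_123")], "stories_launch")
fb_arm = assign_arm(account_to_person[("fb", "fb_987")], "stories_launch")

# Both app accounts resolve to the same person, so they share an arm.
print(ig_arm, fb_arm)
```

Salting the hash with the experiment name keeps assignments independent across experiments while staying stable within one.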
Pitfall: Over-explaining product intuition without statistical discipline.
Saying “Instagram is more visual, so Stories should perform better” may be directionally reasonable but is not enough for a DS interview. Land better by translating intuition into testable hypotheses, measurable metrics, segmentation plans, and causal designs with stated assumptions.
Connections
Expect pivots into experimentation under network effects, causal inference with observational data, metric design for two-sided ecosystems, and ranking/recommender evaluation for story trays. If the interviewer pushes on causal validity, be ready to discuss difference-in-differences assumptions, spillover bias, CUPED, multiple testing, and practical versus statistical significance.
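For the practical-versus-statistical-significance discussion, a quick power calculation helps. The sketch below uses the standard normal-approximation formula for a two-sided test on a binary metric with equal-sized arms, with the variance evaluated at the baseline rate. The baseline rate and MDE values are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p: float, delta: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users per arm to detect an absolute change delta from baseline rate p."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * p * (1 - p) / delta ** 2)


# Illustrative inputs: 20% baseline story-viewer rate, 1 and 0.5 point absolute MDE.
n_one_point = sample_size_per_arm(p=0.20, delta=0.01)
n_half_point = sample_size_per_arm(p=0.20, delta=0.005)

# Halving the MDE roughly quadruples the required sample.
print(n_one_point, n_half_point, n_half_point / n_one_point)
```

Because required sample grows only with the inverse square of the MDE, an experiment over many millions of users can detect effects far smaller than anything that would change a launch decision, which is the reason to state a practical-significance threshold up front.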
Further reading
- Trustworthy Online Controlled Experiments: Kohavi, Tang, and Xu’s practical guide to experiment design, guardrails, power, and launch decisions.
- CUPED: Controlled-experiment Using Pre-Experiment Data: Seminal paper on variance reduction for online experiments.
- Causal Inference for Statistics, Social, and Biomedical Sciences: Imbens and Rubin’s foundation for potential outcomes, selection bias, and causal estimands.
Practice questions
- Compare Instagram vs. Facebook using causal experiments (Meta · Data Scientist · Onsite · Medium)
- Evaluate Facebook Dating launch and validate success (Meta · Data Scientist · Technical Screen · Hard)
- Explain why IG Story usage exceeds Facebook (Meta · Data Scientist · Onsite · Easy)
- Design Machine Learning Model for Facebook Groups Post Ranking (Meta · Data Scientist · Onsite · Hard)
- Investigate Causes of Decline in Facebook Group Comments (Meta · Data Scientist · Onsite · Medium)
- Estimate Instagram Shopping Feature's Revenue and Test Impact (Meta · Data Scientist · Onsite · Hard)
- Evaluate Instagram's Short-Video Recommender System Success (Meta · Data Scientist · Onsite · Medium)
- Evaluating and launching Instagram Stories (Meta · Data Scientist · Onsite · Medium)
Related concepts
- Facebook Product Analytics (Analytics & Experimentation)
- Instagram Product Analytics (Analytics & Experimentation)
- Shop Ads And Social Commerce Analytics
- Feed And News Feed Analytics (Analytics & Experimentation)
- Cohort, Retention, Funnel And Product Metrics (Analytics & Experimentation)
- Group Calls And Messaging Analytics