iOS vs. Android Usage Gap: Modeling, Causality, Telemetry, Missing Data, and Segmented Actions
Context
You observe that Instagram usage is substantially higher among iOS users than among Android users. Assume "usage" refers to per-user weekly minutes spent in the app (or a similar engagement proxy). Your task is to model the drivers of usage, assess whether OS plays a causal role, design telemetry comparisons across OS versions, handle missing data appropriately, and propose actions for the discovered segments.
Tasks
Supervised learning framing
- Define a clear prediction target for user-level usage.
- List candidate features across: demographics, engagement, acquisition channel, and device/app telemetry.
- Choose and justify a modeling approach (decision tree vs. random forest) for identifying discriminant variables.
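As a rough illustration of how a tree finds discriminant variables, the sketch below scores each feature by the best single split it offers on the target (a depth-1 regression tree, or "stump"). It is a deliberately simplified stand-in for a full decision tree or random forest, not either of those models. The feature names (`is_ios`, `sessions_per_day`), the target `weekly_minutes`, and the toy rows are assumptions made up for the example.

```python
from statistics import fmean

def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, feature, target):
    """Return (sse_reduction, threshold) for the best binary split on one feature."""
    total = sse([r[target] for r in rows])
    best = (0.0, None)
    # Try every observed value except the largest as a "<= t" threshold.
    for t in sorted({r[feature] for r in rows})[:-1]:
        left = [r[target] for r in rows if r[feature] <= t]
        right = [r[target] for r in rows if r[feature] > t]
        gain = total - sse(left) - sse(right)
        if gain > best[0]:
            best = (gain, t)
    return best

def rank_features(rows, features, target):
    """Order features by how much their best split reduces target variance."""
    scored = [(best_split(rows, f, target)[0], f) for f in features]
    return [f for _, f in sorted(scored, reverse=True)]

# Hypothetical toy data: usage tracks session frequency, not OS.
rows = [
    {"is_ios": 1, "sessions_per_day": 8, "weekly_minutes": 400},
    {"is_ios": 1, "sessions_per_day": 2, "weekly_minutes": 100},
    {"is_ios": 0, "sessions_per_day": 7, "weekly_minutes": 380},
    {"is_ios": 0, "sessions_per_day": 1, "weekly_minutes": 80},
    {"is_ios": 1, "sessions_per_day": 9, "weekly_minutes": 420},
    {"is_ios": 0, "sessions_per_day": 2, "weekly_minutes": 90},
]
ranking = rank_features(rows, ["is_ios", "sessions_per_day"], "weekly_minutes")
print(ranking)
```

A single tree is easy to read but unstable; a random forest averages many such splits over bootstrap samples and random feature subsets, which tends to give steadier importance rankings at the cost of interpretability.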
Causality check: OS as cause vs. proxy
- Explain how you would test whether OS is causal vs. a proxy for other factors.
- Outline procedures for feature ablation, conditional permutation importance, and propensity stratification.
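One way to picture propensity stratification is the minimal sketch below. It assumes a single discrete confounder (a hypothetical `device_tier` field), estimates the propensity P(iOS | tier) by simple frequency, groups users by that score, and averages the within-stratum iOS minus Android gap. All field names and the toy data are invented for illustration; a real analysis would fit a propensity model on many covariates and bin the continuous scores (for example into quintiles).

```python
from collections import defaultdict
from statistics import fmean

def naive_gap(users):
    """Unadjusted difference in mean usage: iOS minus Android."""
    ios = [u["weekly_minutes"] for u in users if u["is_ios"] == 1]
    android = [u["weekly_minutes"] for u in users if u["is_ios"] == 0]
    return fmean(ios) - fmean(android)

def stratified_os_effect(users, covariate="device_tier"):
    """Size-weighted average of within-stratum (iOS - Android) usage gaps."""
    # Step 1: estimate propensity P(iOS | covariate) by observed frequency.
    by_cov = defaultdict(list)
    for u in users:
        by_cov[u[covariate]].append(u["is_ios"])
    propensity = {k: fmean(v) for k, v in by_cov.items()}

    # Step 2: group users into strata that share the same estimated propensity.
    strata = defaultdict(lambda: {1: [], 0: []})
    for u in users:
        score = round(propensity[u[covariate]], 6)
        strata[score][u["is_ios"]].append(u["weekly_minutes"])

    # Step 3: average the per-stratum gap, weighting by stratum size.
    total, weighted_gap = 0, 0.0
    for groups in strata.values():
        if not groups[1] or not groups[0]:
            continue  # no OS overlap here, so the stratum cannot inform the comparison
        n = len(groups[1]) + len(groups[0])
        weighted_gap += n * (fmean(groups[1]) - fmean(groups[0]))
        total += n
    return weighted_gap / total if total else float("nan")

# Hypothetical toy data: usage depends only on device tier, and iOS skews high-tier.
users = (
    [{"is_ios": 1, "device_tier": "high", "weekly_minutes": 300}] * 3
    + [{"is_ios": 0, "device_tier": "high", "weekly_minutes": 300}]
    + [{"is_ios": 1, "device_tier": "low", "weekly_minutes": 100}]
    + [{"is_ios": 0, "device_tier": "low", "weekly_minutes": 100}] * 3
)
print(naive_gap(users), stratified_os_effect(users))
```

In this constructed example the raw gap is large while the stratified gap vanishes, which is the pattern expected when OS is only a proxy. A gap that survives stratification is still only as credible as the set of confounders included.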
Telemetry collection and cross-OS comparison
- Specify additional telemetry to collect (e.g., cold-start latency, crash rate, time-to-first-paint, battery drain).
- Describe how to compare distributions across OS versions robustly.
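Latency-style telemetry is usually heavy-tailed, so one reasonably robust comparison is a gap in medians with a percentile bootstrap interval rather than a gap in means. The sketch below is one possible approach under that assumption; the sample values and the OS version labels in the variable names are hypothetical.

```python
import random
from statistics import median

def bootstrap_median_gap(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Return (median(a) - median(b), ci_low, ci_high) via a percentile bootstrap."""
    rng = random.Random(seed)  # seeded so the interval is reproducible
    gaps = []
    for _ in range(n_boot):
        resample_a = rng.choices(a, k=len(a))  # sample with replacement
        resample_b = rng.choices(b, k=len(b))
        gaps.append(median(resample_a) - median(resample_b))
    gaps.sort()
    low = gaps[int((alpha / 2) * n_boot)]
    high = gaps[int((1 - alpha / 2) * n_boot) - 1]
    return median(a) - median(b), low, high

# Hypothetical cold-start latencies in milliseconds; note the Android outlier.
android_14_ms = [910, 950, 1020, 1100, 1180, 1250, 4800]
ios_17_ms = [610, 640, 660, 700, 720, 760, 790]

point, low, high = bootstrap_median_gap(android_14_ms, ios_17_ms)
print(point, low, high)
```

The same function can be applied to each pair of OS versions, ideally within matched device tiers and app versions, with a multiple-comparison correction if many pairs are tested.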
Missing data handling
- Defend the "-1 sentinel" approach vs. model-based imputation.
- Explain when each is appropriate and any guardrails.
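To make the trade-off concrete, the sketch below encodes a missing value two ways: with a -1 sentinel plus an explicit missingness flag, and with a simple median fill as a stand-in for model-based imputation (a real imputer would condition on other features). The guardrail shown, that the sentinel must lie below every observed value, is one assumption about what a sensible check looks like; the feature values are hypothetical.

```python
from statistics import median

SENTINEL = -1.0

def missing_flags(values):
    """1 where the value is missing, 0 otherwise."""
    return [1 if v is None else 0 for v in values]

def encode_sentinel(values, sentinel=SENTINEL):
    """Replace None with the sentinel and return (filled_values, flags)."""
    observed = [v for v in values if v is not None]
    # Guardrail: the sentinel must sit outside the valid range of the feature.
    if observed and min(observed) <= sentinel:
        raise ValueError("sentinel collides with the observed value range")
    filled = [sentinel if v is None else v for v in values]
    return filled, missing_flags(values)

def impute_median(values):
    """Replace None with the median of observed values and return (filled_values, flags)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values], missing_flags(values)

# Hypothetical cold-start latency in seconds, with one missing reading.
cold_start_s = [1.0, None, 3.0]
filled, flags = encode_sentinel(cold_start_s)
imputed, _ = impute_median(cold_start_s)
print(filled, flags, imputed)
```

The sentinel suits tree-based models, which can isolate it with a single split and thereby learn from missingness itself. For linear or distance-based models, -1 would be read as a real magnitude, so imputation plus the flag is usually the safer choice.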
Segment actions and validation
- Given two segments, (a) Argentinian users who underperform and (b) Indian users under 30 who overperform, propose concrete product/marketing next steps.
- Explain how to validate impact causally.
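One common way to validate impact causally is a randomized holdout within the segment: assign users at random to receive the intervention or not, then test whether the observed lift could plausibly arise by chance. The sketch below uses a two-sided permutation test on the difference in means; the group values are hypothetical and the function name is illustrative.

```python
import random
from statistics import fmean

def permutation_test(treated, control, n_perm=5000, seed=0):
    """Return (observed_lift, two_sided_p_value) for a difference in means."""
    rng = random.Random(seed)  # seeded for reproducibility
    observed = fmean(treated) - fmean(control)
    pooled = list(treated) + list(control)
    k = len(treated)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # re-deal group labels at random
        diff = fmean(pooled[:k]) - fmean(pooled[k:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)

# Hypothetical weekly minutes for randomly assigned users in one segment.
treated_minutes = [130, 142, 150, 138, 160, 155, 147, 149]
control_minutes = [100, 110, 95, 105, 120, 98, 112, 108]

lift, p_value = permutation_test(treated_minutes, control_minutes)
print(lift, p_value)
```

Randomization is what licenses the causal reading here. If user-level assignment is not feasible (for example, a country-wide marketing campaign), a geo-level experiment or a difference-in-differences design against a comparable market is a weaker but workable alternative.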