Calculate cross-channel login user proportions
Company: Amazon
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Onsite
Write SQL to compute, for 2025-08-29 through 2025-08-31, the proportion of users who logged in only via mobile, only via desktop, and via both, where the denominator is distinct users who used at least one channel in the window. Tables and sample data:
Schemas:
mobile_logins(user_id INT, login_dt DATE)
desktop_logins(user_id INT, login_dt DATE)
Sample rows:
mobile_logins
user_id | login_dt
1 | 2025-08-30
1 | 2025-08-31
2 | 2025-08-31
3 | 2025-08-30
desktop_logins
user_id | login_dt
1 | 2025-08-31
4 | 2025-08-31
3 | 2025-08-29
3 | 2025-08-31
Requirements: (a) deduplicate users within each channel; (b) avoid double-counting users in the denominator; (c) return counts and percentages with two decimals for each segment {only_mobile, only_desktop, both}; (d) handle very large tables efficiently (no full cross joins), and be portable to ANSI SQL. Explain corner cases you handled (users appearing in neither channel; users with multiple same-day logins; users whose activity spans outside the window).
Quick Answer: This question evaluates data manipulation (SQL/Python) skills, focusing on cross-channel user attribution, deduplication, distinct user counting, date-window filtering, join logic, efficient aggregation for large tables, and formatted output of counts and percentages in ANSI SQL-compatible queries.