This question evaluates troubleshooting and analytical skills in SQL and data engineering, testing competence in data manipulation, statistical comparison of proportions, and incident localization across client, storage, ETL, and dashboard layers within the Data Manipulation (SQL/Python) domain.
On 2025‑09‑01 ('today'), the dashboard shows image uploads for Internet Explorer (IE) dropped to zero. Write SQL to (a) determine if this is a technical issue vs. behavior change, and (b) localize the fault (client, storage, ETL, or dashboard query). Use the schema below and produce: 1) A 8‑day trend by browser with significance-ready confidence intervals (approximate via normal or Wilson on proportions). 2) A client→storage funnel by browser for 2025‑09‑01 and the 7 days prior. 3) Top IE error codes and their first‑seen timestamps on 2025‑09‑01 vs. baseline. 4) A validation that the dashboard aggregation matches a direct recompute query; if they differ, show a minimal reproducible case. Finally, explain how your findings distinguish user‑behavior changes from instrumentation or storage bugs. Schema and tiny samples:
Table: events_uploads Columns: event_time TIMESTAMP, user_id BIGINT, browser STRING, os STRING, app_version STRING, country STRING, status STRING, error_code STRING, ip STRING, session_id STRING, event_id STRING Sample: +---------------------+---------+---------+---------+------------+---------+---------+------------+-----------+-----------+----------+ | event_time | user_id | browser | os | app_version| country | status | error_code | ip | session_id| event_id | +---------------------+---------+---------+---------+------------+---------+---------+------------+-----------+-----------+----------+ | 2025-08-31 22:10:00 | 101 | IE | Windows | 11.2.0 | AR | success | NULL | 1.1.1.1 | s1 | e1 | | 2025-09-01 09:00:00 | 102 | IE | Windows | 11.2.1 | US | fail | UPLOAD_415 | 2.2.2.2 | s2 | e2 | | 2025-09-01 09:05:00 | 103 | IE | Windows | 11.2.1 | US | fail | UPLOAD_415 | 2.2.2.3 | s3 | e3 | | 2025-09-01 09:07:00 | 104 | Chrome | Windows | 120.0 | US | success | NULL | 3.3.3.3 | s4 | e4 | | 2025-09-01 10:20:00 | 105 | Chrome | Mac | 120.0 | US | success | NULL | 4.4.4.4 | s5 | e5 | | 2025-09-01 11:00:00 | 106 | IE | Windows | 11.2.1 | US | fail | UPLOAD_415 | 2.2.2.4 | s6 | e6 | +---------------------+---------+---------+---------+------------+---------+---------+------------+-----------+-----------+----------+
Table: storage_uploads Columns: upload_time TIMESTAMP, user_id BIGINT, object_id STRING, source_event_id STRING, browser STRING Sample: +---------------------+---------+-----------+----------------+---------+ | upload_time | user_id | object_id | source_event_id| browser | +---------------------+---------+-----------+----------------+---------+ | 2025-08-31 22:10:05 | 101 | objA | e1 | IE | | 2025-09-01 09:07:10 | 104 | objB | e4 | Chrome | | 2025-09-01 10:20:12 | 105 | objC | e5 | Chrome | +---------------------+---------+-----------+----------------+---------+
Table: dash_metrics Columns: metric_date DATE, browser STRING, uploads_count BIGINT Sample: +-------------+---------+---------------+ | metric_date | browser | uploads_count | +-------------+---------+---------------+ | 2025-09-01 | IE | 0 | | 2025-09-01 | Chrome | 2 | +-------------+---------+---------------+
Include at least: (i) a 7‑day trailing comparison query, (ii) a client‑to‑storage left join to quantify drop‑off by browser and error_code, and (iii) a dashboard-recompute query. Explain how each result points to either a client bug, storage failure, ETL/typing bug (e.g., IE mislabeled), or a dashboard filter error.