This question evaluates data manipulation and aggregation skills, specifically computing distinct user counts and revenue aggregates and reshaping results by platform; it is in the Data Manipulation (SQL/Python) domain and assesses practical application rather than purely conceptual understanding.
You are given transaction-level data and need daily aggregates by platform.
df with columns:
file_date
(DATE or string parseable to date) — the activity date
id
(INT/STRING) — user id
sku
(STRING) — product identifier
price
(NUMERIC) — revenue for the row (assume already in a single currency)
channel
(STRING)
customer_segment
(STRING)
platform
(STRING) — e.g.,
'web'
,
'mobile'
Example rows:
01-01-2024, 123, A, 57, seo, alpha, web
01-07-2024, 943, C, 100, tv, alpha, mobile
Compute, for each file_date and platform:
total_users
= number of
distinct
id
total_revenue
= sum of
price
Then reshape the result into a pivoted/wide daily report with one row per day and separate columns per platform:
file_date
web_total_users
,
web_total_revenue
mobile_total_users
,
mobile_total_revenue
(If additional platforms exist, include them similarly.)