Complete the following two online-assessment tasks.
Task A — SQL
You are given two tables:
customer
-
customer_id
INT
-
customer_name
VARCHAR
-
purchase_id
VARCHAR
purchase
-
id
VARCHAR
-
purchase_price
DECIMAL(10,2)
-
purchase_date
DATE
customer.purchase_id joins to purchase.id.
Based on the sample in the original prompt, interpret "the earliest 10 years" as the historical 10-year window that starts on the minimum purchase_date in the purchase table and ends just before MIN(purchase_date) + INTERVAL '10 years'. Assume purchase_date is stored as a DATE in UTC.
Write a SQL query to return the customer(s) whose purchase_price is the highest within that earliest 10-year window. If multiple customers tie for the highest qualifying price, return all of them.
Required output columns:
-
customer_name
-
highest_purchase_price
Task B — Python
You are given a pandas DataFrame students with schema:
-
student_id
INT
-
math_score
FLOAT NULL
-
english_score
FLOAT NULL
-
physics_score
FLOAT NULL
Write a Python function that returns the top students after the following processing steps:
-
Remove any student who is missing scores in at least two subjects.
-
For the remaining students, fill each missing subject score with the median of that subject, computed over the remaining non-null rows.
-
Sort the cleaned table by
math_score
descending, then
physics_score
descending. If additional tie-breaking is needed, sort by
student_id
ascending.
-
Return the top 5 rows. If fewer than 5 students remain, return all remaining rows.
Return the full cleaned rows for the selected students.