Aggregate exam scores with NULL handling
Company: Capital One
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Take-home Project
Write a single SQL query (or CTE pipeline) to satisfy all requirements using the schema and sample data below. Replace any vendor-specific function with ANSI SQL where possible, but you may use ISNULL or COALESCE explicitly when needed. Schema and sample rows: students(student_id, name, class_id) +------------+-------+----------+ | student_id | name | class_id | | 1 | Alice | 101 | | 2 | Bob | 101 | | 3 | Carol | 102 | | 4 | Dave | 102 | +------------+-------+----------+ classes(class_id, class_name) +----------+------------+ | class_id | class_name | | 101 | Math-A | | 102 | Math-B | +----------+------------+ exams(exam_id, exam_name, exam_date) +---------+-----------+------------+ | exam_id | exam_name | exam_date | | 11 | Midterm | 2024-03-10 | | 12 | Final | 2024-06-15 | +---------+-----------+------------+ scores(student_id, exam_id, score) +------------+---------+-------+ | student_id | exam_id | score | | 1 | 11 | 92 | | 1 | 12 | 88 | | 2 | 11 | 58 | | 2 | 12 | 61 | | 3 | 11 | NULL | | 3 | 12 | 73 | | 4 | 11 | 45 | | 4 | 12 | 52 | +------------+---------+-------+ Requirements: (1) Include only exams in calendar year 2024 (use WHERE). (2) Create adj_score = NULL when score < 60, otherwise score; then in the final select expose adj_score_filled = ISNULL(adj_score, 0). (3) Compute, per class_name, pass_count = count of student-exam rows with score >= 85, fail_count = count with score < 60, honors_students = count of distinct students in the class with any score >= 90 across included exams. (4) Also return avg_adj_score_zero_filled = AVG(ISNULL(adj_score, 0)) per class_name and exam_count = count of distinct exams per class present in the joined data. (5) Students with no score records for included exams must still appear with zero counts (use LEFT JOIN from students). (6) Order results by class_name ascending. Provide the final query only.
Quick Answer: This question evaluates proficiency in SQL data manipulation, specifically joins (including LEFT JOIN behavior), NULL handling with ISNULL/COALESCE, conditional aggregation, distinct counts, and date-based filtering within the Data Manipulation (SQL/Python) domain, focusing on practical application of query-writing skills.