Design student–course data models and SQL

Q: How do I practice SQL interview questions?

PracHub provides an interactive SQL console where you can write and test queries against real database schemas. Get instant feedback and compare your solution with the expected output.

Q: What difficulty level is this coding question?

This is a Medium difficulty Data Manipulation (SQL/Python) question, commonly asked during Onsite rounds at Amazon.

Q: What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Amazon during technical interviews.

Question

Scenario: Model a university domain with Students, Courses, Departments, Instructors, and Enrollments.
Tasks:
1) OLTP ERD: Specify normalized tables with PRIMARY/FOREIGN keys and constraints (e.g., a student cannot enroll in two sections of the same course in the same term; grade in {A,B,C,D,F,Pass,Fail,Incomplete}). Include indexes you would create and why.
2) Warehouse: Design a star schema for analytics (FactEnrollment with measures like credits_attempted, credits_earned, points; DimStudent, DimCourse, DimDepartment, DimTerm, DimInstructor). Explain your grain and surrogate keys. Model SCD Type 2 on DimStudent for major changes; show columns (effective_start, effective_end, is_current) and how you would join for point-in-time correctness.
3) Partitioning: Propose partitioning/clustering for FactEnrollment at 100M+ rows. Justify by common query predicates and maintenance.
4) Write SQL:
 a) List students who took both 'CS101' and 'MATH201' in the same term.
 b) For the most recent completed term, return the top 3 departments by average GPA (weighted by credits). Break ties deterministically.
5) Briefly compare OLTP vs OLAP design trade-offs in this scenario and when you would prefer snowflake over star.

PracHub · Accepted Answer

This question evaluates data modeling, database design, and SQL proficiency—covering normalization, schema constraints, indexing, star-schema warehousing, slowly changing dimensions, partitioning strategies, and complex analytic queries—in the domain of relational databases and data warehousing (Data Manipulation/SQL).

Quick Overview

Quick Overview