PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/Data Manipulation (SQL/Python)/Wayfair

Solve SQL and Python OA tasks

Last updated: Mar 29, 2026

Quick Overview

This question evaluates SQL and pandas data-manipulation skills, including joined aggregations over date windows, handling of missing values with median imputation, and multi-criteria sorting and tie-breaking.

  • easy
  • Wayfair
  • Data Manipulation (SQL/Python)
  • Data Scientist

Solve SQL and Python OA tasks

Company: Wayfair

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: easy

Interview Round: Take-home Project

Complete the following two online-assessment tasks. ### Task A — SQL You are given two tables: `customer` - `customer_id` INT - `customer_name` VARCHAR - `purchase_id` VARCHAR `purchase` - `id` VARCHAR - `purchase_price` DECIMAL(10,2) - `purchase_date` DATE `customer.purchase_id` joins to `purchase.id`. Based on the sample in the original prompt, interpret **"the earliest 10 years"** as the historical 10-year window that starts on the minimum `purchase_date` in the `purchase` table and ends just before `MIN(purchase_date) + INTERVAL '10 years'`. Assume `purchase_date` is stored as a `DATE` in UTC. Write a SQL query to return the customer(s) whose `purchase_price` is the highest within that earliest 10-year window. If multiple customers tie for the highest qualifying price, return all of them. Required output columns: - `customer_name` - `highest_purchase_price` ### Task B — Python You are given a pandas DataFrame `students` with schema: - `student_id` INT - `math_score` FLOAT NULL - `english_score` FLOAT NULL - `physics_score` FLOAT NULL Write a Python function that returns the top students after the following processing steps: 1. Remove any student who is missing scores in at least two subjects. 2. For the remaining students, fill each missing subject score with the median of that subject, computed over the remaining non-null rows. 3. Sort the cleaned table by `math_score` descending, then `physics_score` descending. If additional tie-breaking is needed, sort by `student_id` ascending. 4. Return the top 5 rows. If fewer than 5 students remain, return all remaining rows. Return the full cleaned rows for the selected students.

Quick Answer: This question evaluates SQL and pandas data-manipulation skills, including joined aggregations over date windows, handling of missing values with median imputation, and multi-criteria sorting and tie-breaking.

Related Interview Questions

  • Find top buyer in earliest 10-year window - Wayfair (easy)
  • Clean scores and return top 5 students - Wayfair (easy)
Wayfair logo
Wayfair
Feb 16, 2026, 12:00 AM
Data Scientist
Take-home Project
Data Manipulation (SQL/Python)
1
0

Complete the following two online-assessment tasks.

Task A — SQL

You are given two tables:

customer

  • customer_id INT
  • customer_name VARCHAR
  • purchase_id VARCHAR

purchase

  • id VARCHAR
  • purchase_price DECIMAL(10,2)
  • purchase_date DATE

customer.purchase_id joins to purchase.id.

Based on the sample in the original prompt, interpret "the earliest 10 years" as the historical 10-year window that starts on the minimum purchase_date in the purchase table and ends just before MIN(purchase_date) + INTERVAL '10 years'. Assume purchase_date is stored as a DATE in UTC.

Write a SQL query to return the customer(s) whose purchase_price is the highest within that earliest 10-year window. If multiple customers tie for the highest qualifying price, return all of them.

Required output columns:

  • customer_name
  • highest_purchase_price

Task B — Python

You are given a pandas DataFrame students with schema:

  • student_id INT
  • math_score FLOAT NULL
  • english_score FLOAT NULL
  • physics_score FLOAT NULL

Write a Python function that returns the top students after the following processing steps:

  1. Remove any student who is missing scores in at least two subjects.
  2. For the remaining students, fill each missing subject score with the median of that subject, computed over the remaining non-null rows.
  3. Sort the cleaned table by math_score descending, then physics_score descending. If additional tie-breaking is needed, sort by student_id ascending.
  4. Return the top 5 rows. If fewer than 5 students remain, return all remaining rows.

Return the full cleaned rows for the selected students.

Submit Your Answer

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More Wayfair•More Data Scientist•Wayfair Data Scientist•Wayfair Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)
PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.