PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates proficiency in data cleaning, transformation, aggregation, and deterministic ranking across R (tidyverse) and SQL, focusing on handling missing values, date arithmetic, aggregation and rounding of metrics, and top-N selection with tie-breaking.

  • Medium
  • Flatiron Health
  • Data Manipulation (SQL/Python)
  • Data Scientist

Compute churn metrics and rank top students

Company: Flatiron Health

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

You have two tasks. Part A (R): You receive a Customers CSV with columns: id (INT), signup_date (YYYY-MM-DD), last_active_date (YYYY-MM-DD), churned (0/1/NULL), pay (DECIMAL/NULL). Clean and compute metrics under these rules: (1) Drop rows where id is NULL/blank; (2) If pay is NULL, set pay = 0; (3) If churned is NULL but pay > 0, set churned = 0 (treat as active); (4) Define active_days = DATEDIFF(last_active_date, signup_date); if last_active_date < signup_date, set active_days = 0; (5) Treat last_active_date as the churn date when churned = 1. Produce: (a) the average active_days across all remaining customers; and (b) the average pay among customers with churned = 1. Return both rounded to 2 decimals and explain briefly how each rule is enforced in your code. Use idiomatic tidyverse R. Sample data (for format clarity): Customers +-----+-------------+------------------+---------+------+ | id | signup_date | last_active_date | churned | pay | +-----+-------------+------------------+---------+------+ | 1 | 2025-01-10 | 2025-03-05 | 1 | 99 | | 2 | 2025-02-01 | 2025-02-20 | NULL | 15 | | 3 | 2025-01-15 | 2025-01-18 | 0 | NULL | | NULL| 2025-02-10 | 2025-02-12 | 1 | 50 | | 4 | 2025-02-11 | 2025-02-05 | 1 | 25 | +-----+-------------+------------------+---------+------+ Part B (SQL, MySQL 8.0): Using the schema below, write ONE query that: (i) computes each student’s avg_score across all grades; (ii) excludes students with zero completed assignments; (iii) keeps only the top CEIL(N/2) students by avg_score where N is the number of students with at least one grade; break ties using student_id ASC; (iv) outputs student_id, name, avg_score (rounded to 1 decimal), and a dense_rank over the selected students ordered by avg_score DESC, student_id ASC; and (v) still works correctly as more students/grades are added. Schema: Teachers(id PK, name, classroom) Students(id PK, name, primary_teacher_id FK -> Teachers.id) Assignments(id PK, teacher_id FK -> Teachers.id) Grades(student_id FK -> Students.id, assignment_id FK -> Assignments.id, grade DECIMAL(4,1), PK(student_id, assignment_id)) Sample tables: Teachers +----+------------+----------+ | id | name | classroom| +----+------------+----------+ | 4 | Mr. Feeny | 301 | | 8 | Mr. Cooper | 260 | +----+------------+----------+ Students +----+---------+--------------------+ | id | name | primary_teacher_id | +----+---------+--------------------+ | 1 | Bobby | 4 | | 2 | Susie | 8 | | 3 | Deborah | 8 | | 9 | Bruce | NULL | +----+---------+--------------------+ Assignments +----+------------+ | id | teacher_id | +----+------------+ | 1 | 4 | | 2 | 8 | +----+------------+ Grades +------------+---------------+-------+ | student_id | assignment_id | grade | +------------+---------------+-------+ | 1 | 1 | 100.0 | | 1 | 2 | 50.0 | | 2 | 1 | 100.0 | | 2 | 2 | 100.0 | | 3 | 1 | 40.0 | | 3 | 2 | 8.0 | | 9 | 1 | 65.0 | | 9 | 2 | 65.0 | +------------+---------------+-------+

Quick Answer: This question evaluates proficiency in data cleaning, transformation, aggregation, and deterministic ranking across R (tidyverse) and SQL, focusing on handling missing values, date arithmetic, aggregation and rounding of metrics, and top-N selection with tie-breaking.

Last updated: Mar 29, 2026

Related Coding Questions

  • Write SQL window functions for streaks - Flatiron Health (Medium)

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.