PracHub
Quick Overview

This question evaluates proficiency in data cleaning (type conversion), missing-data handling, group-period aggregation, and estimating treatment effects via difference-in-differences.

Transform DataFrame and compute diff-in-diff

Company: Uber

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: easy

Interview Round: Technical Screen

You are given a pandas DataFrame `df` with the following columns:

- `unit_id` (string): entity identifier (e.g., user, city, driver)
- `group` (string): either `'treatment'` or `'control'`
- `period` (string): either `'pre'` or `'post'`
- `y` (string): outcome stored as a string (should be numeric), with **exactly one missing value** (NaN)

Tasks:

1. Convert `y` from string to integer (assume all non-missing values are valid integer strings, e.g. `'12'`).
2. Impute the missing value in `y` using the **simple (unconditional) average** of the non-missing `y` values.
3. After steps (1)–(2), compute the **difference-in-differences (DiD)** estimate of the treatment effect on `y`:

\[ \text{DiD} = (\overline{y}_{\text{treat, post}} - \overline{y}_{\text{treat, pre}}) - (\overline{y}_{\text{ctrl, post}} - \overline{y}_{\text{ctrl, pre}}) \]

Return the scalar DiD estimate (and optionally the intermediate group-period means used).
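One possible way to approach the three tasks is sketched below. It is a minimal illustration, not a reference solution: the function name `did_estimate` and the toy DataFrame are hypothetical and not part of the question. One caveat worth stating in an interview: a plain integer column cannot hold NaN, and the imputed mean is generally not a whole number, so this sketch parses `y` to a numeric (float) column rather than forcing an integer dtype.

```python
import pandas as pd


def did_estimate(df: pd.DataFrame):
    """Return (DiD estimate, group-period means) for a frame shaped like the prompt's `df`."""
    out = df.copy()

    # Step 1: parse the string outcome to numbers. The missing entry stays NaN,
    # which is why the resulting dtype is float64 rather than int64.
    out["y"] = pd.to_numeric(out["y"], errors="coerce")

    # Step 2: impute with the unconditional mean of the non-missing values
    # (Series.mean() skips NaN by default).
    out["y"] = out["y"].fillna(out["y"].mean())

    # Step 3: mean outcome in each (group, period) cell, then the double difference.
    means = out.groupby(["group", "period"])["y"].mean()
    treat_change = means.loc[("treatment", "post")] - means.loc[("treatment", "pre")]
    ctrl_change = means.loc[("control", "post")] - means.loc[("control", "pre")]
    return float(treat_change - ctrl_change), means


# Hypothetical toy data, made up for illustration only.
toy = pd.DataFrame(
    {
        "unit_id": ["a", "a", "b", "b", "c", "c", "d", "d"],
        "group": ["treatment"] * 4 + ["control"] * 4,
        "period": ["pre", "post"] * 4,
        "y": ["10", "16", "12", None, "8", "10", "10", "12"],
    }
)

did, means = did_estimate(toy)
print(means)
print(did)  # 4/7, about 0.571, for this toy data
```

In the toy data the seven observed values average 78/7, which fills the single gap in the treatment/post cell. If an integer column is strictly required, the imputed value would have to be rounded first, which changes the estimate slightly; whether that is acceptable is a question to raise with the interviewer.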

Last updated: May 7, 2026

Related Coding Questions

  • Write SQL for active counts and YTD top driver - Uber (Medium)
  • Write SQL and Pandas for Uber Trips - Uber (Medium)
  • Compute ETA shift and conversion uplift - Uber (Medium)
  • Write SQL/Python for CTR analytics - Uber (Medium)
  • Clean, split, merge, and aggregate with pandas - Uber (Medium)