PracHub

Quick Overview

This question evaluates a candidate's proficiency with data manipulation and simulation in R using dplyr, covering randomized sampling, vectorized transformations, left joins, grouped aggregation, and considerations for memory-safe processing at scale.


Implement R dplyr simulation and left join

Company: Google

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Using R and dplyr, run a simulation and a join.

Data:

prices
item_id | price_usd
1 | 10.00
2 | 20.00
3 | 30.00
4 | 40.00

catalog
item_id | category
1 | A
2 | B
3 | A
4 | C

Tasks:

  • With set.seed(2025), perform 1,000 simulations. In each simulation: randomly select half of the rows in prices to keep the same price; increase the other half by 10%. Then left join to catalog and compute: (i) overall mean price, and (ii) mean price by category A/B/C.
  • Return a data frame with one row per simulation containing overall_mean and category means. Also return the empirical mean and SD across simulations for each statistic.
  • Constraints: use dplyr verbs (e.g., slice_sample, mutate, case_when, left_join, group_by, summarise). Avoid for-loops; use vectorized operations or map-style iteration while ensuring no accidental reuse of mutated state across iterations. Your solution must be memory-safe for 1e6 items (outline changes needed).
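One possible way to approach the tasks above is sketched below. It is not an official solution: the helper name simulate_once, the output column names (sim_id, mean_A, mean_B, mean_C), and the use of purrr and tidyr alongside dplyr are assumptions made for illustration. Each iteration reads from the untouched prices table and returns a fresh one-row result, so no mutated state carries over between simulations.

```r
library(dplyr)
library(purrr)  # assumption: map-style iteration via purrr
library(tidyr)  # assumption: pivot_wider/pivot_longer for reshaping

prices  <- tibble(item_id = 1:4, price_usd = c(10, 20, 30, 40))
catalog <- tibble(item_id = 1:4, category = c("A", "B", "A", "C"))

# Hypothetical helper: one simulation as a pure function of its inputs.
simulate_once <- function(sim_id, prices, catalog) {
  # Randomly pick the half of the rows whose price stays unchanged.
  keep_ids <- prices %>%
    slice_sample(n = floor(nrow(prices) / 2)) %>%
    pull(item_id)

  # Increase the other half by 10%, then attach categories.
  priced <- prices %>%
    mutate(price_usd = case_when(
      item_id %in% keep_ids ~ price_usd,
      TRUE                  ~ price_usd * 1.10
    )) %>%
    left_join(catalog, by = "item_id")

  # Mean price per category, spread into one column per category.
  by_cat <- priced %>%
    group_by(category) %>%
    summarise(mean_price = mean(price_usd), .groups = "drop") %>%
    pivot_wider(names_from = category, values_from = mean_price,
                names_prefix = "mean_")

  tibble(sim_id = sim_id, overall_mean = mean(priced$price_usd)) %>%
    bind_cols(by_cat)
}

set.seed(2025)
sim_results <- map(seq_len(1000), simulate_once,
                   prices = prices, catalog = catalog) %>%
  bind_rows()

# Empirical mean and SD of each statistic across simulations.
sim_summary <- sim_results %>%
  select(-sim_id) %>%
  pivot_longer(everything(), names_to = "statistic", values_to = "value") %>%
  group_by(statistic) %>%
  summarise(empirical_mean = mean(value),
            empirical_sd   = sd(value),
            .groups = "drop")
```

As a sanity check on any such implementation: exactly two of the four items are increased in every simulation, so overall_mean must lie between 25.75 (items 1 and 2 increased) and 26.75 (items 3 and 4 increased), and mean_B can only take the values 20 or 22.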

Quick Answer: This question evaluates a candidate's proficiency with data manipulation and simulation in R using dplyr, covering randomized sampling, vectorized transformations, left joins, grouped aggregation, and considerations for memory-safe processing at scale.
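For the memory-safety requirement at 1e6 items, one option (an assumption about what an interviewer might accept, not a prescribed answer) is to stop rebuilding the full priced table in every simulation. Categories never change, so the join can be done once. Because an increased price is the original plus 10% of itself, each mean equals (baseline total + 0.10 × total of the increased rows) / count, so a simulation only needs to sum the sampled half by category. The sketch below swaps the per-row mutate/case_when for this aggregate shortcut. The names big_prices, big_catalog, base_totals, and simulate_lean, the synthetic data, and the scaled-down sizes are all hypothetical.

```r
library(dplyr)
library(purrr)  # assumption: map-style iteration via purrr

# Hypothetical synthetic inputs; scaled down from 1e6 to keep the demo quick.
n_items <- 1e5
set.seed(2025)
big_prices  <- tibble(item_id = seq_len(n_items),
                      price_usd = runif(n_items, 1, 100))
big_catalog <- tibble(item_id = seq_len(n_items),
                      category = sample(c("A", "B", "C"), n_items, replace = TRUE))

# Join once and keep only the two columns the simulations need.
base <- big_prices %>%
  left_join(big_catalog, by = "item_id") %>%
  select(category, price_usd)

# Baseline totals and counts per category, computed once.
base_totals <- base %>%
  group_by(category) %>%
  summarise(total = sum(price_usd), n = n(), .groups = "drop")

# Hypothetical helper: returns only a four-row summary per simulation,
# so the sampled half can be garbage-collected after each call.
simulate_lean <- function(sim_id, base, base_totals, rate = 0.10) {
  n_rows   <- nrow(base)
  bump_idx <- sample.int(n_rows, size = n_rows - floor(n_rows / 2))

  # Sum of original prices over the rows that get the increase.
  bump_totals <- base %>%
    slice(bump_idx) %>%
    group_by(category) %>%
    summarise(bumped = sum(price_usd), .groups = "drop")

  per_cat <- base_totals %>%
    left_join(bump_totals, by = "category") %>%
    mutate(bumped = coalesce(bumped, 0))

  overall <- per_cat %>%
    summarise(category = "overall", total = sum(total),
              n = sum(n), bumped = sum(bumped))

  all_rows <- bind_rows(overall, per_cat)
  tibble(sim_id     = sim_id,
         statistic  = all_rows$category,
         mean_price = (all_rows$total + rate * all_rows$bumped) / all_rows$n)
}

lean_results <- map(seq_len(20), simulate_lean,
                    base = base, base_totals = base_totals) %>%
  bind_rows()
```

With this layout, peak memory stays roughly proportional to the number of items rather than items × simulations, since only the small summary rows accumulate across iterations.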

Last updated: Mar 29, 2026


Related Coding Questions

  • Generate binomial matrix and column-normalize - Google (Medium)
  • Analyze video flags and reviews with SQL - Google (Medium)
  • Write SQL/Python for messy event data - Google (Medium)
  • Add a conditional column in Python - Google (Medium)
  • Find most co‑purchased product pairs in SQL - Google (Medium)