PracHub
Predict bike demand and avoid overfitting

Last updated: Mar 29, 2026

Quick Overview

This question tests time-series forecasting, feature engineering, awareness of data-leakage risks, choice of evaluation metrics, and overfitting prevention, all applied to demand prediction.

Company: Two Sigma

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen

Mar 13, 2026, 12:00 AM
You are given historical data for a city bike-sharing system. Available fields include `station_id`, hourly timestamp, number of bike pickups and returns, dock capacity, current bikes available at prediction time, weather, holidays, and nearby transit or event signals.

Design a model to predict the number of bike pickups from a specific dock during the next hour.
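
One way to make the target concrete is sketched below, as an illustration rather than a reference answer. It assumes each record covers one station-hour, with a hypothetical `ts` field marking the start of the hour and a hypothetical `pickups` count for that hour (`station_id` comes from the problem statement). A prediction made at time t targets pickups in [t, t + 1h) and uses only pickups from hours that ended at or before t, so nothing from the target hour leaks into the features.

```python
from datetime import datetime, timedelta


def build_examples(rows, lags=(1, 24)):
    """Build (features, target) pairs for next-hour pickup prediction.

    rows: list of dicts with 'station_id', 'ts' (datetime, start of the hour)
          and 'pickups' (count during that hour). 'ts' and 'pickups' are
          assumed field names, not taken from the problem statement.
    lags: how many hours back to look for lagged pickup features.
    """
    # Index counts by (station, hour) so lag lookups are exact timestamp
    # matches rather than positional shifts, which break on missing hours.
    by_key = {(r["station_id"], r["ts"]): r["pickups"] for r in rows}

    examples = []
    for (station, ts), pickups in sorted(by_key.items()):
        features = {}
        for lag in lags:
            # Hour [ts - lag, ts - lag + 1h) has fully elapsed by time ts.
            past = by_key.get((station, ts - timedelta(hours=lag)))
            if past is None:
                break  # incomplete history: skip this example
            features[f"pickups_lag_{lag}h"] = past
        else:
            examples.append(
                {
                    "station_id": station,
                    "target_hour": ts,
                    "features": features,
                    "target": pickups,  # pickups during [ts, ts + 1h)
                }
            )
    return examples
```

Fields observed at prediction time, such as current bikes available, could be added to `features` in the same way. Returns or pickups recorded during the target hour itself should not be.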

Discuss:

  • how you would define the target and avoid data leakage;
  • what features you would engineer from temporal patterns, station behavior, weather, and geography;
  • what train/validation/test strategy you would use for this time-dependent problem;
  • which evaluation metric(s) you would choose (for example, MAE, RMSE, Poisson deviance, or a downstream empty/full-dock metric) and the trade-offs;
  • how you would detect and prevent overfitting.
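
The split and metric questions above can be illustrated with a minimal sketch, under stated assumptions and not as the expected answer. It assumes each example carries a sortable time value under a hypothetical `target_hour` key. The split is strictly chronological, so validation and test data always come after the training data, and the three metrics named in the prompt (MAE, RMSE, mean Poisson deviance) are written out with the standard library only.

```python
import math


def chronological_split(examples, train_end, valid_end, time_key="target_hour"):
    """Split by time: train < train_end <= valid < valid_end <= test."""
    train = [e for e in examples if e[time_key] < train_end]
    valid = [e for e in examples if train_end <= e[time_key] < valid_end]
    test = [e for e in examples if e[time_key] >= valid_end]
    return train, valid, test


def mae(y_true, y_pred):
    """Mean absolute error: robust to occasional demand spikes."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large misses more heavily."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mean_poisson_deviance(y_true, y_pred):
    """Mean of 2 * (y * log(y / mu) - y + mu), with the log term 0 when y == 0.

    Suited to count targets; predictions must be strictly positive.
    """
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        if mu <= 0:
            raise ValueError("Poisson deviance needs strictly positive predictions")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - y + mu)
    return total / len(y_true)
```

Computing the same metric on the training period and on the later validation period gives a simple overfitting check: a validation error far above the training error suggests the model has memorized the training window.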
