PracHub

Analyze Temperatures and Update Regression

Last updated: May 11, 2026

Quick Overview

This question evaluates proficiency in time-series and statistical analysis (defining volatility and similarity metrics), supervised regression modeling evaluated with mean squared error, greedy feature selection, and no-intercept linear regression in both batch and streaming form.



Company: Two Sigma

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Take-home Project



Related Interview Questions

  • How would you forecast bike demand? - Two Sigma (hard)
  • Predict Bike Dock Demand - Two Sigma (hard)
  • Predict bike demand and avoid overfitting - Two Sigma (hard)
  • How detect duplicate card records? - Two Sigma (medium)
  • How to forecast bike dock demand - Two Sigma (easy)
Apr 21, 2026, 12:00 AM

You are given historical daily temperature data for New York City and several nearby towns. Each row contains a date, the NYC temperature, and the temperature for each town on that date.

Answer the following:

  1. Volatility analysis
    • Determine which town has the largest temperature fluctuation over time.
    • Clearly define the metric you use for fluctuation.
  2. Similarity analysis
    • Determine which town's temperature pattern is most similar to NYC's.
    • Clearly define the similarity metric you use.
  3. Prediction task
    • Use the towns' temperature data to predict NYC temperature.
    • Train a regression model and evaluate it using mean squared error, or MSE.
  4. Greedy feature selection
    • Given a target number k, choose k towns to use as features.
    • Start with no selected towns.
    • At each step, add the town that gives the largest reduction in validation MSE when combined with the already selected towns.
    • Return the selected towns and the final MSE.
  5. No-intercept linear regression
    • Implement simple linear regression without an intercept for a single predictor x and target y.
    • First solve the batch case, where all data is available at once.
    • Then solve the streaming case, where (x, y) pairs arrive one at a time and the current slope must be updated without recomputing from scratch over all past data.
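
For part 1, one possible fluctuation metric is the sample standard deviation of day-to-day temperature changes. This is an assumption rather than the required answer: the plain standard deviation of the raw series is also defensible, but it mostly reflects the seasonal swing, which nearby towns tend to share. The function names and the toy data below are hypothetical.

```python
from statistics import stdev


def daily_change_volatility(temps):
    """Sample standard deviation of consecutive-day differences.

    `temps` is assumed to be a list of daily temperatures in date order.
    """
    changes = [later - earlier for earlier, later in zip(temps, temps[1:])]
    return stdev(changes)


def most_volatile_town(town_temps):
    """Return (town, scores) where town has the highest volatility score.

    `town_temps` maps a town name to its list of daily temperatures.
    """
    scores = {town: daily_change_volatility(series)
              for town, series in town_temps.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy data, not taken from the actual dataset.
toy_towns = {
    "TownA": [50.0, 52.0, 51.0, 53.0, 52.0],
    "TownB": [50.0, 58.0, 47.0, 60.0, 49.0],
}
most_volatile, volatility_scores = most_volatile_town(toy_towns)
```

Whichever metric is chosen, stating it explicitly matters more than the choice itself, since the ranking of towns can change between raw-level and day-to-day definitions.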
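
For part 2, a common similarity metric is the Pearson correlation between each town's series and NYC's, which measures how closely the two move together regardless of a constant offset. If closeness in actual temperature levels matters more, root mean squared difference is an alternative. The sketch below assumes the series are already aligned by date; all names and data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series.

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rms_difference(xs, ys):
    """Root mean squared difference; smaller means closer in level."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def most_similar_town(nyc, town_temps):
    """Return (town, scores) where town has the highest correlation with NYC."""
    scores = {town: pearson(nyc, series) for town, series in town_temps.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: TownA tracks NYC two degrees cooler.
nyc_toy = [50.0, 52.0, 55.0, 53.0, 58.0]
towns_toy = {
    "TownA": [48.0, 50.0, 53.0, 51.0, 56.0],
    "TownB": [60.0, 50.0, 58.0, 49.0, 55.0],
}
most_similar, similarity_scores = most_similar_town(nyc_toy, towns_toy)
```

Correlation on raw temperatures will be high for every nearby town because of shared seasonality, so correlating day-to-day changes instead is a reasonable refinement.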
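
Parts 3 and 4 can be sketched together, since greedy selection repeatedly fits a regression and scores it by validation MSE. The version below assumes ordinary least squares with an intercept, a time-ordered train/validation split, and a small pure-Python solver so the example is self-contained; in practice a numerical library would normally do the fitting. All function names and the toy data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Coefficients [intercept, w1, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coef, rows):
    return [coef[0] + sum(w * v for w, v in zip(coef[1:], r)) for r in rows]


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def validation_mse(towns, data, nyc, train_idx, val_idx):
    """Fit on training rows using only `towns`; score on validation rows."""
    def rows(idx):
        return [[data[t][i] for t in towns] for i in idx]
    coef = fit_ols(rows(train_idx), [nyc[i] for i in train_idx])
    return mse([nyc[i] for i in val_idx], predict(coef, rows(val_idx)))


def greedy_select(data, nyc, k, train_idx, val_idx):
    """Forward selection: each step adds the town with the lowest resulting
    validation MSE, which is the town with the largest MSE reduction."""
    selected, best_mse = [], None
    remaining = sorted(data)
    for _ in range(min(k, len(remaining))):
        scored = [(validation_mse(selected + [t], data, nyc, train_idx, val_idx), t)
                  for t in remaining]
        best_mse, best_town = min(scored)
        selected.append(best_town)
        remaining.remove(best_town)
    return selected, best_mse


# Hypothetical toy data: NYC is an exact linear mix of TownA and TownB.
data = {
    "TownA": [30, 35, 33, 40, 45, 43, 50, 55, 52, 60, 58, 65],
    "TownB": [28, 30, 36, 38, 41, 47, 49, 50, 57, 58, 62, 61],
    "TownC": [50, 40, 55, 42, 53, 41, 52, 44, 51, 43, 54, 45],
}
nyc = [1.0 + 0.6 * a + 0.4 * b for a, b in zip(data["TownA"], data["TownB"])]
train_idx, val_idx = list(range(8)), list(range(8, 12))  # time-ordered split

all_towns_mse = validation_mse(sorted(data), data, nyc, train_idx, val_idx)
selected, final_mse = greedy_select(data, nyc, 2, train_idx, val_idx)
```

A time-ordered split is used here because shuffling daily temperatures would let the model train on days adjacent to the validation days. Greedy selection is not guaranteed to find the best subset of size k; it only makes the locally best choice at each step.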
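
For part 5, the closed-form slope follows from minimizing the squared error of the model y = βx. The derivation below is a standard result; the symbols β, S_xy, and S_xx are notation introduced here, not taken from the question.

```latex
\begin{aligned}
\hat{\beta} &= \arg\min_{\beta} \sum_{i=1}^{n} (y_i - \beta x_i)^2 \\
0 &= \frac{d}{d\beta} \sum_{i=1}^{n} (y_i - \beta x_i)^2
   = -2 \sum_{i=1}^{n} x_i (y_i - \beta x_i) \\
\hat{\beta} &= \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}
   = \frac{S_{xy}^{(n)}}{S_{xx}^{(n)}} \\[6pt]
S_{xy}^{(n)} &= S_{xy}^{(n-1)} + x_n y_n, \qquad
S_{xx}^{(n)} = S_{xx}^{(n-1)} + x_n^2 \\
\hat{\beta}_n &= \hat{\beta}_{n-1}
   + \frac{x_n \,(y_n - \hat{\beta}_{n-1} x_n)}{S_{xx}^{(n)}}
\end{aligned}
```

The last two lines give the streaming update: only the two running sums need to be stored, so each new pair costs constant time and memory. The slope is undefined while every observed x is zero.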
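
A minimal sketch of both cases for part 5, assuming the slope is the ratio of the sum of x·y to the sum of x². The class and function names are hypothetical, and the toy data is made up for illustration.

```python
def batch_slope(xs, ys):
    """No-intercept least-squares slope: sum(x*y) / sum(x*x)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when every x is zero")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


class StreamingSlope:
    """Keeps two running sums so each update is O(1) in time and memory."""

    def __init__(self):
        self.sxy = 0.0
        self.sxx = 0.0

    def update(self, x, y):
        """Fold in one (x, y) pair and return the current slope."""
        self.sxy += x * y
        self.sxx += x * x
        return self.slope

    @property
    def slope(self):
        # None until at least one nonzero x has been seen.
        return self.sxy / self.sxx if self.sxx > 0 else None


# Hypothetical toy data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

batch = batch_slope(xs, ys)

stream = StreamingSlope()
for x, y in zip(xs, ys):
    current = stream.update(x, y)
```

After the last pair arrives, the streaming slope matches the batch slope up to floating-point rounding, because both compute the same two sums.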

