PracHub

Analyze Temperatures and Update Regression

Last updated: May 11, 2026

Quick Overview

This question tests time-series and statistical analysis (defining volatility and similarity metrics), supervised regression evaluated by mean squared error, greedy forward feature selection, and no-intercept linear regression in both batch and streaming form.

Company: Two Sigma

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Take-home Project

Apr 21, 2026, 12:00 AM

You are given historical daily temperature data for New York City and several nearby towns. Each row contains a date, the NYC temperature, and the temperature for each town on that date.

Answer the following:

  1. Volatility analysis
    • Determine which town has the largest temperature fluctuation over time.
    • Clearly define the metric you use for fluctuation.
  2. Similarity analysis
    • Determine which town's temperature pattern is most similar to NYC's.
    • Clearly define the similarity metric you use.
  3. Prediction task
    • Use the towns' temperature data to predict NYC temperature.
    • Train a regression model and evaluate it using mean squared error, or MSE.
  4. Greedy feature selection
    • Given a target number k, choose k towns to use as features.
    • Start with no selected towns.
    • At each step, add the town that gives the largest reduction in validation MSE when combined with the already selected towns.
    • Return the selected towns and the final MSE.
  5. No-intercept linear regression
    • Implement simple linear regression without an intercept for a single predictor x and target y.
    • First solve the batch case, where all data is available at once.
    • Then solve the streaming case, where (x, y) pairs arrive one at a time and the current slope must be updated without recomputing from scratch over all past data.
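
For parts 1 and 2, one reasonable choice (an assumption, not the only valid answer) is to define fluctuation as the sample standard deviation of day-over-day temperature changes, and similarity as the Pearson correlation with NYC. The sketch below uses only the Python standard library. The town names and temperatures are made-up toy values, not data from the question.

```python
import math
from statistics import stdev


def daily_change_volatility(temps):
    """Sample standard deviation of day-over-day temperature changes."""
    changes = [later - earlier for earlier, later in zip(temps, temps[1:])]
    return stdev(changes)


def pearson(a, b):
    """Pearson correlation between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy data: one value per day, same dates for every series.
nyc = [30.0, 32.0, 35.0, 33.0, 40.0, 42.0]
towns = {
    "town_a": [29.0, 31.0, 34.0, 33.0, 39.0, 41.0],
    "town_b": [20.0, 35.0, 25.0, 45.0, 30.0, 50.0],
}

# Part 1: the town whose daily changes vary the most.
most_volatile = max(towns, key=lambda t: daily_change_volatility(towns[t]))

# Part 2: the town whose series is most correlated with NYC.
most_similar = max(towns, key=lambda t: pearson(towns[t], nyc))

print(most_volatile, most_similar)
```

Nearby towns share the same seasonal cycle, so correlation on raw temperature levels tends to be high for every town. Correlating daily changes instead, or comparing the root mean squared difference from NYC, are alternative metrics worth naming in an answer.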
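
For parts 3 and 4, a minimal sketch of greedy forward selection follows. It assumes ordinary least squares with an intercept, a chronological train/validation split, and a small pure-Python normal-equations solver standing in for a library routine such as `numpy.linalg.lstsq`. All function names and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def design(towns, names, rows):
    """Rows of [1, town_1, ..., town_k]; the leading 1 is the intercept."""
    return [[1.0] + [towns[name][i] for name in names] for i in rows]


def fit_ols(x, y):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(p)]
    return solve(xtx, xty)


def mse(x, y, w):
    """Mean squared error of the linear predictions x @ w against y."""
    preds = [sum(wi * xi for wi, xi in zip(w, row)) for row in x]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def validation_mse(towns, nyc, names, train, valid):
    """Fit on the training rows, score on the validation rows."""
    w = fit_ols(design(towns, names, train), [nyc[i] for i in train])
    return mse(design(towns, names, valid), [nyc[i] for i in valid], w)


def greedy_select(towns, nyc, k, train, valid):
    """Add, k times, the town that yields the lowest validation MSE."""
    selected, final = [], None
    for _ in range(min(k, len(towns))):
        scores = {
            name: validation_mse(towns, nyc, selected + [name], train, valid)
            for name in towns if name not in selected
        }
        best = min(scores, key=scores.get)
        selected.append(best)
        final = scores[best]
    return selected, final


# Hypothetical toy data: NYC is an exact linear mix of town_a and town_b.
towns = {
    "town_a": [30.0, 34.0, 31.0, 38.0, 36.0, 41.0, 39.0, 45.0],
    "town_b": [28.0, 29.0, 33.0, 32.0, 37.0, 36.0, 42.0, 40.0],
    "town_c": [50.0, 20.0, 45.0, 25.0, 48.0, 22.0, 47.0, 21.0],
}
nyc = [2.0 + 0.6 * a + 0.4 * b for a, b in zip(towns["town_a"], towns["town_b"])]

# Chronological split: earlier rows train, later rows validate.
train, valid = range(0, 6), range(6, 8)
selected, final_mse = greedy_select(towns, nyc, 2, train, valid)
print(selected, final_mse)
```

Choosing the town with the lowest resulting validation MSE is the same as choosing the largest reduction, since every candidate is compared against the same current model. Because the validation set drives the selection, the reported final MSE is optimistic; scoring the chosen towns on a separate held-out period gives a fairer estimate.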
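
For part 5, minimizing the sum of squared errors of y = βx with no intercept gives β = Σxy / Σx². The streaming case then only needs two running sums, so each update costs constant time and memory. The sketch below illustrates both cases; the function and class names are hypothetical.

```python
def batch_slope(xs, ys):
    """No-intercept least-squares slope: sum(x*y) / sum(x*x)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when every x is zero")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


class StreamingSlope:
    """Keeps running sums so the slope updates in O(1) per observation."""

    def __init__(self):
        self.sxy = 0.0  # running sum of x * y
        self.sxx = 0.0  # running sum of x * x

    def update(self, x, y):
        """Fold in one (x, y) pair and return the current slope."""
        self.sxy += x * y
        self.sxx += x * x
        return self.slope

    @property
    def slope(self):
        """Current slope, or None until a nonzero x has been seen."""
        return self.sxy / self.sxx if self.sxx else None


# Toy data for illustration only.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.5, 5.5]

model = StreamingSlope()
for x, y in zip(xs, ys):
    model.update(x, y)

# After seeing every pair, the streaming slope matches the batch slope.
print(model.slope, batch_slope(xs, ys))
```

The two running sums are sufficient statistics for this model, so no past observations need to be stored. For very long streams with large values, the accumulated sums can lose floating-point precision, which is a trade-off worth mentioning.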