PracHub

Quick Overview

This question evaluates proficiency in time-series feature engineering, leakage prevention, memory-efficient large-scale data processing, categorical target encoding, exponential weighting, and algorithmic complexity within the Data Manipulation (SQL/Python) domain, emphasizing practical application with pandas at scale.

Compute leakage-safe rolling features in pandas

Company: Freddie Mac

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Onsite

Using pandas on a 50M-row monthly panel with columns [loan_id, msa, month, property_type, delinquent_90dpd (0/1), upb], create the following features for each loan-month:

  • (a) a 12-month rolling delinquency rate per (msa, property_type) that excludes the current month (strict t-1 window);
  • (b) a target-encoded property_type delinquency rate per MSA using only data strictly before the current month (leave-one-time-step-out);
  • (c) an exponentially weighted default intensity per loan with half-life = 6 months.

Return a leakage-safe DataFrame with one row per loan-month containing these features. Explain how you would:

  • (1) ensure memory efficiency under 16 GB RAM (categorical dtypes, downcasting, chunked joins, parquet scans);
  • (2) guarantee time-order correctness after shuffles (sort indices, stable groupby; avoid groupby.apply pitfalls);
  • (3) unit test correctness with a small deterministic example covering edge cases (missing months, single-observation groups).

Provide the big-O time/memory tradeoffs of your approach.
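One possible way to approach features (a) to (c) is sketched below. It is an illustration under stated assumptions, not a reference answer. The function name `leakage_safe_features`, the output column names, the integer month helper `m`, and the `demo` frame are all hypothetical. The sketch assumes `month` is a datetime64 column, reads (b) as an unsmoothed mean over all prior months of the same (msa, property_type) cell, and reads (c) as a normalized exponentially weighted mean of the loan's prior delinquency flags.

```python
import numpy as np
import pandas as pd


def leakage_safe_features(df, window=12, halflife=6.0):
    """Return one row per loan-month with three strictly-prior features."""
    out = df.copy()
    # Integer month index, so a skipped calendar month shows up as a real gap.
    out["m"] = (out["month"].dt.year * 12 + out["month"].dt.month).astype("int64")
    # A stable sort restores time order regardless of how the input was shuffled.
    out = out.sort_values(["loan_id", "m"], kind="stable").reset_index(drop=True)

    # Monthly totals per (msa, property_type) cell, laid out on a dense month grid.
    keys = ["msa", "property_type"]
    cell = out.groupby(keys + ["m"], observed=True)["delinquent_90dpd"].agg(["sum", "count"])
    grid = pd.RangeIndex(int(out["m"].min()), int(out["m"].max()) + 1, name="m")
    dq = cell["sum"].unstack(keys).reindex(grid).fillna(0.0)   # delinquent loans
    n = cell["count"].unstack(keys).reindex(grid).fillna(0.0)  # observed loans

    # (a) months t-window .. t-1: rolling sum, then shift by one month.
    roll = (dq.rolling(window, min_periods=1).sum().shift(1)
            / n.rolling(window, min_periods=1).sum().shift(1))
    # (b) all months strictly before t (assumption: no smoothing toward a prior).
    te = dq.cumsum().shift(1) / n.cumsum().shift(1)

    # Back to long form, indexed by (msa, property_type, m), then a left join.
    feats = pd.concat(
        [roll.unstack().rename(f"dq_rate_{window}m_prior"),
         te.unstack().rename("te_ptype_msa_prior")],
        axis=1,
    )
    out = out.join(feats, on=keys + ["m"])

    # (c) weight 2 ** (-(t - s) / halflife) on each prior month s of the loan.
    # The common factor 2 ** (-t / halflife) cancels in the ratio, so two
    # grouped cumulative sums are enough and no groupby.apply is needed.
    x = out["delinquent_90dpd"].astype("float64")
    loan = out["loan_id"]
    first_m = out.groupby("loan_id")["m"].transform("min")
    g = np.exp2((out["m"] - first_m) / halflife)
    num = (x * g).groupby(loan).cumsum() - x * g  # prior months only
    den = g.groupby(loan).cumsum() - g
    out["ew_dq_intensity"] = num / den            # NaN on a loan's first month
    return out.drop(columns="m")


# Tiny deterministic panel: loan 1 skips 2020-03, loan 3 is a one-row group.
demo = pd.DataFrame({
    "loan_id": [1, 1, 1, 2, 2, 3],
    "msa": ["A", "A", "A", "A", "A", "B"],
    "month": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-04-01",
                             "2020-01-01", "2020-02-01", "2020-02-01"]),
    "property_type": ["SF", "SF", "SF", "SF", "SF", "CO"],
    "delinquent_90dpd": [0, 1, 1, 0, 0, 1],
    "upb": [100.0, 99.0, 98.0, 200.0, 199.0, 150.0],
})
# Shuffle the input on purpose; the function re-sorts by (loan_id, month).
features = leakage_safe_features(demo.sample(frac=1.0, random_state=0))
```

In this sketch the (msa, property_type) statistics are computed on a small month-by-cell grid rather than on the 50M-row frame, so the expensive steps on the full panel are one sort, a few grouped cumulative sums, and one left join. Roughly, that is O(N log N) time for the sort plus O(N) for the grouped passes, with O(C·T) extra memory for a grid of C cells and T months. The `df.copy()` is there for clarity only; at full scale one would more likely cast `msa` and `property_type` to categorical, downcast the numeric columns, and process the panel in partitions. Rows with no prior history (a loan's first month, or a cell with no earlier observations) come out as NaN rather than being imputed.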

Last updated: Mar 29, 2026

Related Coding Questions

  • Design SQL cleaning, mapping, dedupe, and keying - Freddie Mac (Medium)
  • Assess SQL cleaning, mapping, joins, keys, and DDL/DML - Freddie Mac (Medium)
