PracHub

Build a model to infer home vs office vs public

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's applied machine learning skills: privacy-preserving feature engineering, weak-supervision labeling, model selection and calibration, uncertainty quantification, and operational deployment, all applied to inferring a session's network context (home vs office vs public) from telemetry.

Company: Meta

Role: Data Scientist

Category: Machine Learning

Difficulty: Medium

Interview Round: Technical Screen



Oct 13, 2025, 9:49 PM
You must infer whether a Facebook session’s network context is home, office, or public venue to inform Portal targeting. Constraints: IPs may be shared (NAT), dynamic, or CGNAT; households have multiple users; only privacy‑preserving telemetry is allowed (timestamps, coarse geolocation, ASN/ISP, device/app vs web, session lengths, concurrent sessions, contact‑graph features). Today is 2025-09-01. Build an ML approach:

  1. Features: propose robust, leak‑free features capturing diurnal/weekly patterns, ISP/ASN type (residential vs enterprise vs mobile), IP stability, geolocation drift, concurrent user counts on the same IP, session inter‑arrival, device/browser/OS mix, reverse DNS hints, and calling‑graph closeness (e.g., kin vs coworker patterns). Explain how to handle apartments sharing a router and coffee‑shop Wi‑Fi.
  2. Labels: design weak‑supervision strategies to obtain labels at scale (e.g., overnight dwell heuristics, business‑hours rules, known corporate ASNs, opted‑in seed users, store‑IP blacklists). Describe how you will de‑bias noisy labels.
  3. Modeling: compare baseline rule lists vs gradient‑boosted trees vs sequence models (e.g., per‑IP HMM or transformer over events). Consider multi‑instance learning to aggregate session‑level predictions to user/household. Explain calibration and thresholding for asymmetric costs (misclassifying office as home).
  4. Evaluation: define metrics (macro F1, expected cost), cross‑geo temporal CV, and backtests across holidays. Prevent leakage from future behavior and from using Portal adoption as a proxy. Quantify uncertainty.
  5. Privacy/compliance: specify minimization, aggregation, retention, on‑device inference options, and red‑teaming for re‑identification risks.
  6. Deployment: outline real‑time vs batch inference, drift monitoring, and a holdout plan to measure whether location‑type targeting improves conversion.
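As an illustration of part 1, here is a minimal sketch of per-IP aggregate features computed over a trailing window. The function name `ip_features`, the input shape (a list of `(user_id, start_time)` pairs), and the hour boundaries for "night" and "business hours" are all assumptions made for this example, not part of the question. To stay leak-free, the window passed in should end strictly before the prediction time.

```python
import math
from collections import Counter
from datetime import datetime


def ip_features(sessions):
    """Aggregate diurnal and sharing features for one IP bucket.

    sessions: list of (user_id, start: datetime) pairs from a trailing
    window that ends before prediction time (hypothetical input shape).
    """
    n = len(sessions)
    if n == 0:
        raise ValueError("need at least one session")

    # Hour-of-day histogram and its entropy, normalized to [0, 1].
    # Low entropy means activity is concentrated in a few hours.
    hours = Counter(ts.hour for _, ts in sessions)
    hour_entropy = -sum(
        (c / n) * math.log(c / n) for c in hours.values()
    ) / math.log(24)

    # Assumed cutoffs: night is 22:00-06:00, business is Mon-Fri 09:00-18:00.
    night = sum(1 for _, ts in sessions if ts.hour >= 22 or ts.hour < 6)
    business = sum(
        1 for _, ts in sessions if ts.weekday() < 5 and 9 <= ts.hour < 18
    )

    return {
        "distinct_users": len({user for user, _ in sessions}),
        "night_share": night / n,
        "business_share": business / n,
        "hour_entropy": hour_entropy,
    }
```

A high `night_share` with few distinct users is suggestive of a home; many distinct users with high `hour_entropy` leans toward a public venue or CGNAT. An apartment building behind one router would look like the latter on user count alone, which is why the diurnal shares are kept as separate features rather than folded into a single score.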
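For part 2, one simple way to combine noisy heuristics is a weighted vote over labeling functions that may abstain. The sketch below is only that: the three labeling functions, their thresholds, the field names on the `ip` dict, and the weights are invented for illustration. In practice the weights would be estimated, for example from agreement with opted-in seed users.

```python
from collections import defaultdict

ABSTAIN = None


# Hypothetical labeling functions; thresholds are illustrative only.
def lf_overnight(ip):
    return "home" if ip["night_share"] >= 0.5 else ABSTAIN


def lf_corporate_asn(ip):
    return "office" if ip["asn_type"] == "enterprise" else ABSTAIN


def lf_many_users(ip):
    return "public" if ip["distinct_users"] >= 50 else ABSTAIN


# (function, weight) pairs; weights are assumed, not learned here.
LFS = [(lf_overnight, 1.0), (lf_corporate_asn, 2.0), (lf_many_users, 1.5)]


def weak_label(ip):
    """Return (label, confidence) from a weighted vote, or (None, 0.0)."""
    scores = defaultdict(float)
    for lf, weight in LFS:
        vote = lf(ip)
        if vote is not ABSTAIN:
            scores[vote] += weight
    if not scores:
        return ABSTAIN, 0.0
    total = sum(scores.values())
    label = max(scores, key=scores.get)
    return label, scores[label] / total
```

The returned confidence can serve as a sample weight when training the downstream model, so that IPs where heuristics disagree contribute less than IPs where they agree.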
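For the asymmetric-cost thresholding in part 3, a common approach is to pick the class with the lowest expected cost under calibrated probabilities instead of the most probable class. The cost values below are made up for illustration; the only property taken from the question is that predicting home for a true office is penalized more heavily than other errors.

```python
CLASSES = ("home", "office", "public")

# COST[true_class][predicted_class]; values are illustrative assumptions.
COST = {
    "home": {"home": 0.0, "office": 1.0, "public": 1.0},
    "office": {"home": 5.0, "office": 0.0, "public": 1.0},
    "public": {"home": 3.0, "office": 1.0, "public": 0.0},
}


def expected_cost(probs, action):
    """Expected cost of predicting `action` given calibrated class probs."""
    return sum(probs[true] * COST[true][action] for true in CLASSES)


def decide(probs):
    """Pick the prediction that minimizes expected cost."""
    return min(CLASSES, key=lambda action: expected_cost(probs, action))
```

With probabilities of 0.5 home, 0.4 office, and 0.1 public, predicting home has expected cost 0.4 × 5 + 0.1 × 3 = 2.3, while predicting office costs 0.5 + 0.1 = 0.6, so the rule chooses office even though home is the most probable class. This only works as intended if the probabilities are calibrated, which is why calibration precedes thresholding.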
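The cross-geo temporal CV in part 4 can be sketched as forward-chaining folds in which training rows come strictly before a cutoff date and from non-held-out regions, while test rows come from the following period in held-out regions. The function name `temporal_geo_folds` and the row fields `day` and `region` are assumptions for this example.

```python
from datetime import date


def temporal_geo_folds(rows, cutoffs, holdout_regions):
    """Yield (train_idx, test_idx) index lists, one pair per cutoff.

    rows: list of dicts with 'day' (datetime.date) and 'region' keys.
    cutoffs: sorted list of dates; fold i tests on [cutoffs[i], cutoffs[i+1]).
    holdout_regions: set of regions never seen in training.
    """
    for i, cutoff in enumerate(cutoffs):
        end = cutoffs[i + 1] if i + 1 < len(cutoffs) else date.max
        # Train only on the past, and only on regions not held out.
        train = [
            j for j, r in enumerate(rows)
            if r["day"] < cutoff and r["region"] not in holdout_regions
        ]
        # Test on the next period, in the held-out regions.
        test = [
            j for j, r in enumerate(rows)
            if cutoff <= r["day"] < end and r["region"] in holdout_regions
        ]
        yield train, test
```

Because every training row predates every test row in a fold, features or labels derived from future behavior cannot leak into training through the split itself. Placing a cutoff just before a holiday period turns a fold into a holiday backtest.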

© 2026 PracHub. All rights reserved.