This question evaluates a data scientist's applied machine learning competencies, including privacy-preserving feature engineering, weak‑supervision labeling, model selection and calibration, uncertainty quantification, and operational deployment, for inferring session network context (home vs office vs public) from telemetry.

What difficulty level is this interview question?

This is a medium-difficulty Machine Learning question, commonly asked during Technical Screen rounds at Meta.

What role is this question designed for?

This question is commonly asked of Data Scientist candidates at Meta during technical interviews.

Build a model to infer home vs office vs public

You must infer whether a Facebook session’s network context is home, office, or public venue to inform Portal targeting. Constraints: IPs may be shared (NAT), dynamic, or CGNAT; households have multiple users; only privacy‑preserving telemetry is allowed (timestamps, coarse geolocation, ASN/ISP, device/app vs web, session lengths, concurrent sessions, contact‑graph features). Today is 2025-09-01. Build an ML approach:

1) Features: propose robust, leak‑free features capturing diurnal/weekly patterns, ISP/ASN type (residential vs enterprise vs mobile), IP stability, geolocation drift, concurrent user counts on the same IP, session inter‑arrival, device/browser/OS mix, reverse DNS hints, and calling‑graph closeness (e.g., kin vs coworker patterns). Explain how to handle apartments sharing a router and coffee‑shop Wi‑Fi.
2) Labels: design weak‑supervision strategies to obtain labels at scale (e.g., overnight dwell heuristics, business‑hours rules, known corporate ASNs, opted‑in seed users, store‑IP blacklists). Describe how you will de‑bias noisy labels.
3) Modeling: compare baseline rule lists vs gradient‑boosted trees vs sequence models (e.g., per‑IP HMM or transformer over events). Consider multi‑instance learning to aggregate session‑level predictions to user/household. Explain calibration and thresholding for asymmetric costs (misclassifying office as home).
4) Evaluation: define metrics (macro F1, expected cost), cross‑geo temporal CV, and backtests across holidays. Prevent leakage from future behavior and from using Portal adoption as a proxy. Quantify uncertainty.
5) Privacy/compliance: specify minimization, aggregation, retention, on‑device inference options, and red‑teaming for re‑identification risks.
6) Deployment: outline real‑time vs batch inference, drift monitoring, and a holdout plan to measure whether location‑type targeting improves conversion.
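
To make the "asymmetric costs" and "expected cost" parts of the Modeling and Evaluation items concrete, here is a minimal Python sketch of one possible decision rule: given calibrated class probabilities, pick the label with the lowest expected cost rather than the most probable label. The cost values, the class ordering, and the function names (`min_expected_cost_decision`, `mean_cost`) are assumptions chosen for illustration; the question does not specify them.

```python
import numpy as np

# Assumed class ordering for this sketch.
CLASSES = ["home", "office", "public"]

# Hypothetical cost matrix: COST[i, j] is the cost of predicting class j
# when the true class is i. The values are illustrative only; the one
# property taken from the question is that calling an office "home"
# is treated as the most expensive mistake.
COST = np.array([
    [0.0, 1.0, 1.0],  # true home
    [5.0, 0.0, 1.0],  # true office (predicting home is costliest)
    [3.0, 1.0, 0.0],  # true public
])


def min_expected_cost_decision(probs, cost=COST):
    """Return (chosen class index, expected cost) for each row of probs.

    probs must hold calibrated class probabilities, one row per session,
    with columns in CLASSES order.
    """
    probs = np.asarray(probs, dtype=float)
    # expected[n, j] = sum_i probs[n, i] * cost[i, j]
    expected = probs @ cost
    return expected.argmin(axis=1), expected.min(axis=1)


def mean_cost(y_true, y_pred, cost=COST):
    """Average realized cost of predictions, usable as an evaluation metric."""
    return cost[np.asarray(y_true), np.asarray(y_pred)].mean()


if __name__ == "__main__":
    probs = [[0.5, 0.4, 0.1], [0.9, 0.05, 0.05]]
    actions, costs = min_expected_cost_decision(probs)
    for a, c in zip(actions, costs):
        print(CLASSES[a], round(float(c), 3))
```

In the first example row, "home" is the most probable class, yet the rule selects "office" because a 0.4 chance of the costly office-as-home error outweighs the gain. This only behaves sensibly if the probabilities are well calibrated, which is why the question pairs calibration with thresholding.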
