PracHub
QuestionsCoachesLearningGuidesInterview Prep
|Home/Statistics & Math/Snapchat

Explain median vs mean for L1/L2

Last updated: Jun 24, 2026

Quick Overview

This question evaluates understanding of central tendency and loss functions—specifically the roles of median and mean under L1 (absolute) and L2 (squared) norms—and the geometric extension of these concepts to two-dimensional data.

  • medium
  • Snapchat
  • Statistics & Math
  • Software Engineer

Explain median vs mean for L1/L2

Company: Snapchat

Role: Software Engineer

Category: Statistics & Math

Difficulty: medium

Interview Round: Technical Screen

Explain, with intuition and a brief derivation, why the median minimizes the sum of absolute deviations (L 1) while the mean minimizes the sum of squared deviations (L 2). How does this extend to selecting an optimal point in two dimensions under Manhattan versus Euclidean distance, and when is the average a poor choice due to outliers?

Quick Answer: This question evaluates understanding of central tendency and loss functions—specifically the roles of median and mean under L1 (absolute) and L2 (squared) norms—and the geometric extension of these concepts to two-dimensional data.

Related Interview Questions

  • Compute posterior spam risk from flags - Snapchat (Medium)
  • Derive logistic regression and thresholds - Snapchat (hard)
  • Compute expectations and test fairness for coin flips - Snapchat (easy)
  • How to Update Bayesian Model for Concept Drift? - Snapchat (medium)
  • Calculate Posterior Probability Using Bayes' Theorem Example - Snapchat (easy)
|Home/Statistics & Math/Snapchat

Explain median vs mean for L1/L2

Snapchat logo
Snapchat
Sep 6, 2025, 12:00 AM
mediumSoftware EngineerTechnical ScreenStatistics & Math
14
0

Median vs. Mean Under L1 and L2 Loss, and the 2D Extension

Explain, with intuition and a brief derivation, the relationship between the choice of loss function and the optimal point estimate.

Specifically, address all of the following:

  1. 1D. Why does the median minimize the sum of absolute deviations ( L1L_1L1​ ), while the mean minimizes the sum of squared deviations ( L2L_2L2​ )?
  2. 2D. When you minimize total Manhattan distance to a set of 2D points versus total Euclidean distance, which estimator do you get in each case? (Be explicit about whether "Euclidean" means squared or unsquared distance — they differ.)
  3. Robustness. When is the average (mean) a poor choice, and what makes the median robust where the mean is not?

You should give the key optimality conditions / derivations (not just the answers), the geometric intuition for why each estimator falls out, and a short note on how you'd actually compute the median — including the optimal data structure and time/space complexity for the static and streaming cases.

Constraints & Assumptions

  • 1D data: x1,x2,…,xn∈Rx_1, x_2, \dots, x_n \in \mathbb{R}x1​,x2​,…,xn​∈R .
  • 2D points: pi=(xi,yi)∈R2p_i = (x_i, y_i) \in \mathbb{R}^2pi​=(xi​,yi​)∈R2 .
  • " L1L_1L1​ " = minimize the sum of absolute deviations; " L2L_2L2​ " = minimize the sum of squared deviations.
  • For 2D Euclidean distance, treat both the squared and unsquared forms — they yield different optima.
  • Assume all points are weighted equally unless you choose to discuss the weighted generalization.

Clarifying Questions to Ask

  • Is "Euclidean L2" the squared distance ∥pi−c∥22\lVert p_i - c\rVert_2^2∥pi​−c∥22​ or the unsquared distance ∥pi−c∥2\lVert p_i - c\rVert_2∥pi​−c∥2​ ? The answer changes the 2D optimum.
  • Is a closed-form / exact answer required, or is an iterative algorithm acceptable for the case that has no closed form?
  • For the computational part, is the data static (one-shot computation) or arriving as a stream (running median)? Bounded-memory / approximate answers OK?
  • Should the estimate be a single point, or is the full set of minimizers expected when it is not unique (e.g. even nnn )?

What a Strong Answer Covers

  • Correct optimality conditions, derived. J′(μ)=0⇒∑i(xi−μ)=0J'(\mu)=0 \Rightarrow \sum_i(x_i-\mu)=0J′(μ)=0⇒∑i​(xi​−μ)=0 for the mean; the subgradient/sign-count condition for the median; states both objectives are convex so the stationary point is global.
  • Intuition for both estimators. Mean = balance of linear restoring forces (center of mass); median = balance of counts on each side, independent of magnitudes.
  • The 2D separability argument. Recognizes that Manhattan and squared-Euclidean separate by coordinate (→ coordinate-wise median and centroid respectively), while unsquared Euclidean does not separate (→ geometric median, no closed form, solved by Weiszfeld's algorithm).
  • Robustness framed quantitatively. Mean breakdown point 0%0\%0% vs. median 50%50\%50% ; ties this to outliers, skew, and (optionally) multimodality; names robust alternatives.
  • Non-uniqueness handled. Even- nnn 1D median is an interval; the 2D Manhattan optimum is the product of the two coordinate intervals.
  • Computation. Mean in O(n)O(n)O(n) time / O(1)O(1)O(1) space; median via Quickselect ( O(n)O(n)O(n) average) or sort ( O(nlog⁡n)O(n\log n)O(nlogn) ); streaming median via two heaps ( O(log⁡n)O(\log n)O(logn) insert, O(1)O(1)O(1) query); correct complexity claims.

Follow-up Questions

  • Maintain a sliding-window median of the last kkk elements of a stream. What data structure gives O(log⁡k)O(\log k)O(logk) per slide, and what's the trick for deletion from a heap?
  • The mean is the maximum-likelihood estimate under Gaussian noise. Under what noise model is the median the MLE, and how does that connect to L1L_1L1​ loss?
  • Generalize to weighted data: what do the 1D minimizers become, and does the 2D Manhattan answer stay coordinate-wise separable?
  • For the geometric median, Weiszfeld's iteration divides by ∥pi−c(t)∥\lVert p_i - c^{(t)}\rVert∥pi​−c(t)∥ . What goes wrong when an iterate lands exactly on a data point, and how is it patched?
Loading comments...

Browse More Questions

More Statistics & Math•More Snapchat•More Software Engineer•Snapchat Software Engineer•Snapchat Statistics & Math•Software Engineer Statistics & Math

Write your answer

Your first approved answer each day earns 20 XP.

Sign in to write your answer.
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.