Median vs Mean Under L1 and L2, and the 2D Extension
Task
Explain, with intuition and a brief derivation:
- Why, in 1D, the median minimizes the sum of absolute deviations (L1), while the mean minimizes the sum of squared deviations (L2).
- Which estimator gives the optimal point in 2D under Manhattan distance (L1) and which under Euclidean distance (L2).
- When the average (mean) is a poor choice because of outliers.
Assumptions/Setup
- 1D data: x₁, x₂, …, xₙ ∈ ℝ.
- 2D points: pᵢ = (xᵢ, yᵢ) ∈ ℝ².
- “L1” refers to minimizing the sum of absolute deviations; “L2” refers to minimizing the sum of squared deviations. For 2D Euclidean distance, the loss may be squared or unsquared; both cases are addressed.
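
As a quick numerical sanity check of the claims in the task, the sketch below compares candidate centers on a small made-up dataset with one outlier. The data values, the helper names (l1_loss, l2_loss, manhattan_loss, sq_euclid_loss), and the grid search are illustrative assumptions, not part of the original problem; the grid search only stands in for a proper derivation.

```python
import statistics

# Hypothetical 1D sample with one large outlier (values chosen for illustration).
xs = [1.0, 2.0, 3.0, 4.0, 100.0]

def l1_loss(c, data):
    """Sum of absolute deviations of data from the center c."""
    return sum(abs(x - c) for x in data)

def l2_loss(c, data):
    """Sum of squared deviations of data from the center c."""
    return sum((x - c) ** 2 for x in data)

mean = statistics.fmean(xs)
median = statistics.median(xs)

# Brute-force search over a grid of candidate centers from 0.0 to 100.0.
candidates = [i / 10 for i in range(0, 1001)]
best_l1 = min(candidates, key=lambda c: l1_loss(c, xs))
best_l2 = min(candidates, key=lambda c: l2_loss(c, xs))

print("mean:", mean, "median:", median)
print("grid minimizer of L1 loss:", best_l1)
print("grid minimizer of L2 loss:", best_l2)

# Hypothetical 2D sample; the last point is an outlier in y.
pts = [(0.0, 0.0), (1.0, 5.0), (2.0, 1.0), (10.0, 2.0), (3.0, 100.0)]

def manhattan_loss(c, points):
    """Sum of Manhattan distances from the points to the center c."""
    return sum(abs(x - c[0]) + abs(y - c[1]) for x, y in points)

def sq_euclid_loss(c, points):
    """Sum of squared Euclidean distances from the points to the center c."""
    return sum((x - c[0]) ** 2 + (y - c[1]) ** 2 for x, y in points)

# Manhattan distance separates by coordinate, so each axis is a 1D L1 problem.
coord_median = (statistics.median(x for x, _ in pts),
                statistics.median(y for _, y in pts))
# Squared Euclidean distance also separates by coordinate, giving the centroid.
centroid = (statistics.fmean(x for x, _ in pts),
            statistics.fmean(y for _, y in pts))

print("coordinate-wise median:", coord_median)
print("centroid:", centroid)
```

In the 1D sample the outlier drags the mean far from the bulk of the data while the median stays among the typical values, which is the situation where the average is a poor summary. The unsquared Euclidean case is not covered by this sketch: its minimizer is the geometric median, which has no closed form in general and is usually found iteratively.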