PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates proficiency in probability and numerical methods, focusing on sampling from truncated distributions and analytical properties of estimators under various loss functions.

  • Medium
  • Google
  • Coding & Algorithms
  • Data Scientist

Implement Sampling and Minimize Loss in Numerical Coding

Company: Google

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

##### Scenario Numerical coding challenges on sampling and loss minimization. ##### Question a) Implement functions to sample from truncated normal distributions for x>1, 4<x<4.05, and x>4. b) For an array X, find the value minimizing Σ(x−θ)², then the value minimizing Σ|x−θ|, and derive the loss that yields the 90th percentile. ##### Hints Use rejection or CDF-inverse methods; derivatives show mean, median, and quantile solutions.

Quick Answer: This question evaluates proficiency in probability and numerical methods, focusing on sampling from truncated distributions and analytical properties of estimators under various loss functions.

Given an array X of n real numbers and a mode, return the value θ that minimizes a specified loss. Modes: (1) 'L2': minimize Σ(xi−θ)^2; return the arithmetic mean of X. (2) 'L1': minimize Σ|xi−θ|; return the smallest median (lower median when n is even). (3) 'quantile': return the τ-quantile, defined as the smallest θ such that at least τ fraction of elements are ≤ θ (nearest-rank/left quantile). If X is empty, raise ValueError. For mode 'quantile', τ must satisfy 0 < τ ≤ 1 (default τ = 0.9).

Constraints

  • 1 <= n <= 200000
  • |xi| <= 1e9
  • mode in {'L2','L1','quantile'}
  • For 'quantile' mode: 0 < tau <= 1
  • Return the smallest minimizer if multiple minimizers exist

Solution

from math import ceil

def minimize_loss(X: list, mode: str, tau: float = 0.9) -> float:
    if not isinstance(X, list) or len(X) == 0:
        raise ValueError("X must be a non-empty list")
    n = len(X)
    if mode not in ("L2", "L1", "quantile"):
        raise ValueError("mode must be one of 'L2', 'L1', 'quantile'")

    if mode == "L2":
        return sum(X) / n

    if mode == "quantile" and not (0 < tau <= 1):
        raise ValueError("tau must be in (0, 1] for 'quantile' mode")

    # Select k-th order statistic (0-indexed) using Quickselect.
    a = list(X)  # work on a copy to avoid mutating input

    if mode == "L1":
        k = (n - 1) // 2  # lower median
    else:  # mode == 'quantile'
        k = int(ceil(tau * n) - 1)

    def partition(arr, lo, hi, pivot_index):
        pv = arr[pivot_index]
        arr[pivot_index], arr[hi] = arr[hi], arr[pivot_index]
        store = lo
        for i in range(lo, hi):
            if arr[i] < pv:
                arr[store], arr[i] = arr[i], arr[store]
                store += 1
        arr[store], arr[hi] = arr[hi], arr[store]
        return store

    lo, hi = 0, n - 1
    while True:
        if lo == hi:
            return a[lo]
        pivot_index = (lo + hi) // 2
        p = partition(a, lo, hi, pivot_index)
        if k == p:
            return a[p]
        elif k < p:
            hi = p - 1
        else:
            lo = p + 1
Explanation
The minimizer of the squared loss Σ(x−θ)^2 is the arithmetic mean. For absolute loss Σ|x−θ|, any median minimizes the loss; choosing the lower median yields determinism. The τ-quantile is defined as the smallest value with at least τ fraction of observations not exceeding it (nearest-rank/left quantile), which corresponds to the element at index ceil(τ·n)−1 in the sorted order. We compute the required order statistic with Quickselect in expected linear time without sorting the entire array.

Time complexity: L2: O(n); L1 and quantile: O(n) average (Quickselect), O(n^2) worst-case. Space complexity: O(n) due to working copy; can be O(1) auxiliary if selecting in-place.

Hints

  1. For L2, set derivative of Σ(x−θ)^2 to zero to get θ = mean(X).
  2. For L1, any median minimizes the sum of absolute deviations; choose the lower median for determinism.
  3. For the τ-quantile, use k = ceil(τ*n) - 1 on the sorted order; compute k-th order statistic via Quickselect to avoid full sorting.
Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Solve Rooms and Top-K Streams - Google (medium)
  • Find Containing Range - Google (medium)
  • Rearrange Tasks With Cooldown - Google (medium)
  • Implement Employee Management and Expression Evaluation - Google (medium)
  • Solve Three Array and Matrix Path Problems - Google (medium)