Implement Sampling and Minimize Loss in Numerical Coding

Q: Implement Sampling and Minimize Loss in Numerical Coding

This question evaluates proficiency in probability and numerical methods, focusing on sampling from truncated distributions and analytical properties of estimators under various loss functions.


Q: What difficulty level is this coding question?

This is a Medium difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Google.

Q: What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Google during technical interviews.

Question

##### Scenario
Numerical coding challenges on sampling and loss minimization.

##### Question
a) Implement functions to sample from truncated normal distributions for x>1, 44.
b) For an array X, find the value minimizing Σ(x−θ)², then the value minimizing Σ|x−θ|, and derive the loss that yields the 90th percentile.

##### Hints
Use rejection or CDF-inverse methods; derivatives show mean, median, and quantile solutions.
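For part (a), the rejection and CDF-inverse methods named in the hints might look like the sketch below. It assumes a standard normal N(0, 1) truncated to x > `lower`. The function names and the `lower` and `rng` parameters are illustrative and do not come from the question. `statistics.NormalDist` from the Python standard library supplies the CDF and its inverse.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # assumption: mean 0, standard deviation 1


def sample_rejection(lower, rng=random):
    """Draw standard normals until one exceeds `lower`.

    Simple, but the expected number of draws is 1 / P(X > lower),
    so it becomes slow as `lower` moves into the tail.
    """
    while True:
        x = rng.gauss(0.0, 1.0)
        if x > lower:
            return x


def sample_inverse_cdf(lower, rng=random):
    """Inverse-CDF draw from N(0, 1) conditioned on X > lower.

    By symmetry, X > lower is the same event as -X < -lower, so the draw
    is made in the lower tail, where the CDF is small and better resolved.
    """
    tail = _STD_NORMAL.cdf(-lower)      # P(X > lower)
    u = 1.0 - rng.random()              # uniform on (0, 1], never exactly 0
    x = -_STD_NORMAL.inv_cdf(u * tail)  # lower-tail quantile, reflected
    return max(lower, x)                # guard against rounding at the boundary


rng = random.Random(0)
draws = [sample_inverse_cdf(1.0, rng) for _ in range(5)]
```

Every value in `draws` is at least 1.0. For thresholds deep in the tail, `cdf(-lower)` can underflow to zero and `inv_cdf` then raises an error, so this sketch is only suitable for moderate thresholds such as x > 1.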

PracHub · Accepted Answer

from math import ceil

def minimize_loss(X: list, mode: str, tau: float = 0.9) -> float:
    if not isinstance(X, list) or len(X) == 0:
        raise ValueError("X must be a non-empty list")
    n = len(X)
    if mode not in ("L2", "L1", "quantile"):
        raise ValueError("mode must be one of 'L2', 'L1', 'quantile'")

    if mode == "L2":
        return sum(X) / n

    if mode == "quantile" and not (0 < tau <= 1):
        raise ValueError("tau must be in (0, 1] for 'quantile' mode")

    # Select k-th order statistic (0-indexed) using Quickselect.
    a = list(X)  # work on a copy to avoid mutating input

    if mode == "L1":
        k = (n - 1) // 2  # lower median
    else:  # mode == 'quantile'
        k = int(ceil(tau * n) - 1)

    def partition(arr, lo, hi, pivot_index):
        pv = arr[pivot_index]
        arr[pivot_index], arr[hi] = arr[hi], arr[pivot_index]
        store = lo
        for i in range(lo, hi):
            if arr[i] < pv:
                arr[store], arr[i] = arr[i], arr[store]
                store += 1
        arr[store], arr[hi] = arr[hi], arr[store]
        return store

    lo, hi = 0, n - 1
    while True:
        if lo == hi:
            return a[lo]
        pivot_index = (lo + hi) // 2
        p = partition(a, lo, hi, pivot_index)
        if k == p:
            return a[p]
        elif k < p:
            hi = p - 1
        else:
            lo = p + 1

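The minimizer of the squared loss Σ(x−θ)² is the arithmetic mean: setting the derivative −2Σ(x−θ) to zero gives θ = Σx/n. For the absolute loss Σ|x−θ|, any median is a minimizer; the code returns the lower median so that the result is deterministic. The loss that yields the 90th percentile is the pinball (quantile) loss Σρ_τ(x−θ) with τ = 0.9, where ρ_τ(u) = τ·u for u ≥ 0 and (τ−1)·u for u < 0, so under-estimates are weighted by τ and over-estimates by 1−τ. The τ-quantile is taken to be the smallest value with at least a fraction τ of the observations not exceeding it (the nearest-rank, or left, quantile), which is the element at index ceil(τ·n)−1 of the sorted array. The required order statistic is found with Quickselect, which runs in linear time on average (quadratic in the worst case, since the pivot is always the middle element) and avoids sorting the whole array.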
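The link between the quantile and its loss can be checked numerically. The sketch below brute-forces the pinball loss over the observed values and compares the result with the nearest-rank index ceil(τ·n)−1. Searching only the observed values is enough because the loss is convex and piecewise linear, with kinks only at data points. The helper names and the sample array are illustrative and are not part of the accepted answer.

```python
from math import ceil


def pinball_loss(X, theta, tau):
    """Sum of tau*(x - theta) where x >= theta, else (1 - tau)*(theta - x)."""
    return sum(
        tau * (x - theta) if x >= theta else (1 - tau) * (theta - x)
        for x in X
    )


def best_data_point(X, tau):
    """Brute force: the observed value with the smallest pinball loss."""
    return min(sorted(set(X)), key=lambda theta: pinball_loss(X, theta, tau))


def nearest_rank(X, tau):
    """Order statistic at index ceil(tau * n) - 1 of the sorted data."""
    return sorted(X)[ceil(tau * len(X)) - 1]


# n = 15, chosen so that tau * n is not an integer and the minimizer is unique.
X = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]

q90_brute, q90_rank = best_data_point(X, 0.9), nearest_rank(X, 0.9)
med_brute, med_rank = best_data_point(X, 0.5), nearest_rank(X, 0.5)
```

With τ = 0.5 the pinball loss is half the absolute loss, so the same check also covers the median case. When τ·n is an integer the minimizer is a whole interval between two order statistics, and floating-point ties can make the brute-force choice differ from the nearest-rank one.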
