This question evaluates proficiency in streaming/out-of-core linear regression, including computing sufficient statistics with an intercept, assessing numerical stability of normal equations versus QR/SVD or incremental methods, incorporating ridge penalties, and designing parallel fault-tolerant computations.
You need to estimate linear regression coefficients when the dataset is too large to fit in memory. Assume we can read data in mini-batches of rows. Let X ∈ R^{n×p} be the feature matrix and y ∈ R^{n} the target. Include an intercept.
Login required