Improve low R² without p‑hacking

This question evaluates competence in statistical modeling and causal inference, covering regression diagnostics, feature engineering and interactions, appropriate error distributions and link functions, leakage detection, model selection and validation, and the trade-off between predictive accuracy and valid effect estimation.

Question

Predicting Contribution per Order with Low R²

Context

You are modeling contribution per order (a continuous per-order outcome such as margin or profit contribution) using a linear regression. The current model achieves R² = 0.07, indicating weak predictive performance. You care about both prediction accuracy and valid inference on key covariates (e.g., treatment effects, policy variables).

Tasks

(a) List concrete, practical steps to raise predictive performance without invalidating inference. Include:

Feature transformations (e.g., splines for basket size).
Interactions (e.g., treatment × daypart).
Appropriate error distribution/link (e.g., Gamma with log link) and when to use them.
Systematic leakage checks.
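
As one illustration of the transformations and interactions named in (a), the sketch below builds a single design row with a linear-spline (hinge) basis for basket size and treatment × daypart interaction terms. The knot locations, the daypart levels, and the helper names `hinge_basis` and `design_row` are assumptions made for the example; none of them come from the question.

```python
# Hypothetical daypart levels; the first one acts as the reference category.
DAYPARTS = ("morning", "afternoon", "evening")

def hinge_basis(x, knots):
    """Linear-spline basis: x itself plus one hinge term max(0, x - k) per knot."""
    x = float(x)
    return [x] + [max(0.0, x - k) for k in knots]

def design_row(basket_size, treated, daypart, knots=(5.0, 10.0, 20.0)):
    """Spline terms, treatment indicator, daypart dummies, treatment x daypart."""
    t = 1.0 if treated else 0.0
    dummies = [1.0 if daypart == d else 0.0 for d in DAYPARTS[1:]]
    interactions = [t * d for d in dummies]  # effect of treatment may differ by daypart
    return hinge_basis(basket_size, knots) + [t] + dummies + interactions

row = design_row(basket_size=12, treated=True, daypart="evening")
print(row)
```

Fixing the knots and the interaction set before looking at outcome data is what keeps this kind of feature engineering compatible with valid inference; choosing them by repeatedly checking p-values is not.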

(b) Will simply adding another covariate reliably increase R² out-of-sample? Use cross-validation (CV) to demonstrate why or why not, and propose alternatives (GAMs, quantile regression, gradient boosting) that balance predictive performance with effect-estimation goals.
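
The point behind (b) can be checked with a small simulation. For OLS with an intercept, in-sample R² cannot go down when columns are added, even pure-noise columns, while cross-validated R² can. The sketch below is dependency-free and uses simulated data; the sample size, coefficients, number of noise columns, and helper names are all assumptions for illustration.

```python
import random

def ols_fit(X, y):
    """OLS coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def r2(y, yhat):
    m = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - m) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def cv_r2(X, y, k=5):
    """R² computed from pooled held-out predictions across k folds."""
    n = len(y)
    preds = [0.0] * n
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        beta = ols_fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, predict(beta, [X[i] for i in test])):
            preds[i] = p
    return r2(y, preds)

# Simulated data: one weak real signal plus 8 columns of pure noise.
random.seed(0)
n = 120
x1 = [random.gauss(0, 1) for _ in range(n)]
noise_cols = [[random.gauss(0, 1) for _ in range(8)] for _ in range(n)]
y = [3.0 + 0.6 * a + random.gauss(0, 2) for a in x1]

X_base = [[a] for a in x1]
X_ext = [[a] + extra for a, extra in zip(x1, noise_cols)]

r2_in_base = r2(y, predict(ols_fit(X_base, y), X_base))
r2_in_ext = r2(y, predict(ols_fit(X_ext, y), X_ext))
r2_cv_base = cv_r2(X_base, y)
r2_cv_ext = cv_r2(X_ext, y)

print("in-sample R²: base", round(r2_in_base, 3), "extended", round(r2_in_ext, 3))
print("5-fold CV R²: base", round(r2_cv_base, 3), "extended", round(r2_cv_ext, 3))
```

Comparing the two printed lines shows the gap between fitted and held-out performance. In practice a library such as scikit-learn or statsmodels would replace the hand-written solver; it is written out here only to keep the sketch self-contained.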

(c) Show how to use nested cross-validation and target-leakage tests to guard against p-hacking while iterating on features/hyperparameters.
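
A minimal sketch of the two safeguards in (c), under stated assumptions: the data are simulated, the "hyperparameter" is just a choice among three candidate feature subsets, and all helper names are hypothetical. The inner loop selects a subset using only outer-training rows, the outer loop scores that choice on rows the selection never saw, and the same pipeline is then rerun on a shuffled target as a leakage check.

```python
import random

def ols_fit(X, y):
    """OLS coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def mse(beta, X, y):
    preds = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]
    return sum((a - b) ** 2 for a, b in zip(y, preds)) / len(y)

def take(X, rows, cols):
    return [[X[i][j] for j in cols] for i in rows]

def folds(idx, k):
    return [[i for pos, i in enumerate(idx) if pos % k == f] for f in range(k)]

def fit_and_score(X, y, train, test, cols):
    beta = ols_fit(take(X, train, cols), [y[i] for i in train])
    return mse(beta, take(X, test, cols), [y[i] for i in test])

def cv_mse(X, y, idx, cols, k):
    """Mean held-out MSE for feature subset cols, using only the rows in idx."""
    scores = []
    for test in folds(idx, k):
        held = set(test)
        train = [i for i in idx if i not in held]
        scores.append(fit_and_score(X, y, train, test, cols))
    return sum(scores) / k

def nested_cv(X, y, candidates, k_outer=5, k_inner=4):
    idx = list(range(len(y)))
    outer_scores, chosen = [], []
    for test in folds(idx, k_outer):
        held = set(test)
        train = [i for i in idx if i not in held]
        # Model selection sees outer-training rows only.
        best = min(candidates, key=lambda c: cv_mse(X, y, train, c, k_inner))
        outer_scores.append(fit_and_score(X, y, train, test, best))
        chosen.append(best)
    return outer_scores, chosen

random.seed(0)
n = 150
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(n)]
y = [2.0 * r[0] + 1.0 * r[1] + random.gauss(0, 1) for r in X]
candidates = [[0], [0, 1], [0, 1, 2]]

real_scores, chosen = nested_cv(X, y, candidates)

# Leakage check: with the target shuffled, the pipeline should find nothing.
y_shuffled = y[:]
random.shuffle(y_shuffled)
perm_scores, _ = nested_cv(X, y_shuffled, candidates)

mean_real = sum(real_scores) / len(real_scores)
mean_perm = sum(perm_scores) / len(perm_scores)
y_bar = sum(y) / n
var_y = sum((v - y_bar) ** 2 for v in y) / n
print("outer MSE, real target:    ", round(mean_real, 3))
print("outer MSE, shuffled target:", round(mean_perm, 3))
print("variance of y:             ", round(var_y, 3))
```

A shuffled-target error clearly below the variance of y would suggest that information about the outcome is reaching the features or the selection step. The outer-fold score is the number to report; the inner-fold scores are optimistically biased because they were used to choose the model.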

(d) Explain when a low R² is acceptable for an unbiased average treatment effect (ATE) but unacceptable for accurate individual predictions.
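
The distinction in (d) can be seen in a small simulated randomized experiment, sketched below. The true effect, the noise level, and the sample size are assumptions chosen for illustration. With random assignment, the difference in means recovers the ATE with a small standard error, even though the treatment indicator explains only a sliver of per-order variance and the error on any single order stays close to the noise level.

```python
import random

random.seed(1)
n = 20000
true_ate = 2.0    # assumed true average treatment effect
noise_sd = 10.0   # assumed per-order noise, large relative to the effect

# Randomized assignment, so treatment is independent of the noise.
treat = [random.random() < 0.5 for _ in range(n)]
y = [5.0 + true_ate * t + random.gauss(0.0, noise_sd) for t in treat]

treated = [yi for yi, t in zip(y, treat) if t]
control = [yi for yi, t in zip(y, treat) if not t]

def mean(v):
    return sum(v) / len(v)

def sample_var(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)

mean_t, mean_c = mean(treated), mean(control)
ate_hat = mean_t - mean_c
se = (sample_var(treated) / len(treated) + sample_var(control) / len(control)) ** 0.5

# Regressing y on the treatment dummy gives the group means as fitted values.
fitted = [mean_t if t else mean_c for t in treat]
y_bar = mean(y)
ss_res = sum((a - b) ** 2 for a, b in zip(y, fitted))
ss_tot = sum((a - y_bar) ** 2 for a in y)
r2 = 1.0 - ss_res / ss_tot
rmse = (ss_res / n) ** 0.5

print("ATE estimate:", round(ate_hat, 3), "+/-", round(1.96 * se, 3))
print("R²:", round(r2, 4))
print("per-order RMSE:", round(rmse, 3))
```

The standard error shrinks with sample size while the per-order RMSE does not, which is why the same model can be adequate for estimating an average effect and inadequate for order-level prediction.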
