What difficulty level is this interview question?

This is a medium difficulty Statistics & Math question, commonly asked during Technical Screen rounds at WeRide.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at WeRide during technical interviews.

Test whether two distributions differ | WeRide Interview Question

Q: Test whether two distributions differ

Test whether two datasets come from the same distribution using univariate and multivariate methods, including KS, categorical tests, classifier two-sample tests, MMD, permutation tests, effect sizes, multiple-testing control, clustering-aware uncertainty, and sampling designs.

You have two datasets collected from different systems or populations, and you want to determine whether they come from the same distribution.

Explain how you would test distribution equality in univariate and multivariate settings. Also describe common sampling methods useful for collecting evaluation data.

Constraints & Assumptions

Define the null and alternative hypotheses.
Choose tests based on variable type and whether the question is about means, proportions, tails, or full distributions.
Discuss assumptions such as independence, clustering, and sample comparability.
Pair statistical significance with practical significance.
Include multiple-testing, power, and sampling-design considerations.

Clarifying Questions to Ask

What decision will this test support?
Are observations independent or clustered by user, vehicle, route, time, or city?
Are the two samples supposed to represent the same target population?
Are we comparing one feature or many?
Do tails or rare events matter more than averages?

Part 1 - Define Hypotheses And Run Univariate Tests

How would you test distribution equality for one variable?

What This Part Should Cover

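H0: P = Q versus H1: P != Q.
Full-distribution tests for continuous variables, such as Kolmogorov-Smirnov (KS) or Anderson-Darling, or a permutation test on the Wasserstein or energy distance.
Categorical tests such as chi-squared, or Fisher's exact test when expected cell counts are small.
Mean-only tests such as Student's t-test or Welch's t-test when the question is narrower than full distribution equality.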
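
As a rough illustration of the full-distribution case, the sketch below computes the two-sample KS statistic by hand and calibrates it with a label-shuffling permutation test. It uses only the Python standard library. The function names, the synthetic Gaussian samples, and the permutation count are assumptions made for the example, not part of the original question.

```python
import random
from bisect import bisect_right


def ks_statistic(x, y):
    """Largest absolute gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    # The maximum gap is attained at an observed value, so checking those suffices.
    return max(
        abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
        for v in xs + ys
    )


def permutation_p_value(x, y, statistic, n_perm=2000, seed=0):
    """P-value under H0: P = Q, obtained by shuffling the group labels."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if statistic(pooled[:len(x)], pooled[len(x):]) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic data (assumption): two samples from N(0, 1) and one shifted by 1.
rng = random.Random(1)
same_a = [rng.gauss(0, 1) for _ in range(200)]
same_b = [rng.gauss(0, 1) for _ in range(200)]
shifted = [rng.gauss(1, 1) for _ in range(200)]

d_same, p_same = permutation_p_value(same_a, same_b, ks_statistic, n_perm=500)
d_shift, p_shift = permutation_p_value(same_a, shifted, ks_statistic, n_perm=500)
print(f"same distribution: D={d_same:.3f}, p={p_same:.3f}")
print(f"shifted by 1:      D={d_shift:.3f}, p={p_shift:.3f}")
```

Because `statistic` is passed in as an argument, the same permutation wrapper could be reused with a Wasserstein or energy distance. The permutation argument relies on observations being exchangeable under H0, so it would need to shuffle whole clusters if the data are clustered.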

Part 2 - Handle Multivariate Comparisons

How would you compare high-dimensional samples?

What This Part Should Cover

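Classifier two-sample tests, maximum mean discrepancy (MMD), energy distance, permutation tests to calibrate these statistics, Hotelling's T-squared for mean vectors under stronger assumptions (such as multivariate normality and equal covariances), and feature-level diagnostics.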
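
One way to make the MMD option concrete is the sketch below: a biased (V-statistic) estimate of squared MMD with a Gaussian kernel, calibrated by permuting sample labels over a precomputed kernel matrix. The fixed bandwidth of 1.0, the function names, and the small synthetic 2-D samples are assumptions for illustration. In practice the bandwidth is often chosen with a median-distance heuristic, which is not shown here.

```python
import math
import random


def rbf_kernel_matrix(points, bandwidth):
    """Gaussian kernel k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2))."""
    n = len(points)
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            k[i][j] = k[j][i] = math.exp(-sq / (2 * bandwidth ** 2))
    return k


def mmd2(k, idx_x, idx_y):
    """Biased estimate of squared MMD from a pooled kernel matrix."""
    def mean_k(rows, cols):
        return sum(k[i][j] for i in rows for j in cols) / (len(rows) * len(cols))
    return mean_k(idx_x, idx_x) + mean_k(idx_y, idx_y) - 2 * mean_k(idx_x, idx_y)


def mmd_permutation_test(x, y, bandwidth=1.0, n_perm=200, seed=0):
    """Return (observed MMD^2, permutation p-value) for H0: P = Q."""
    rng = random.Random(seed)
    k = rbf_kernel_matrix(list(x) + list(y), bandwidth)
    idx = list(range(len(x) + len(y)))
    observed = mmd2(k, idx[:len(x)], idx[len(x):])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)  # reassign pooled points to the two groups
        if mmd2(k, idx[:len(x)], idx[len(x):]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic data (assumption): 2-D Gaussians, second sample shifted in one coordinate.
rng = random.Random(1)
x = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
y = [(rng.gauss(1.5, 1), rng.gauss(0, 1)) for _ in range(40)]
stat, p_value = mmd_permutation_test(x, y)
print(f"MMD^2={stat:.4f}, p={p_value:.3f}")
```

The pairwise kernel matrix costs quadratic time and memory in the pooled sample size, so for large datasets a subsample or a classifier two-sample test may be the more practical choice.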

Part 3 - Interpret Results And Practical Issues

How would you handle statistical significance, practical significance, sample size, power, multiple testing, and dependence?

What This Part Should Cover

Effect sizes, confidence intervals, business thresholds, tail metrics, Bonferroni or Benjamini-Hochberg, clustered bootstrap, and target-population comparability.
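
Two of the items above can be sketched in a few lines of standard-library Python: Bonferroni versus Benjamini-Hochberg decisions on a set of per-feature p-values, and a clustered bootstrap interval for a difference in means that resamples whole clusters rather than individual rows. The p-values, the cluster structure, and all names here are hypothetical and chosen only to illustrate the mechanics.

```python
import random


def bonferroni(p_values, alpha=0.05):
    """Reject flags controlling the family-wise error rate at alpha."""
    return [p <= alpha / len(p_values) for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Reject flags controlling the false discovery rate at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff = rank  # keep the largest rank that passes
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject


def cluster_bootstrap_ci(clusters_a, clusters_b, n_boot=1000, level=0.95, seed=0):
    """Percentile CI for mean(B) - mean(A), resampling whole clusters."""
    rng = random.Random(seed)

    def resampled_mean(clusters):
        drawn = [rng.choice(clusters) for _ in clusters]
        values = [v for cluster in drawn for v in cluster]
        return sum(values) / len(values)

    diffs = sorted(
        resampled_mean(clusters_b) - resampled_mean(clusters_a)
        for _ in range(n_boot)
    )
    lo = diffs[int((1 - level) / 2 * n_boot)]
    hi = diffs[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


# Hypothetical per-feature p-values.
p_values = [0.2, 0.001, 0.045, 0.012, 0.014]
print("Bonferroni:", bonferroni(p_values))
print("BH:        ", benjamini_hochberg(p_values))


def make_clusters(rng, shift, n_clusters=20, size=10):
    """Synthetic clusters (assumption): a shared cluster effect plus row noise."""
    out = []
    for _ in range(n_clusters):
        effect = rng.gauss(shift, 1.0)
        out.append([effect + rng.gauss(0, 0.5) for _ in range(size)])
    return out


data_rng = random.Random(2)
system_a = make_clusters(data_rng, 0.0)
system_b = make_clusters(data_rng, 3.0)
low, high = cluster_bootstrap_ci(system_a, system_b)
print(f"95% CI for the mean difference: ({low:.2f}, {high:.2f})")
```

With these p-values Bonferroni rejects only the smallest one, while Benjamini-Hochberg rejects three, which shows the trade-off between family-wise error control and false discovery rate control. The interval should then be compared with a business threshold rather than with zero alone.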

Part 4 - Sampling Methods

What sampling methods are useful for evaluation data?

What This Part Should Cover

Simple random, stratified, cluster, multistage, systematic, importance or weighted, reservoir, and convenience sampling.
When to prefer stratified or oversample-then-reweight designs.
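
The oversample-then-reweight idea can be shown with a small sketch. It assumes a hypothetical population of 10,000 driving segments in which 1% belong to a rare scenario, draws the same number of units from each stratum, and attaches the inverse-inclusion weight N_h / n_h to each sampled unit. The field names `scenario` and `event` are made up for the example.

```python
import random


def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Draw a fixed number of units per stratum, each paired with weight N_h / n_h."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for name, units in strata.items():
        n_h = min(n_per_stratum[name], len(units))
        weight = len(units) / n_h  # each sampled unit stands for N_h / n_h units
        sample.extend((unit, weight) for unit in rng.sample(units, n_h))
    return sample


def weighted_mean(sample, value_of):
    """Weighted estimate of the population mean from (unit, weight) pairs."""
    total_weight = sum(w for _, w in sample)
    return sum(w * value_of(u) for u, w in sample) / total_weight


# Hypothetical population: the event occurs only in the rare scenario.
population = (
    [{"scenario": "routine", "event": 0} for _ in range(9900)]
    + [{"scenario": "rare", "event": 1} for _ in range(100)]
)
sample = stratified_sample(
    population, lambda u: u["scenario"], {"routine": 100, "rare": 100}
)

naive = sum(u["event"] for u, _ in sample) / len(sample)
reweighted = weighted_mean(sample, lambda u: u["event"])
print(f"true rate=0.01, naive={naive:.2f}, reweighted={reweighted:.2f}")
```

The unweighted rate in the oversampled set is 0.5, far above the population rate of 0.01, while the weighted estimate recovers 0.01. Oversampling buys precision on the rare stratum, and the weights restore the target-population proportions.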

What a Strong Answer Covers

Does not rely on a single universal test.
Matches the method to the variable type and decision.
Reports effect sizes and business relevance.
Understands that bad sampling can invalidate the comparison.

Follow-up Questions

Why can a large sample make tiny shifts significant?
How would you test rare safety-critical tails?
What if one dataset is rush-hour only and the other is all-day?
When would a classifier test be useful?
How would you reweight an oversampled evaluation set?
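
For the first follow-up, a short calculation may help. The sketch assumes a two-sample z-test with known, equal standard deviations and equal group sizes, and holds a tiny standardized shift (0.01 standard deviations, an arbitrary choice) fixed while the sample size grows.

```python
import math


def two_sample_z_p_value(delta, sigma, n):
    """Two-sided p-value for a mean difference delta with n per group, known sigma."""
    z = delta / (sigma * math.sqrt(2 / n))
    return math.erfc(abs(z) / math.sqrt(2))


# The effect size stays at 0.01 standard deviations; only n changes.
for n in (1_000, 100_000, 10_000_000):
    p = two_sample_z_p_value(delta=0.01, sigma=1.0, n=n)
    print(f"n per group={n:>10,}  p={p:.3g}")
```

The standard error shrinks like 1/sqrt(n), so the same negligible shift moves from clearly non-significant to overwhelmingly significant as n grows. That is why the p-value should be paired with an effect size and a practical threshold.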
