You fit a standard linear regression model (with intercept) using ordinary least squares (OLS). Suppose you have:
- Design matrix $X$ of size $n \times p$ ($p$ parameters including the intercept).
- Response vector $y$ of length $n$.
You now duplicate every observation once, forming a new dataset by stacking the original data under itself:
- New design matrix $X^* = \begin{bmatrix} X \\ X \end{bmatrix}$ of size $2n \times p$.
- New response vector $y^* = \begin{bmatrix} y \\ y \end{bmatrix}$ of length $2n$.
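The stacking step can be sketched in NumPy as follows. This is only an illustration of the construction: the sizes `n` and `p`, the random seed, and the simulated values are assumptions made for the example, not part of the problem.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
n, p = 20, 3                    # hypothetical sizes; p counts the intercept column

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.normal(size=n)          # placeholder response values

# Stack the original data under itself.
X_star = np.vstack([X, X])          # shape (2n, p)
y_star = np.concatenate([y, y])     # length 2n
```

Rows `n` to `2n - 1` of `X_star` are exact copies of rows `0` to `n - 1`, and the same holds for `y_star`.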
You refit the same regression model on this duplicated dataset using OLS and compute the usual summary statistics.
How do the following quantities change, if at all, compared with the original fit?
- The OLS coefficient estimates $\hat{\beta}$.
- The standard errors of the coefficients.
- The t-statistics for the coefficients.
- $R^2$.
- Adjusted $R^2$.
Explain your reasoning mathematically (you may use matrix notation) and also interpret the result intuitively.
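One way to check a derivation numerically is to compute the five quantities for both fits and compare them side by side. The sketch below uses the textbook OLS formulas; the helper name `ols_summary`, the simulated data, and the coefficient values used to generate `y` are hypothetical choices for illustration.

```python
import numpy as np

def ols_summary(X, y):
    """Return the usual OLS summary quantities for design matrix X and response y."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares coefficients
    resid = y - X @ beta
    rss = float(resid @ resid)                     # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())       # total sum of squares
    sigma2 = rss / (n - p)                         # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    return {"beta": beta, "se": se, "t": beta / se, "r2": r2, "adj_r2": adj_r2}

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(0)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

original = ols_summary(X, y)
duplicated = ols_summary(np.vstack([X, X]), np.concatenate([y, y]))

# Print each quantity for the original fit next to the duplicated fit.
for name in original:
    print(name, original[name], duplicated[name])
```

The printed comparison is a sanity check on the algebra, not a substitute for the mathematical explanation the question asks for.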