This question evaluates a candidate's skill in causal inference and data manipulation, specifically computing a difference-in-differences estimate from panel data and testing parallel trends via pre-period treatment-control mean comparisons.
You are given observational/experiment-like panel data as three equal-length arrays:
period[i]: an integer time period label for observation i (e.g., -2 and -1 are pre periods; 0 and 1 are post periods).
group[i]: 1 if the unit is in the treatment group, 0 if it is in the control group.
outcome[i]: the numeric outcome.
Tasks:
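Compute the difference-in-differences estimate from group mean outcomes, using the last pre period (max{period < 0}) and the first post period (min{period >= 0}).
Given a threshold, validate that treatment and control follow similar trends in the pre period by checking max(d_t) - min(d_t) <= threshold across all pre periods, where d_t is the treatment mean minus the control mean in pre period t.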
Output:
Assumptions:
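The tasks above can be sketched in plain Python as follows. This is a minimal illustration, not a reference solution. The function names (`group_means`, `did_estimate`, `parallel_trends_ok`) are hypothetical. It assumes that d_t means the treatment mean minus the control mean in pre period t, that at least one pre period and one post period exist, and that every period used contains observations from both groups. If those assumptions fail, the code raises an error rather than returning a value.

```python
from collections import defaultdict


def group_means(period, group, outcome):
    """Return the mean outcome for each (period, group) cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for p, g, y in zip(period, group, outcome):
        sums[(p, g)] += y
        counts[(p, g)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def did_estimate(period, group, outcome):
    """DiD using the last pre period and the first post period."""
    means = group_means(period, group, outcome)
    pre = max(p for p in period if p < 0)    # last pre period
    post = min(p for p in period if p >= 0)  # first post period
    treated_change = means[(post, 1)] - means[(pre, 1)]
    control_change = means[(post, 0)] - means[(pre, 0)]
    return treated_change - control_change


def parallel_trends_ok(period, group, outcome, threshold):
    """Check max(d_t) - min(d_t) <= threshold over all pre periods.

    Assumption: d_t is the treatment mean minus the control mean at t.
    """
    means = group_means(period, group, outcome)
    pre_periods = sorted({p for p in period if p < 0})
    gaps = [means[(t, 1)] - means[(t, 0)] for t in pre_periods]
    return max(gaps) - min(gaps) <= threshold


# Illustrative data (made up for this sketch): one unit per group per period.
period = [-2, -2, -1, -1, 0, 0, 1, 1]
group = [1, 0, 1, 0, 1, 0, 1, 0]
outcome = [10.0, 8.0, 11.0, 9.0, 15.0, 10.0, 16.0, 11.0]

print(did_estimate(period, group, outcome))             # 3.0
print(parallel_trends_ok(period, group, outcome, 0.5))  # True
```

In the illustrative data the pre-period gap is 2.0 in both pre periods, so the spread is 0 and the check passes. The treated group rises by 4.0 between period -1 and period 0 while the control group rises by 1.0, giving an estimate of 3.0.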