Your product adds in-post restaurant recommendations. Design the evaluation:

1) Define goals and success metrics at viewer and creator levels (e.g., CTR on recommendations, saves, downstream bookings/orders within 24h/7d, session time, creator engagement) and guardrails (dwell quality, hide/mute rates).
2) Experiment design: unit of randomization (viewer, post, or session), handling feed/network interference, ramp plan, and holdout design for creators.
3) MDE and power: given baseline recommendation CTR = 8% and MDE = +0.4 pp, outline sample size and test horizon; address multiple comparisons across locales/categories.
4) Cold-start and bias: ensure fair exposure for new restaurants and new users; handle popularity bias and position bias (use randomization or IPS).
5) Offline vs online evaluation: offline ranking metrics (NDCG@K, MAP) on logged data, counterfactual reweighting, and exploration via bandits while preserving unbiased treatment effect estimates.
6) Risk checks: measures to prevent irrelevant or unsafe suggestions; rollback criteria and post-launch monitoring for long-term retention effects.
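
For the MDE and power part, a rough sample-size estimate can be sketched as below. This is a minimal sketch under assumptions the prompt does not state: viewer-level randomization, a binary per-viewer outcome (clicked a recommendation or not), a two-sided two-proportion z-test at alpha = 0.05 with 80% power, and Bonferroni correction for the locale/category comparisons. The function names and the `daily_new_viewers` traffic figure are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_base, mde_abs, alpha=0.05, power=0.80, num_comparisons=1):
    """Approximate viewers needed per arm for a two-sided two-proportion z-test.

    p_base: baseline CTR (0.08 in the prompt).
    mde_abs: minimum detectable effect in absolute terms (0.004 for +0.4 pp).
    num_comparisons: planned comparisons (e.g. locales x categories);
        alpha is split evenly across them (Bonferroni), which is conservative.
    """
    alpha_adj = alpha / num_comparisons
    p_treat = p_base + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha_adj / 2)  # critical value, two-sided
    z_power = NormalDist().inv_cdf(power)              # quantile for target power
    # Unpooled variance of the difference in proportions, per observation pair.
    variance_sum = p_base * (1 - p_base) + p_treat * (1 - p_treat)
    return math.ceil((z_alpha + z_power) ** 2 * variance_sum / mde_abs ** 2)


def days_to_reach(n_per_arm, daily_new_viewers, arms=2):
    """Days until all arms are filled, given newly exposed viewers per day."""
    return math.ceil(arms * n_per_arm / daily_new_viewers)


if __name__ == "__main__":
    n_single = sample_size_per_arm(0.08, 0.004)
    n_multi = sample_size_per_arm(0.08, 0.004, num_comparisons=10)
    print("per arm, single comparison:", n_single)
    print("per arm, 10 Bonferroni-adjusted comparisons:", n_multi)
    # Hypothetical traffic: 20,000 newly eligible viewers per day.
    print("days at 20k viewers/day:", days_to_reach(n_single, 20_000))
```

With the prompt's numbers and a single comparison, this comes to roughly 74,000 viewers per arm. The estimate assumes independent viewers; feed or network interference, or randomizing by post or session with several impressions per viewer, would call for a cluster-adjusted variance and a larger sample. Whatever the arithmetic gives, the test horizon is usually rounded up to whole weeks so that day-of-week patterns in dining behavior are covered.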