A daily search log has one row per query event, and each row records the query string. You draw a 10% simple random sample of rows without replacement. Define a “unique query” (singleton) as a query string that appears exactly once in the full day’s log.

a) Explain why estimating the number of singletons by counting the singletons in the 10% sample and multiplying by 10 is biased; determine the direction of the bias and give the intuition.
b) Derive a better estimator using a frequency‑of‑frequencies model: relate the sampled counts f_k (the number of queries appearing exactly k times in the sample) to the population counts F_k (the number appearing exactly k times in the full log) under binomial thinning, and propose a Poisson/negative‑binomial mixture or a Good–Turing/Chao‑type estimator for F_1.
c) Outline how you would compute standard errors (delta method, bootstrap) and how you would diagnose model misspecification under heavy‑tailed query frequencies.
d) Describe a simulation plan to compare the estimators across realistic traffic distributions.
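
For parts (a) and (b), a small numerical sketch may help fix ideas. It assumes that sampling rows without replacement is well approximated by independent Bernoulli(p) thinning (reasonable when the log is large), so a query with population frequency j appears k times in the sample with probability C(j, k) p^k (1 - p)^(j - k). The function names, the toy population, and the capped Zipf traffic model are all hypothetical, chosen only for illustration; this is one possible starting point, not a recommended estimator.

```python
from math import comb

import numpy as np


def expected_sample_counts(F, p):
    """E[f_k] under Bernoulli(p) thinning (assumed approximation to SRS).

    F maps population frequency j -> F_j (number of queries seen j times).
    Returns a dict mapping sample frequency k >= 1 -> E[f_k].
    """
    out = {}
    for j, F_j in F.items():
        for k in range(1, j + 1):
            prob = comb(j, k) * p**k * (1 - p) ** (j - k)
            out[k] = out.get(k, 0.0) + F_j * prob
    return out


def naive_f1(f, p):
    """Scale-up estimator from the question: sample singletons divided by p."""
    return f.get(1, 0) / p


def inverted_f1(f, p):
    """Formal inversion of the thinning equations for F_1.

    Unbiased under the Bernoulli model, but the weights grow like
    ((1 - p) / p) ** (k - 1), so it is very noisy for small p.
    """
    r = (1 - p) / p
    return sum(k * (-r) ** (k - 1) * f_k / p for k, f_k in f.items())


def sample_counts_srs(freqs, p, rng):
    """Draw an exact SRS of rows without replacement; return {k: f_k}."""
    rows = np.repeat(np.arange(len(freqs)), freqs)  # one entry per log row
    n = int(round(p * len(rows)))
    picked = rng.choice(rows, size=n, replace=False)
    per_query = np.bincount(picked, minlength=len(freqs))
    ks, counts = np.unique(per_query[per_query > 0], return_counts=True)
    return {int(k): int(c) for k, c in zip(ks, counts)}


# Toy population (hypothetical): 100 singletons and 50 queries seen twice.
toy_F = {1: 100, 2: 50}
toy_expected = expected_sample_counts(toy_F, 0.1)
toy_naive = naive_f1(toy_expected, 0.1)        # about 190, versus F_1 = 100
toy_inverted = inverted_f1(toy_expected, 0.1)  # about 100 when fed exact expectations

# Heavy-tailed synthetic log (hypothetical): Zipf frequencies capped at 1000.
rng = np.random.default_rng(0)
freqs = np.minimum(rng.zipf(2.0, size=20_000), 1000)
true_F1 = int(np.sum(freqs == 1))
sim_naive = naive_f1(sample_counts_srs(freqs, 0.1, rng), 0.1)
```

In the toy case the naive estimator overshoots because each query seen twice in the full log still has an 18% chance of appearing exactly once in the sample, where it is indistinguishable from a true singleton; this suggests the bias is upward. The inverted estimator removes that bias in expectation, but its alternating, geometrically growing weights are the reason a smoothed alternative (a mixture model or a Good–Turing/Chao-type estimator) is worth comparing in the simulation of part (d).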