This question, from the Statistics & Math domain, tests understanding of statistical performance metrics and label-noise propagation. It requires computing precision, recall, and F1 for binary labels, given known annotator sensitivity and specificity and majority-vote aggregation.
You have two independent annotators who review videos and label them as "illegal" or "legal."
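As a rough sketch of the arithmetic this setup involves, the snippet below derives precision, recall, and F1 for the positive class ("illegal") from a labeler's sensitivity, specificity, and the prevalence of illegal videos. Everything named here is an assumption for illustration: the function names, the numeric values in the usage lines, the reading of "independent" as conditionally independent given the true label, and the two combination rules ("both flag" and "either flags"), which are not necessarily the policies the question asks about.

```python
def metrics_from_rates(sensitivity, specificity, prevalence):
    """Return (precision, recall, f1) for the positive class "illegal".

    sensitivity: P(label illegal | truly illegal)
    specificity: P(label legal   | truly legal)
    prevalence:  P(truly illegal)  (hypothetical input, not given in the text)
    """
    tp = prevalence * sensitivity                # true positive mass
    fp = (1 - prevalence) * (1 - specificity)    # false positive mass
    fn = prevalence * (1 - sensitivity)          # false negative mass
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


def combine_both(sens_a, spec_a, sens_b, spec_b):
    """Illustrative rule: flag only if BOTH annotators say "illegal".

    Assumes the annotators err independently given the true label.
    """
    sensitivity = sens_a * sens_b
    specificity = 1 - (1 - spec_a) * (1 - spec_b)
    return sensitivity, specificity


def combine_either(sens_a, spec_a, sens_b, spec_b):
    """Illustrative rule: flag if EITHER annotator says "illegal".

    Same conditional-independence assumption as combine_both.
    """
    sensitivity = 1 - (1 - sens_a) * (1 - sens_b)
    specificity = spec_a * spec_b
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the functions.
    sens, spec, prev = 0.9, 0.9, 0.5
    print("single annotator:", metrics_from_rates(sens, spec, prev))
    print("both flag:", metrics_from_rates(*combine_both(sens, spec, sens, spec), prev))
    print("either flags:", metrics_from_rates(*combine_either(sens, spec, sens, spec), prev))
```

Requiring both annotators to flag trades recall for precision, while flagging on either does the reverse. Which direction is preferable depends on the relative cost of missed illegal videos versus wrongly flagged legal ones.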
Policies to evaluate:
Tasks: