Compute precision/recall from a flaky top-k API | Microsoft Interview Question