Design video captioning under compute limits | TikTok Interview Question