Explain RL policy types and modern policy gradients | TikTok Interview Question