Answer the following conceptual questions:
- Transformer architecture
  - Describe the main components of a Transformer block and what each part does.
- GPT vs BERT
  - Explain the key differences in how GPT and BERT use the Transformer architecture and in their pretraining objectives.
  - When would you prefer one over the other?
- Precision and recall
  - Define precision and recall.
  - Give an example of how changing a classification threshold can trade off precision against recall.
  - Mention at least one scenario where you would prioritize precision and one where you would prioritize recall.