This machine learning question tests understanding of model calibration via temperature scaling: formulating the negative log-likelihood, deriving its analytic gradient with respect to a scalar temperature, and implementing an optimizer to learn that parameter.
You have a trained multi-class classifier that outputs logits z(x) ∈ R^K for input x (the classifier is fixed; only calibration is learned). Temperature scaling calibrates predicted probabilities as:
p_i(x; T) = softmax(z(x) / T)_i = exp(z_i(x) / T) / Σ_{j=1}^{K} exp(z_j(x) / T)
where T > 0 is a single scalar temperature shared across classes and inputs.
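As a concrete illustration of the scaling step (array shapes and function name here are assumptions, not part of the question):

```python
import numpy as np

def temperature_softmax(z, T):
    """Temperature-scaled softmax over the last axis; T > 0 is a scalar."""
    s = z / T
    s = s - s.max(axis=-1, keepdims=True)  # shift logits for numerical stability
    e = np.exp(s)
    return e / e.sum(axis=-1, keepdims=True)

z = np.array([2.0, 1.0, 0.1])
print(temperature_softmax(z, 1.0))  # sharper distribution
print(temperature_softmax(z, 5.0))  # flattened toward uniform
```

Note that T = 1 recovers the ordinary softmax, T > 1 softens the distribution, and T < 1 sharpens it; the argmax is unchanged for any T > 0, so accuracy is unaffected.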
You are given a held-out validation set with logits and true labels, and you must learn T by minimizing negative log-likelihood (NLL).
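One way the three requested pieces can fit together is sketched below: the NLL is L(T) = -(1/N) Σ_n log p_{y_n}(x_n; T), its gradient with respect to T is computed in closed form, and T is learned by gradient descent on log T to enforce positivity. This is a reference sketch under assumed conventions (logits as an (N, K) NumPy array, labels as integer class indices; all names are illustrative), not the unique intended solution.

```python
import numpy as np

def nll_and_grad(T, logits, labels):
    """NLL of temperature-scaled probabilities and its analytic dNLL/dT.

    logits: (N, K) float array; labels: (N,) int array; T: scalar > 0.
    """
    N = logits.shape[0]
    s = logits / T
    s = s - s.max(axis=1, keepdims=True)          # numerical stability
    log_probs = s - np.log(np.exp(s).sum(axis=1, keepdims=True))
    probs = np.exp(log_probs)
    nll = -log_probs[np.arange(N), labels].mean()
    # dNLL/dT = (1 / T^2) * mean_n( z_{n,y_n} - sum_j p_{n,j} z_{n,j} )
    grad = (logits[np.arange(N), labels]
            - (probs * logits).sum(axis=1)).mean() / T ** 2
    return nll, grad

def fit_temperature(logits, labels, lr=0.1, steps=500):
    """Minimize validation NLL over T by gradient descent on log T (keeps T > 0)."""
    log_T = 0.0                                    # start at T = 1
    for _ in range(steps):
        T = np.exp(log_T)
        _, g = nll_and_grad(T, logits, labels)
        log_T -= lr * g * T                        # chain rule: dNLL/dlogT = T * dNLL/dT
    return float(np.exp(log_T))
```

A bounded scalar minimizer (e.g. `scipy.optimize.minimize_scalar`) or an L-BFGS step in a deep learning framework would be typical alternatives; plain gradient descent keeps the sketch dependency-free beyond NumPy.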