PracHub

Derive logistic regression and thresholds

Last updated: Mar 29, 2026

Quick Overview

This question tests core logistic-regression competencies: deriving the Bernoulli log-likelihood, the gradient and Hessian of the L2-regularized objective, convexity reasoning, probability calibration and decision thresholds under asymmetric costs, numerically stable log-sigmoid expressions, and the analytic effects of class imbalance. It is filed under Statistics & Math for data scientist roles and is commonly asked because it pairs theoretical derivation (gradients, convexity proofs) with applied judgment about thresholds, calibration, numerical stability, and regularization.

  • hard
  • Snapchat
  • Statistics & Math
  • Data Scientist

Derive logistic regression and thresholds

Company: Snapchat

Role: Data Scientist

Category: Statistics & Math

Difficulty: hard

Interview Round: Onsite


Snapchat · Data Scientist · Onsite · Statistics & Math · Oct 13, 2025

Logistic Regression Deep Dive (Binary Classification)

Assume a binary classification setting with observations {(x_i, y_i)} for i=1..n, where x_i ∈ R^p (with an intercept term) and y_i ∈ {0,1}. Let η_i = x_i^T β and σ(z) be the logistic (sigmoid) function.

Tasks

  1. Write σ(z) and the Bernoulli log-likelihood for binary logistic regression. Derive the gradient and Hessian with respect to β for L2-regularized logistic regression, and explain why the (penalized) objective is convex.
  2. Single-feature example: with x ∈ R, β0 = −1.2 and β1 = 0.8, compute P(y=1 | x=2.0). Report the odds and the odds ratio for a one-unit increase in x.
  3. Decision threshold with costs: the positive-class base rate is 2%, and a false negative costs 10× a false positive. Compute the Bayes-optimal decision threshold and explain how you would calibrate probabilities (e.g., Platt scaling vs. isotonic).
  4. Numerical stability: give numerically stable expressions for log(σ(z)) and log(1 − σ(z)) and explain why they avoid overflow/underflow.
  5. Class imbalance: explain how severe class imbalance affects MLE estimates and which regularization or reweighting you would use. Justify analytically.

Solution
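For task 1, with η_i = x_i^T β, the penalized negative log-likelihood is J(β) = Σ_i [log(1 + e^{η_i}) − y_i η_i] + (λ/2)‖β‖², its gradient is ∇J = X^T(σ(Xβ) − y) + λβ, and its Hessian is H = X^T W X + λI with W = diag(σ(η_i)(1 − σ(η_i))). Since X^T W X is positive semidefinite and λI is positive definite, H ≽ λI ≻ 0 and the penalized objective is strictly convex. A minimal NumPy sketch (function and variable names are mine, not from the original) that verifies the analytic gradient against finite differences and checks H ≽ λI:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def objective(beta, X, y, lam):
    """J(b) = sum_i [log(1 + e^{eta_i}) - y_i * eta_i] + (lam/2) ||b||^2."""
    eta = X @ beta
    return np.sum(np.logaddexp(0.0, eta) - y * eta) + 0.5 * lam * beta @ beta

def gradient(beta, X, y, lam):
    """grad J = X^T (sigma(X b) - y) + lam * b."""
    return X.T @ (sigmoid(X @ beta) - y) + lam * beta

def hessian(beta, X, y, lam):
    """H = X^T W X + lam I, with W = diag(p_i (1 - p_i))."""
    p = sigmoid(X @ beta)
    return (X * (p * (1 - p))[:, None]).T @ X + lam * np.eye(X.shape[1])

# Finite-difference check of the analytic gradient on random data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (rng.random(50) < 0.5).astype(float)
beta, lam, eps = rng.normal(size=3), 0.1, 1e-6
g_num = np.array([
    (objective(beta + eps * e, X, y, lam)
     - objective(beta - eps * e, X, y, lam)) / (2 * eps)
    for e in np.eye(3)
])
assert np.allclose(g_num, gradient(beta, X, y, lam), atol=1e-4)
# Convexity: H >= lam * I, so its smallest eigenvalue is at least lam.
assert np.linalg.eigvalsh(hessian(beta, X, y, lam)).min() >= lam - 1e-8
```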
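For task 2 the arithmetic is direct: η = β0 + β1·x = −1.2 + 0.8·2.0 = 0.4, so P(y=1 | x=2.0) = σ(0.4) ≈ 0.599, the odds are e^{0.4} ≈ 1.492, and the odds ratio for a one-unit increase in x is e^{β1} = e^{0.8} ≈ 2.226. As a quick check:

```python
import math

beta0, beta1, x = -1.2, 0.8, 2.0
eta = beta0 + beta1 * x                # -1.2 + 1.6 = 0.4
p = 1.0 / (1.0 + math.exp(-eta))      # sigma(0.4)
odds = math.exp(eta)                  # equals p / (1 - p)
odds_ratio = math.exp(beta1)          # multiplicative change in odds per unit x

print(round(p, 4), round(odds, 4), round(odds_ratio, 4))
# -> 0.5987 1.4918 2.2255
```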
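For task 3, predict positive when the expected cost of calling negative exceeds that of calling positive: p·C_FN ≥ (1 − p)·C_FP, i.e. p ≥ C_FP / (C_FP + C_FN) = 1/11 ≈ 0.091. The 2% base rate does not change this threshold when p is well calibrated; it only means few examples will clear it. For calibration, Platt scaling fits a two-parameter sigmoid to held-out scores (robust with little data), while isotonic regression fits a nonparametric monotone map (more flexible, but needs more data and can overfit small calibration sets). The threshold itself, assuming calibrated scores:

```python
c_fp, c_fn = 1.0, 10.0   # a false negative costs 10x a false positive
# Predict positive iff p * c_fn >= (1 - p) * c_fp  =>  p >= c_fp / (c_fp + c_fn)
threshold = c_fp / (c_fp + c_fn)
print(threshold)         # -> 0.09090909090909091
```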
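For task 4, the stable identities are log σ(z) = −log(1 + e^{−z}) and log(1 − σ(z)) = log σ(−z) = −log(1 + e^{z}). Computing these via a stable log-sum-exp (NumPy's `logaddexp` internally uses the larger argument plus a `log1p` correction) never exponentiates a large positive number, so there is no overflow, and for large |z| the result degrades gracefully to ≈ −|z| instead of underflowing to log(0) = −∞. A sketch:

```python
import numpy as np

def log_sigmoid(z):
    """log(sigma(z)) = -log(1 + e^{-z}) = -logaddexp(0, -z), stable for all z."""
    return -np.logaddexp(0.0, -z)

def log1m_sigmoid(z):
    """log(1 - sigma(z)) = log(sigma(-z)) = -logaddexp(0, z)."""
    return -np.logaddexp(0.0, z)

z = np.array([-1000.0, -1.0, 0.0, 1.0, 1000.0])
# A naive np.log(1/(1+np.exp(-z))) overflows/underflows at |z| = 1000;
# these expressions stay finite and accurate.
assert np.all(np.isfinite(log_sigmoid(z)))
assert np.all(np.isfinite(log1m_sigmoid(z)))
assert abs(log_sigmoid(-1000.0) + 1000.0) < 1e-6    # log sigma(-1000) ~ -1000
assert abs(log_sigmoid(0.0) + np.log(2.0)) < 1e-12  # log sigma(0) = -log 2
```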
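For task 5: with a ~2% base rate, the variance of β̂ is driven by the scarce minority class, and (quasi-)separation can push the unpenalized MLE coefficients toward infinity. The L2 penalty, equivalent to a Gaussian prior on β, keeps the estimates finite; inverse-frequency class weights rescale the score equations so both classes contribute comparably, at the cost of shifting the fitted intercept by approximately log(w₁/w₀), so probabilities must be recalibrated afterwards if calibrated scores are needed. An illustrative weighted Newton solver under these assumptions (all names are mine, not a library API):

```python
import numpy as np

def fit_weighted_ridge_logit(X, y, lam=1.0, n_iter=25):
    """Newton's method for L2-penalized logistic regression with
    inverse-frequency class weights (illustrative sketch)."""
    n, d = X.shape
    # Weight each class inversely to its frequency: w_j = n / (2 * n_j).
    w = np.where(y == 1, n / (2 * y.sum()), n / (2 * (n - y.sum())))
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        g = X.T @ (w * (p - y)) + lam * beta                 # weighted gradient
        H = (X * (w * p * (1 - p))[:, None]).T @ X + lam * np.eye(d)
        beta -= np.linalg.solve(H, g)                        # Newton step
    return beta, g

# Toy data with rare positives (~2-3% base rate): intercept + one feature.
rng = np.random.default_rng(1)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(-3.9 + 0.8 * X[:, 1])))).astype(float)
beta, g = fit_weighted_ridge_logit(X, y, lam=1.0)
# The ridge penalty keeps coefficients finite despite the imbalance,
# and Newton converges (gradient ~ 0 at the optimum).
assert np.all(np.isfinite(beta))
assert np.linalg.norm(g) < 1e-6
```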
