PracHub

Explain PCA and L2 Normalization in Machine Learning

Last updated: Jun 15, 2026

Quick Overview

An Experian DataLabs Data Scientist technical-screen question that probes core machine-learning foundations: PCA dimensionality reduction and the right kind of normalization, deriving the logistic-regression gradient via backpropagation and generalizing to deep nets, baseline model selection, knowledge-informed ML, and decision-threshold tuning for FPR/TPR. It tests both mathematical fluency and practical model-design judgment, including the subtle point that a single threshold cannot improve TPR and FPR simultaneously without a better score ranking.
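The last point, about a single threshold, can be checked on a toy example. The sketch below is a minimal illustration and not part of the original question: the `scores` and `labels` values are made up, and `rates_at_threshold` is a hypothetical helper. It sweeps one threshold over a fixed set of scores and reports TPR and FPR at each setting.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TPR, FPR) when predicting positive for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    pos = sum(labels)          # number of actual positives
    neg = len(labels) - pos    # number of actual negatives
    return tp / pos, fp / neg


# Hypothetical model scores and true labels (1 = positive class).
scores = [0.95, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

# Sweep the threshold from strict to lenient; the score ranking never changes.
sweep = [(t, *rates_at_threshold(scores, labels, t)) for t in (0.9, 0.65, 0.5, 0.35, 0.1)]
for t, tpr, fpr in sweep:
    print(f"threshold={t:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

Lowering the threshold can only add predicted positives, so neither rate ever decreases: any gain in TPR comes with an equal or higher FPR. Raising TPR while lowering FPR would require scores that rank positives above negatives more cleanly, which means a better model rather than a different cut-off.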

Company: Experian

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Technical Screen

Aug 4, 2025, 10:55 AM

Scenario

Experian DataLabs Data Scientist technical screen — a machine-learning deep-dive on the modelling choices used in your project, mixed with conceptual questions (some OA / multiple-choice style).

Question

Walk through the core ML concepts behind a binary-classification project, covering preprocessing, modelling, optimization, and evaluation:

  1. Explain how PCA achieves dimensionality reduction and why you would (or would not) apply L2 normalization before training. Distinguish per-column standardization from per-sample (row) L2 normalization, and say when each matters.
  2. Derive the logistic-regression gradient via backpropagation, then generalize: describe how backpropagation works in modern multi-layer neural nets.
  3. What baseline models did you compare against, and why did you ultimately choose logistic regression?
  4. Define knowledge-informed machine learning and give a concrete example.
  5. When and how would you move the classification threshold to improve FPR or TPR? Can you improve both FPR and TPR at the same time by moving a single threshold?
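For item 1, the difference between per-column standardization and per-row L2 normalization can be made concrete with a small sketch. It is illustrative only: the two-feature data set is invented, the function names are hypothetical, and the explained-variance helper relies on the closed-form largest eigenvalue of a 2x2 covariance matrix, so it works for exactly two features.

```python
import math


def standardize_columns(X):
    """Per-column z-score: each feature gets mean 0 and unit variance.

    Assumes no column has zero variance.
    """
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) for j in range(d)]
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]


def l2_normalize_rows(X):
    """Per-sample L2 normalization: each row is rescaled to unit Euclidean length."""
    out = []
    for row in X:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / norm for v in row] if norm > 0 else list(row))
    return out


def first_pc_variance_ratio(X):
    """Share of total variance captured by the first principal component (2 features only)."""
    n = len(X)
    m0 = sum(r[0] for r in X) / n
    m1 = sum(r[1] for r in X) / n
    a = sum((r[0] - m0) ** 2 for r in X) / n             # variance of feature 0
    c = sum((r[1] - m1) ** 2 for r in X) / n             # variance of feature 1
    b = sum((r[0] - m0) * (r[1] - m1) for r in X) / n    # covariance
    # Largest eigenvalue of the symmetric covariance matrix [[a, b], [b, c]].
    top = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return top / (a + c)


# Hypothetical toy data: income in dollars, credit utilization as a fraction.
X = [[30000.0, 0.9], [52000.0, 0.2], [75000.0, 0.6], [41000.0, 0.4], [98000.0, 0.1]]

raw_ratio = first_pc_variance_ratio(X)
std_ratio = first_pc_variance_ratio(standardize_columns(X))
unit_rows = l2_normalize_rows(X)

print(f"first-PC variance share, raw columns:          {raw_ratio:.6f}")
print(f"first-PC variance share, standardized columns: {std_ratio:.6f}")
print("row-normalized samples:", unit_rows)
```

With these made-up numbers, the unscaled income column accounts for almost all of the variance, so the first component on raw data is essentially the income axis; standardizing the columns first lets both features contribute. Row L2 normalization does not address that imbalance, because it rescales each sample rather than each feature. It is more commonly used when only the direction of a sample vector matters, as in cosine-similarity comparisons.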

Hints

Discuss eigenvectors/explained variance, maximum-likelihood gradients (prediction error × input), the chain rule through layers, ROC/PR curves and cost-sensitive thresholds, model-selection criteria, and domain priors/constraints.
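The "prediction error × input" hint can be verified numerically. With z = w·x + b, p = sigmoid(z), and the log loss L = -[y log p + (1 - y) log(1 - p)], the chain rule gives dL/dz = p - y, so dL/dw = (p - y) x and dL/db = p - y. The sketch below, which uses invented data and hypothetical function names, compares that analytic gradient with a central finite-difference estimate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Predicted probability of the positive class for one sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def log_loss(w, b, X, y):
    """Mean binary cross-entropy over the data set."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(w, b, xi)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)


def analytic_gradient(w, b, X, y):
    """Gradient from the chain rule: mean of (prediction error) * input."""
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for xi, yi in zip(X, y):
        err = predict(w, b, xi) - yi      # dL/dz for this sample
        for j, xj in enumerate(xi):
            grad_w[j] += err * xj         # dL/dw_j = err * x_j
        grad_b += err                     # dL/db = err
    n = len(X)
    return [g / n for g in grad_w], grad_b / n


def numeric_gradient(w, b, X, y, eps=1e-6):
    """Central finite differences, used only as an independent check."""
    grad_w = []
    for j in range(len(w)):
        w_plus = w[:j] + [w[j] + eps] + w[j + 1:]
        w_minus = w[:j] + [w[j] - eps] + w[j + 1:]
        grad_w.append((log_loss(w_plus, b, X, y) - log_loss(w_minus, b, X, y)) / (2 * eps))
    grad_b = (log_loss(w, b + eps, X, y) - log_loss(w, b - eps, X, y)) / (2 * eps)
    return grad_w, grad_b


# Hypothetical data and parameters.
X = [[0.5, 1.2], [-1.0, 0.3], [1.5, -0.7], [0.2, 0.8]]
y = [1, 0, 1, 0]
w, b = [0.1, -0.2], 0.05

gw_a, gb_a = analytic_gradient(w, b, X, y)
gw_n, gb_n = numeric_gradient(w, b, X, y)
max_diff = max([abs(a - n) for a, n in zip(gw_a, gw_n)] + [abs(gb_a - gb_n)])
print("largest analytic-vs-numeric difference:", max_diff)
```

The same pattern extends to multi-layer networks: each layer multiplies the incoming error signal by its local derivative and passes the result backward, which is the chain rule applied one layer at a time.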
