PracHub

Design a production face recognition system

Last updated: Mar 29, 2026

Quick Overview

This question evaluates production ML system design and engineering skills specific to face recognition: on-device model selection and embedding design, training objectives and evaluation protocols, anti-spoofing and robustness mechanisms, privacy and template protection, performance and resource constraints, fairness and monitoring, and safe experimentation. It is asked in the Machine Learning domain for Data Scientist roles because it probes the trade-offs between privacy, security, latency, and scalability in real-world systems, testing both conceptual understanding of biometric and security principles and the practical skills needed to deploy and operate on-device ML.

  • hard
  • Capital One
  • Machine Learning
  • Data Scientist

Design a production face recognition system

Company: Capital One

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite

Design an on-device face-recognition system for mobile access control serving 50M monthly active users with intermittent connectivity. Decide verification vs. identification and justify. Specify model family and embedding dimension, training objective (e.g., ArcFace vs. triplet loss), data scale/augmentation, and evaluation protocol (ROC/DET) with target thresholds (e.g., FAR ≤ 0.001 at TPR ≥ 0.98). Address liveness/anti-spoofing (2D/3D/IR), occlusions (masks/glasses), demographic fairness (threshold calibration across cohorts), privacy (on-device storage, differential privacy, opt-out), and security (template protection, replay attacks). Set p95 latency (<150 ms), memory (<50 MB), and battery constraints for typical mid-tier devices. Choose on-device vs. server inference and describe fallback when offline. Outline monitoring for drift and periodic re-enrollment, and how you would safely A/B test the system (shadow mode, guardrails).
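For mobile access control the user implicitly claims an identity by holding their own device, so 1:1 verification against an enrolled on-device template is the natural choice over 1:N identification. A minimal sketch of the matching step, assuming L2-normalized embeddings; the 512-d dimension and the 0.35 threshold are illustrative placeholders, not values from the prompt:

```python
import numpy as np

def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Project an embedding onto the unit hypersphere."""
    return v / np.linalg.norm(v)

def verify(probe: np.ndarray, template: np.ndarray, threshold: float = 0.35) -> bool:
    """1:1 verification: accept if cosine similarity clears the threshold.

    In production the threshold would be read off a ROC/DET curve so that
    FAR <= 0.001 at TPR >= 0.98, as the prompt specifies.
    """
    score = float(np.dot(l2_normalize(probe), l2_normalize(template)))
    return score >= threshold

# Toy usage with random 512-d embeddings standing in for a real face encoder.
rng = np.random.default_rng(0)
enrolled = rng.normal(size=512)
same_user = enrolled + 0.1 * rng.normal(size=512)  # small perturbation of the template
stranger = rng.normal(size=512)                    # independent embedding
print(verify(same_user, enrolled))   # expected to accept
print(verify(stranger, enrolled))    # expected to reject
```

Normalizing both embeddings keeps the score in [-1, 1] and matches the hypersphere geometry that margin-based objectives like ArcFace train for.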

Posted: Oct 13, 2025

Design an On-Device Face Recognition System for Mobile Access Control

Context

You are designing a face-based access control system for mobile devices with 50M monthly active users. Devices often operate with intermittent connectivity, so the system must work fully offline and sync/update when online. The product must meet strong privacy and security requirements typical for consumer finance and protect against spoofing.

Requirements

  1. Choose between verification (1:1) and identification (1:N) and justify the choice.
  2. Specify:
    • Model family and embedding dimension.
    • Training objective (e.g., ArcFace vs. triplet loss).
    • Data scale and augmentations.
    • Evaluation protocol (ROC/DET) and target thresholds (e.g., FAR ≤ 0.001 at TPR ≥ 0.98).
  3. Address robustness:
    • Liveness/anti-spoofing (2D/3D/IR), replay attacks.
    • Occlusions (masks, glasses), low light, motion blur.
    • Demographic fairness and threshold calibration across cohorts.
  4. Privacy and security:
    • On-device storage, template protection, differential privacy, opt-out.
    • Secure model updates, integrity, and abuse prevention.
  5. Performance constraints for mid-tier devices:
    • p95 latency < 150 ms per unlock.
    • Memory footprint < 50 MB total.
    • Battery impact minimal.
  6. On-device vs. server inference and offline fallback behavior.
  7. Monitoring for data drift and periodic re-enrollment.
  8. Safe A/B testing plan (shadow mode, guardrails).
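The evaluation targets in requirement 2 can be made operational by choosing the decision threshold from the impostor-score distribution and then checking TPR at that threshold. A hedged sketch using synthetic genuine/impostor score distributions (the function names and the Gaussian scores are illustrative stand-ins, not from the prompt):

```python
import numpy as np

def far_tpr_at_threshold(genuine: np.ndarray, impostor: np.ndarray, t: float):
    """False Accept Rate and True Positive Rate at decision threshold t."""
    far = float(np.mean(impostor >= t))  # impostor pairs wrongly accepted
    tpr = float(np.mean(genuine >= t))   # genuine pairs correctly accepted
    return far, tpr

def threshold_for_far(impostor: np.ndarray, target_far: float) -> float:
    """Threshold whose empirical FAR is approximately target_far."""
    # Scores at or above the (1 - target_far) quantile are the accepted impostors.
    return float(np.quantile(impostor, 1.0 - target_far))

# Synthetic, well-separated score distributions (illustrative only).
rng = np.random.default_rng(1)
genuine = rng.normal(0.7, 0.1, 100_000)
impostor = rng.normal(0.1, 0.1, 100_000)

t = threshold_for_far(impostor, target_far=0.001)
far, tpr = far_tpr_at_threshold(genuine, impostor, t)
```

Sweeping `t` over the full score range traces the ROC/DET curve; per-cohort versions of the same computation support the demographic threshold calibration in requirement 3.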

