Capital One Machine Learning Interview Questions
Evaluate OutlierHandler Class for Code Quality and Testing
Code Review: OutlierHandler and Imputer Classes. Context: You are given a Python module that implements one OutlierHandler class and three Imputer class...
Diagnose Multicollinearity in Flight Delay Prediction Model
Flight Delay Prediction: Data Quality, Modeling Choice, and Multicollinearity. Scenario: You have historical flight operations and weather data and nee...
Evaluate Models for Credit-Risk Scoring at Capital One
Scenario: You are building a production-grade credit-risk scoring model (predicting probability of default within a fixed horizon) for Capital One. The...
Build and evaluate donation propensity model
You need a model to maximize expected net revenue from solicitations. Costs: online reach costs $1 per person; gala attendance costs $100 per attendee...
Present and defend your data challenge end-to-end
10–12 Minute Interviewer-Driven Walkthrough: Recent Data Challenge. Provide a concise, structured walkthrough of a real project you led end-to-end. Ass...
Build and evaluate airline delay prediction model
You are given several CSVs for the classic airline delay challenge with columns like flight_date, carrier, flight_num, origin, dest, sched_dep, sched_...
Design ML deployment with GitHub and Jenkins
Design an end‑to‑end ML deployment for a prediction model using GitHub and Jenkins: 1) Propose a repo layout (src/, features/, data_contracts/, tests/...
Choose and justify ML algorithms for tabular prediction
You must choose an algorithm for tabular prediction of arrival delay under these constraints: 500k rows, 120 features (mixed numeric/categorical with ...
Design a robust fraud detection system
Real-Time Card Fraud Detector: End-to-End Design. Context:
- Fraud base rate ≈ 0.2% (severe class imbalance)
- Labels arrive with a 14-day delay (e.g.,...
Build and validate a binary classifier
ML Pipeline with Grouped CV, Imbalance Handling, Calibration, and Thresholding. Context: You have a labeled dataset where the target is is_active_30d (...
Explain MSE vs MAE, AUC, and imbalance handling
ML interview: losses, metrics, class imbalance, and thresholding. Answer all parts concisely and precisely. 1) MAE vs. MSE in regression: When would you...
Model flight delays with EDA and explanation
Predicting 15+ Minute Arrival Delays at Scheduled-Departure Time. You are building a binary classifier that predicts whether a domestic flight will arr...
Evaluate and monitor a credit risk model
Credit-Risk PD Model: Evaluation Priorities and End-to-End Plan. Context: You are deploying a consumer credit probability-of-default (PD) model for 12-...
Design a production face recognition system
Design an On-Device Face Recognition System for Mobile Access Control. Context: You are designing a face-based access control system for mobile devices ...
Identify Risks and Improve Imputation Class Implementations
Scenario: You are reviewing three custom Python imputation classes intended for use in a scikit-learn workflow. Each class fills missing values column-...
Evaluate Python Class Design in Data Pipeline
Scenario: You are reviewing a Python class used in an ML/data pipeline that follows the scikit-learn-style fit/transform pattern. Assume a typical tran...