Capital One Data Scientist Machine Learning Interview Questions

Master your tech interview with our curated database of real questions from top companies.

16 Questions 1 Company

Capital One

Medium

Data Scientist

Evaluate OutlierHandler Class for Code Quality and Testing

Code Review: OutlierHandler and Imputer Classes. Context: You are given a Python module that implements one OutlierHandler class and three Imputer class...

Diagnose Multicollinearity in Flight Delay Prediction Model

Flight Delay Prediction: Data Quality, Modeling Choice, and Multicollinearity. Scenario: You have historical flight operations and weather data and nee...
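One common way to diagnose multicollinearity is the variance inflation factor (VIF). The sketch below is not taken from the question itself. It covers only the two-predictor case, where the VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between the two predictors. The column names and values are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x, y):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2).

    Raises ZeroDivisionError if the predictors are perfectly collinear.
    """
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


# Hypothetical feature columns (illustrative values, not real flight data).
sched_hour = [1.0, 2.0, 3.0, 4.0, 5.0]
near_copy = [2.1, 3.9, 6.2, 7.8, 10.1]   # almost a linear function of sched_hour
unrelated = [3.0, 1.0, 4.0, 1.0, 5.0]    # only weakly related to sched_hour

vif_high = vif_two_predictors(sched_hour, near_copy)
vif_low = vif_two_predictors(sched_hour, unrelated)
```

A frequently quoted rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic collinearity. With more than two predictors, each VIF comes from regressing one predictor on all the others.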

Evaluate Models for Credit-Risk Scoring at Capital One

Scenario: You are building a production-grade credit-risk scoring model (predicting probability of default within a fixed horizon) for Capital One. The...

Build and evaluate donation propensity model

You need a model to maximize expected net revenue from solicitations. Costs: online reach costs $1 per person; gala attendance costs $100 per attendee...

Present and defend your data challenge end-to-end

10–12 Minute Interviewer-Driven Walkthrough: Recent Data Challenge. Provide a concise, structured walkthrough of a real project you led end-to-end. Ass...

Build and evaluate airline delay prediction model

You are given several CSVs for the classic airline delay challenge with columns like flight_date, carrier, flight_num, origin, dest, sched_dep, sched_...

Design ML deployment with GitHub and Jenkins

Design an end‑to‑end ML deployment for a prediction model using GitHub and Jenkins: 1) Propose a repo layout (src/, features/, data_contracts/, tests/...

Choose and justify ML algorithms for tabular prediction

You must choose an algorithm for tabular prediction of arrival delay under these constraints: 500k rows, 120 features (mixed numeric/categorical with ...

Design a robust fraud detection system

Real-Time Card Fraud Detector: End-to-End Design. Context: fraud base rate ≈ 0.2% (severe class imbalance); labels arrive with a 14-day delay (e.g.,...
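The 0.2% base rate is why accuracy is a poor metric for this problem. The arithmetic can be sketched as below, assuming a hypothetical detector with 80% recall and a 1% false positive rate. Only the base rate comes from the question; the other two numbers are illustrative.

```python
def precision_at_base_rate(base_rate, recall, false_positive_rate):
    """Precision implied by a base rate, recall (TPR), and FPR via Bayes' rule."""
    true_pos = base_rate * recall
    false_pos = (1.0 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)


BASE_RATE = 0.002          # from the question: fraud base rate of about 0.2%
ASSUMED_RECALL = 0.80      # hypothetical detector recall
ASSUMED_FPR = 0.01         # hypothetical false positive rate

# A model that never flags fraud is right on every legitimate transaction.
always_negative_accuracy = 1.0 - BASE_RATE

# Even a seemingly strong detector flags mostly legitimate transactions.
precision = precision_at_base_rate(BASE_RATE, ASSUMED_RECALL, ASSUMED_FPR)
```

Under these assumed rates, most alerts are false positives even though the trivial always-negative model scores 99.8% accuracy. This is why precision, recall, and cost-weighted thresholds are the usual focus in this kind of design.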

Build and validate a binary classifier

ML Pipeline with Grouped CV, Imbalance Handling, Calibration, and Thresholding. Context: You have a labeled dataset where the target is is_active_30d (...

Explain MSE vs MAE, AUC, and imbalance handling

ML interview: losses, metrics, class imbalance, and thresholding. Answer all parts concisely and precisely. 1) MAE vs. MSE in regression: When would you...
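As a minimal illustration of the MAE vs. MSE part, the sketch below computes both losses on a tiny made-up dataset, with and without a single large error. The data and variable names are hypothetical. The point is only that squaring makes MSE far more sensitive to outliers than MAE.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: average of (y - y_hat)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [10.0, 12.0, 11.0, 13.0]
pred_clean = [11.0, 11.0, 12.0, 12.0]    # every absolute error is 1
pred_outlier = [11.0, 11.0, 12.0, 23.0]  # last absolute error is 10

mae_clean = mae(y_true, pred_clean)      # (1 + 1 + 1 + 1) / 4 = 1.0
mse_clean = mse(y_true, pred_clean)      # (1 + 1 + 1 + 1) / 4 = 1.0
mae_outlier = mae(y_true, pred_outlier)  # (1 + 1 + 1 + 10) / 4 = 3.25
mse_outlier = mse(y_true, pred_outlier)  # (1 + 1 + 1 + 100) / 4 = 25.75
```

One bad prediction roughly triples the MAE but multiplies the MSE by more than 25. This is the usual argument for MAE when outliers are noise and for MSE when large errors are genuinely costly.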

Model flight delays with EDA and explanation

Predicting 15+ Minute Arrival Delays at Scheduled-Departure Time. You are building a binary classifier that predicts whether a domestic flight will arr...

Evaluate and monitor a credit risk model

Credit-Risk PD Model: Evaluation Priorities and End-to-End Plan. Context: You are deploying a consumer credit probability-of-default (PD) model for 12-...

Design a production face recognition system

Design an On-Device Face Recognition System for Mobile Access Control. Context: You are designing a face-based access control system for mobile devices ...

Identify Risks and Improve Imputation Class Implementations

Scenario: You are reviewing three custom Python imputation classes intended for use in a scikit-learn workflow. Each class fills missing values column-...

Evaluate Python Class Design in Data Pipeline

Scenario: You are reviewing a Python class used in an ML/data pipeline that follows the scikit-learn-style fit/transform pattern. Assume a typical tran...

Machine Learning

Aug 4, 2025