How do I approach ML System Design interview questions?

ML System Design questions require a solid understanding of core concepts and regular practice. PracHub provides solutions with explanations to help you master ML System Design interviews.

What difficulty level is this interview question?

This is a hard difficulty ML System Design question, commonly asked during Onsite rounds at Amazon.

What role is this question designed for?

This question is commonly asked for Machine Learning Engineer candidates at Amazon during technical interviews.

Design systems for global request detection and labeling

Quick Overview

This question evaluates the ability to design scalable, low-latency ML systems for global streaming event detection and rapid labeling under extreme class imbalance, assessing competencies in stream ingestion and partitioning, time-windowed aggregations, serving/alerting layers, and end-to-end labeling pipelines.

Answer the following ML system design questions. State assumptions, propose an architecture, and discuss scaling, latency, and reliability.

1) Global device request detection (streaming)

An internal platform receives IT requests from many device types across the world.

Data volume is very large.
Events update continuously.
Timestamps are precise to milliseconds.

Design a system that can quickly detect “where requests are happening” (e.g., by region/site/device type) in near real-time.

Cover:

Ingestion, partitioning/sharding, storage
Stream processing and aggregations (time windows)
Query/serving layer (dashboards/alerts)
Handling out-of-order events, duplicates, clock skew
Reliability and SLOs
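
One way to make the windowing and out-of-order points above concrete is a small, single-process sketch of event-time tumbling windows. It is an illustration under assumptions, not a reference answer: the `Event` fields, the `WindowedCounter` class, the 60-second window, and the 30-second allowed lateness are all hypothetical choices, and a production system would run this logic inside a distributed stream processor with durable state.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical schema; the question does not specify field names.
    event_id: str
    ts_ms: int          # event time in milliseconds
    region: str
    site: str
    device_type: str


class WindowedCounter:
    """Counts events per (region, site, device_type) in event-time tumbling windows."""

    def __init__(self, window_ms=60_000, allowed_lateness_ms=30_000):
        self.window_ms = window_ms
        self.allowed_lateness_ms = allowed_lateness_ms
        self.max_ts_ms = 0                                   # highest event time seen
        self.open = defaultdict(lambda: defaultdict(int))    # window start -> key -> count
        self.seen = defaultdict(set)                         # window start -> event ids
        self.dropped_late = 0                                # events that missed their window

    @property
    def watermark_ms(self):
        # Windows ending at or before the watermark are considered complete.
        return self.max_ts_ms - self.allowed_lateness_ms

    def add(self, e):
        """Ingest one event; return finalized (window_start, key, count) tuples."""
        start = e.ts_ms - e.ts_ms % self.window_ms
        if start + self.window_ms <= self.watermark_ms:
            self.dropped_late += 1       # window already closed; count it for monitoring
            return []
        if e.event_id not in self.seen[start]:               # drop duplicate deliveries
            self.seen[start].add(e.event_id)
            self.open[start][(e.region, e.site, e.device_type)] += 1
        self.max_ts_ms = max(self.max_ts_ms, e.ts_ms)
        return self._close_ready()

    def _close_ready(self):
        out = []
        for start in sorted(self.open):
            if start + self.window_ms > self.watermark_ms:
                break                    # this and all later windows are still open
            for key, n in self.open.pop(start).items():
                out.append((start, key, n))
            self.seen.pop(start, None)   # free dedup state for the closed window
        return out
```

In this sketch the watermark trails the largest event time seen, so an event that arrives out of order but within the allowed lateness still lands in its window, while anything later is dropped and counted. That dropped-late counter is the kind of signal that could feed a data-completeness SLO.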

2) Fast labeling under extreme class imbalance

You have a very large dataset where the positive class is extremely rare (highly imbalanced). You need to label examples quickly to build a model.

Design an end-to-end labeling strategy/pipeline. Cover:

Sampling strategy to find positives
Human-in-the-loop workflow
Weak supervision / heuristics
Active learning
How you measure progress and prevent bias
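
The sampling and bias points above can be sketched with score-stratified sampling. The idea, under stated assumptions, is to oversample items that a heuristic or early model scores highly (to find positives faster) while recording each item's inverse inclusion probability so that prevalence can still be estimated without bias. The function names, the bin edges, and the assumption that scores lie in [0, 1] are illustrative and not part of the question.

```python
import random


def stratified_sample(scores, bin_edges, per_bin, rng):
    """Pick up to `per_bin` items from each score bin.

    Returns a list of (index, weight), where weight is the inverse
    inclusion probability of that item within its bin.
    """
    bins = [[] for _ in range(len(bin_edges) + 1)]
    for i, s in enumerate(scores):
        b = sum(s >= edge for edge in bin_edges)   # index of the bin holding s
        bins[b].append(i)

    picked = []
    for members in bins:
        if not members:
            continue
        k = min(per_bin, len(members))
        weight = len(members) / k                  # each pick stands for this many items
        for i in rng.sample(members, k):
            picked.append((i, weight))
    return picked


def estimate_prevalence(picked, labels, population_size):
    """Weighted estimate of the positive rate from human labels (index -> 0 or 1)."""
    return sum(w * labels[i] for i, w in picked) / population_size
```

For example, with 90 low-scoring and 10 high-scoring items and `per_bin=10`, every high-scoring item is sent for labeling with weight 1 and each sampled low-scoring item carries weight 9, so the weights sum to the population size. Reweighting this way keeps evaluation metrics honest even though the labeled set is deliberately skewed toward likely positives.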
