Design an ad frequency capping system
Company: Netflix
Role: Software Engineer
Category: System Design
Difficulty: Medium
Interview Round: Onsite
## System Design: Ads Frequency Capping
You are designing an **ad-serving platform** (e.g., for a streaming service with ads). The product requires **frequency capping** so that a user does not see the same ad/campaign too many times.
### Requirements
#### Functional
- Given an ad request `(user_id, device_id, profile_id?, placement, timestamp, candidate_ads)`, return an ad that **respects frequency caps**.
- Support caps at multiple levels:
  - **Per creative** (exact ad)
  - **Per campaign / line item**
  - Optionally per advertiser
- Support common cap windows, e.g.:
  - `N impressions per 24 hours`
  - `M impressions per 7 days`
  - `K impressions per session`
- Caps must work across **devices** (same user on TV/phone/web).
- Handle **multiple profiles** under one account (if applicable): clarify whether caps are per-account, per-profile, or both.
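One way to make the functional requirements concrete is to model a cap as a rule of the form "at most `max_impressions` for this entity within `window_seconds`" and check a candidate ad against every rule that applies to it. The sketch below is illustrative only: `Level`, `CapRule`, `Ad`, and `is_capped` are hypothetical names, the impression history is a plain in-memory list, and session caps and the per-account vs per-profile choice are left out (the caller is assumed to pass the history for whichever identity the caps apply to).

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    CREATIVE = "creative"
    CAMPAIGN = "campaign"
    ADVERTISER = "advertiser"


@dataclass(frozen=True)
class CapRule:
    """Hypothetical rule: at most max_impressions per window_seconds."""
    level: Level
    entity_id: str         # creative, campaign, or advertiser id
    max_impressions: int
    window_seconds: int    # e.g. 86_400 for "per 24 hours"


@dataclass(frozen=True)
class Ad:
    creative_id: str
    campaign_id: str
    advertiser_id: str

    def entity_id(self, level: Level) -> str:
        return {
            Level.CREATIVE: self.creative_id,
            Level.CAMPAIGN: self.campaign_id,
            Level.ADVERTISER: self.advertiser_id,
        }[level]


def is_capped(ad, rules, impressions, now):
    """Return True if showing `ad` now would exceed any applicable rule.

    impressions: list of (Ad, timestamp) already shown to this identity,
    merged across devices (assumption: the caller resolves the identity).
    """
    for rule in rules:
        if ad.entity_id(rule.level) != rule.entity_id:
            continue  # rule targets a different creative/campaign/advertiser
        seen = sum(
            1
            for shown, ts in impressions
            if shown.entity_id(rule.level) == rule.entity_id
            and now - rule.window_seconds < ts <= now
        )
        if seen >= rule.max_impressions:
            return True
    return False
```

Scanning raw history on every request would not meet the latency target at scale; the point here is only to pin down what "respects frequency caps" means when several levels and windows apply to the same ad.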
#### Non-functional
- Very low latency (ad decisioning is on the critical path).
- High availability.
- Cap enforcement should be **close to real-time**; define acceptable staleness (e.g., up to a few seconds).
- Must scale to large traffic (think millions of QPS globally).
### What to design
- High-level architecture and data flow.
- APIs/interfaces.
- Data storage model for tracking impressions.
- How you enforce caps during ad selection.
- Trade-offs: strictness vs latency, consistency choices, and failure behavior.
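As a starting point for the storage model and the enforcement step, one possible approach (an assumption, not the only valid answer) is to keep per-user counters in fixed time buckets and sum the buckets that overlap a cap window at decision time. In the sketch below, `BucketedCounterStore`, `select_ad`, `BUCKET_SECONDS`, and the `"creative:<id>"` / `"campaign:<id>"` key format are all hypothetical, and a dictionary stands in for a low-latency key-value store.

```python
from collections import defaultdict

BUCKET_SECONDS = 3600  # hypothetical granularity: one counter per hour


class BucketedCounterStore:
    """In-memory stand-in for a store keyed by (user, entity, time bucket)."""

    def __init__(self):
        self._counts = defaultdict(int)

    def record(self, user_key, entity_key, ts):
        bucket = int(ts // BUCKET_SECONDS)
        self._counts[(user_key, entity_key, bucket)] += 1

    def count(self, user_key, entity_key, window_seconds, now):
        # Approximation: the oldest bucket may include impressions up to one
        # bucket older than the window, which errs toward stricter capping.
        newest = int(now // BUCKET_SECONDS)
        oldest = int((now - window_seconds) // BUCKET_SECONDS)
        return sum(
            self._counts.get((user_key, entity_key, b), 0)
            for b in range(oldest, newest + 1)
        )


def select_ad(store, user_key, candidates, caps, now):
    """Return the first candidate with headroom under all its caps, else None.

    candidates: list of dicts with 'creative_id' and 'campaign_id',
                assumed to be pre-ranked by the ad selection logic.
    caps: dict mapping entity_key -> list of (max_impressions, window_seconds).
    """
    for ad in candidates:
        keys = ("creative:" + ad["creative_id"], "campaign:" + ad["campaign_id"])
        if all(
            store.count(user_key, key, window, now) < limit
            for key in keys
            for limit, window in caps.get(key, [])
        ):
            return ad
    return None
```

Bucketing trades exactness for bounded storage and cheap reads: a 7-day window at hourly granularity needs at most 169 counters per user and entity, and old buckets can be expired by TTL. A smaller bucket tightens accuracy at the cost of more keys per read.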
### Edge cases
- Duplicate impression events (retries).
- Offline/late-arriving events.
- Multiple ad servers/regions making decisions concurrently.
- What happens if the frequency-cap store is unavailable?
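Two of these edge cases lend themselves to a small illustration: deduplicating retried impression events and choosing a behavior when the cap store cannot be reached. The sketch below assumes each impression carries a client-generated `impression_id` (not specified in the prompt), and the names `ImpressionRecorder`, `StoreUnavailable`, and `is_allowed` are hypothetical. In-memory structures stand in for a real store with TTL-based expiry.

```python
class ImpressionRecorder:
    """Counts each impression once, keyed by an assumed impression_id."""

    def __init__(self):
        self._seen_ids = set()  # stand-in for a dedup set with a TTL
        self.counts = {}        # (user_key, entity_key) -> impression count

    def record(self, impression_id, user_key, entity_key):
        if impression_id in self._seen_ids:
            return False  # retry or duplicate delivery: do not count again
        self._seen_ids.add(impression_id)
        key = (user_key, entity_key)
        self.counts[key] = self.counts.get(key, 0) + 1
        return True


class StoreUnavailable(Exception):
    """Raised by a (hypothetical) store client on timeout or outage."""


def is_allowed(get_count, limit, fail_open=True):
    """Decide whether an ad may be shown given a count lookup that may fail.

    get_count: zero-argument callable returning the current count, or
               raising StoreUnavailable.
    fail_open: if True, serve the ad when the store is down (risking
               over-delivery); if False, skip it (risking lost fill).
    """
    try:
        return get_count() < limit
    except StoreUnavailable:
        return fail_open
```

Whether to fail open or closed is a product decision worth raising with the interviewer: failing open keeps ads serving but may exceed caps during an outage, while failing closed protects the user experience at the cost of unfilled ad slots.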
Quick Answer: This question tests whether a candidate can design a scalable, low-latency distributed system that enforces ad frequency caps across users, devices, and profiles. It falls under the System Design domain and covers system architecture, data modeling, consistency, and real-time enforcement, requiring both a high-level architecture and practical trade-off reasoning. It is commonly asked to assess how applicants reason about scalability, availability, latency-consistency trade-offs, API and data flow design, and failure modes when enforcing multi-window caps at high QPS.