This question evaluates a candidate's ability to design a low-latency, high-QPS, multi-tenant frequency capping service, testing competencies in identity resolution, hierarchical cap modeling, window semantics, counter and storage architecture, consistency versus latency trade-offs, real-time decisioning APIs, configuration/versioning, backfill, reporting, privacy/compliance, and failure handling. It is commonly asked to assess systems-design proficiency in distributed systems, databases, caching, API design, and operational reliability; it belongs to the System Design domain and emphasizes practical application with system-level conceptual reasoning rather than purely algorithmic detail.
Design an ads frequency capping service that limits how many times a user sees a creative or campaign within configurable time windows (e.g., 3/day per campaign, 10/week advertiser-wide). Compare it to a generic rate limiter and explain the extra components required for advertiser configuration and onboarding. Cover: identity resolution (cookies, device graph, logged-in IDs), cap dimensions (creative/ad group/campaign/advertiser), window semantics (sliding vs tumbling), counter and storage design, consistency vs latency trade-offs, real-time decisioning at high QPS, configuration APIs and validation, config propagation/versioning and rollback, backfill/migrations, reporting needs, privacy/compliance, and failure handling.
You are designing a service that ensures a user does not see the same ad creative or campaign more than a configured number of times within specified time windows (for example, 3/day per campaign, 10/week across an advertiser). The service must support real-time ad decisioning at high QPS with low latency, multi-tenant advertiser onboarding, and reliable reporting.
Assume an ad-supported video application with high read QPS during ad selection, sub-10 ms p99 decision latency, multi-region deployment, and strict privacy/compliance requirements. The system integrates with an ad server/selector that provides candidate ads and expects an allow/deny decision before rendering.
Task
Design the frequency capping service end to end. In addition, compare it to a generic rate limiter and explain the extra components required for advertiser configuration and onboarding — this contrast is a deliberate focus of the question, not an aside.
Work through the twelve topics below. Each is a part of the design; treat them as the agenda for your whiteboard, not a checklist to recite.
Clarifying Questions to Ask
Before designing, scope the problem with the interviewer. Strong candidates surface questions like:
Scale and shape: roughly how many DAU and ad opportunities per user per day? What decision QPS at peak, and how bursty? What is the candidate set size per ad request?
Latency budget: what p99 must the capping check itself hit so the whole ad request stays under ~10 ms? Is the check synchronous on the render path?
Correctness bar: is a cap a hard contractual guarantee or a soft business guarantee; that is, is a rare overshoot of 1 acceptable, or must overshoot be exactly zero for some tenants?
Identity: what identifiers are available (logged-in account, device/advertising ID, cookie, household)? Do we cap per user, per device, or per household, and is that a per-advertiser choice?
Topology and travel: is deployment multi-region active-active? Are users sticky to a home region, and how often do they cross regions mid-session?
Privacy regime: what consent signals gate personalized capping, and what are the retention/deletion obligations?
Constraints & Assumptions (anchor numbers — state your own explicitly)
These are illustrative anchors, not benchmarks; confirm or replace them with the interviewer, then design against whatever you state.
Scale: on the order of 10^5–10^6 decision QPS at peak across regions; tens of millions to ~10^8 DAU; ~10–100 candidate ads per request.
Latency: p99 of the capping check on the order of a few ms so the full ad request stays under ~10 ms; this effectively forces an in-memory counter store on the hot path.
Cap examples: 3/day per campaign, 10/week per advertiser, hierarchical across creative → ad group → campaign → advertiser.
Correctness: caps are a soft business guarantee for most tenants, so a bounded, small overshoot is acceptable; a minority of tenants may require hard caps.
What a Strong Answer Covers
Part 1 — Identity resolution
Resolve a request to a stable capping identity from cookies, device graph, and logged-in IDs. Address user vs. household scoping, and what happens when identities merge or split.
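One way to make the resolution step concrete is a simple priority order over the available identifiers. The sketch below is illustrative only: the field names (`account_id`, `device_id`, `cookie_id`), the priority order, and the consent flag are assumptions, and a real system would also consult a device graph and handle merges and splits.

```python
from typing import Optional

# Hypothetical priority order: strongest, most stable identifier first.
ID_PRIORITY = ("account_id", "device_id", "cookie_id")


def resolve_capping_identity(signals: dict, consented: bool = True) -> Optional[str]:
    """Return a namespaced capping key, or None if no usable identifier exists."""
    if not consented:
        # Without consent, the caller falls back to non-personalized handling.
        return None
    for id_type in ID_PRIORITY:
        value = signals.get(id_type)
        if value:
            # Namespacing keeps an account ID from colliding with a cookie ID.
            return f"{id_type}:{value}"
    return None


print(resolve_capping_identity({"cookie_id": "c-9", "device_id": "d-7"}))
# prints "device_id:d-7"
```

Namespacing the key by identifier type also makes it easier to reason about what should happen to counters when a cookie is later linked to a logged-in account.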
Part 2 — Cap dimensions and hierarchy
Define caps at creative, ad group, campaign, and advertiser levels, including combinations. State how a single impression interacts with caps at multiple levels.
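A small sketch can help pin down how one impression touches several levels at once. It assumes the simplest policy: an ad is eligible only if every configured cap along its ancestry still has headroom, and a served impression increments the counter at every level. The names `LEVELS`, `allowed`, and `record_impression` are hypothetical, and `counts` stands in for one user's counters.

```python
from collections import defaultdict

LEVELS = ("creative", "ad_group", "campaign", "advertiser")


def applicable_keys(ad: dict) -> list:
    """One (level, entity_id) key per level of the ad's ancestry."""
    return [(level, ad[level]) for level in LEVELS]


def allowed(ad: dict, limits: dict, counts: dict) -> bool:
    """True only if every configured cap on the ad's ancestry has headroom."""
    for key in applicable_keys(ad):
        limit = limits.get(key)  # None means no cap is configured at this level
        if limit is not None and counts[key] >= limit:
            return False
    return True


def record_impression(ad: dict, counts: dict) -> None:
    """A served impression counts against every level simultaneously."""
    for key in applicable_keys(ad):
        counts[key] += 1


counts = defaultdict(int)  # counters for a single user
limits = {("campaign", "cmp1"): 3, ("advertiser", "adv1"): 10}
ad = {"creative": "cr1", "ad_group": "ag1", "campaign": "cmp1", "advertiser": "adv1"}
for _ in range(3):
    record_impression(ad, counts)
print(allowed(ad, limits, counts))  # prints False: the campaign cap of 3 is reached
```

Under this policy, a different campaign from the same advertiser would still be eligible until the advertiser-wide cap is hit.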
Part 3 — Window semantics
Handle sliding vs. tumbling windows and calendar anchoring (day/week), including timezone behavior.
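The difference between the two window types is easier to see in code. This is a minimal sketch under simplifying assumptions: timestamps are Unix seconds, the timezone is modeled as a fixed UTC offset (so daylight saving is ignored), and the sliding window is approximated with fixed-size buckets. A production design would likely use a calendar library for real timezones and week anchoring, since epoch-aligned weeks start on a Thursday.

```python
DAY = 86_400


def tumbling_window_start(ts: int, window_s: int, utc_offset_s: int = 0) -> int:
    """Start (in UTC seconds) of the tumbling window containing ts,
    anchored to local time given a fixed UTC offset."""
    local = ts + utc_offset_s
    return local - (local % window_s) - utc_offset_s


def sliding_count(buckets: dict, now: int, window_s: int, bucket_s: int) -> int:
    """Approximate sliding-window count from {bucket_start: count}.
    Any bucket that overlaps the window is included in full, so the
    result can over-count by up to one bucket (a conservative error)."""
    cutoff = now - window_s
    return sum(
        n for start, n in buckets.items()
        if start + bucket_s > cutoff and start <= now
    )


ts = 1_700_000_000  # 2023-11-14 22:13:20 UTC
print(tumbling_window_start(ts, DAY))           # prints 1699920000 (midnight UTC)
print(tumbling_window_start(ts, DAY, -28_800))  # prints 1699948800 (midnight at UTC-8)
```

A tumbling window resets abruptly at the boundary, so a user can see up to twice the cap in a short span straddling it; the bucketed sliding window avoids that at the cost of more counters per key.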
Part 4 — Counter and storage design
Specify key schema, bucketization, TTL, atomicity across multiple caps, and an order-of-magnitude memory/scale estimate.
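As a rough illustration of the kind of answer this part is looking for, the sketch below shows one possible key layout, a TTL rule, and a back-of-envelope memory estimate. Every number and name here is an assumption chosen for the arithmetic (10^8 active users, 20 live counters per user, 50 bytes per counter including key overhead), not a measured figure.

```python
def counter_key(tenant: str, level: str, entity_id: str,
                identity: str, window_start: int) -> str:
    """Hypothetical key layout: one counter per (cap target, identity, window)."""
    return f"fc:{tenant}:{level}:{entity_id}:{identity}:{window_start}"


def ttl_seconds(window_s: int, slack_s: int = 3_600) -> int:
    """Expire a counter shortly after its window closes."""
    return window_s + slack_s


def estimate_memory_bytes(active_users: int, counters_per_user: int,
                          bytes_per_counter: int) -> int:
    return active_users * counters_per_user * bytes_per_counter


# Assumed inputs, for illustration only.
total = estimate_memory_bytes(10**8, 20, 50)
print(total / 1e9, "GB")  # prints "100.0 GB"
print(counter_key("adv1", "campaign", "cmp1", "account_id:a-1", 1_699_920_000))
```

Under these assumptions the counter set is on the order of 100 GB before replication, which fits in a sharded in-memory store but not on a single node.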
Part 5 — Consistency vs. latency trade-offs
Discuss within-region and across-region behavior, concurrency/race conditions, and acceptable overshoot.
Part 6 — Real-time decisioning at high QPS
Design the API and the check/reserve/commit flow, with batching/vectorized lookups and concurrency control.
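The check/reserve/commit flow can be sketched with an in-process stand-in for the counter store. This is not a distributed implementation: the `CapStore` class and its methods are hypothetical, a lock substitutes for whatever atomic primitive the real store offers, and reservation expiry (needed when a commit never arrives) is omitted.

```python
import threading
from collections import defaultdict


class CapStore:
    """In-memory stand-in for a counter store with reserve/commit/release."""

    def __init__(self):
        self._lock = threading.Lock()
        self._committed = defaultdict(int)
        self._reserved = defaultdict(int)

    def reserve(self, caps: dict) -> bool:
        """caps maps counter key -> limit. All-or-nothing across keys."""
        with self._lock:
            for key, limit in caps.items():
                if self._committed[key] + self._reserved[key] >= limit:
                    return False  # some cap has no headroom; reserve nothing
            for key in caps:
                self._reserved[key] += 1
            return True

    def commit(self, keys) -> None:
        """The impression rendered: turn the reservation into a count."""
        with self._lock:
            for key in keys:
                self._reserved[key] -= 1
                self._committed[key] += 1

    def release(self, keys) -> None:
        """The ad was not shown: give the reserved slot back."""
        with self._lock:
            for key in keys:
                self._reserved[key] -= 1

    def committed(self, key) -> int:
        with self._lock:
            return self._committed[key]


store = CapStore()
caps = {"u1:campaign:cmp1": 2, "u1:advertiser:adv1": 10}
if store.reserve(caps):
    store.commit(caps.keys())  # call release() instead if rendering fails
print(store.committed("u1:campaign:cmp1"))  # prints 1
```

Counting reservations against the limit is what prevents two concurrent requests from both passing a check with one slot left; the trade-off is that abandoned reservations temporarily under-serve until they are released or expire.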
Part 7 — Configuration APIs and validation
Specify the cap-definition schema, constraints, default caps, and targeting/scope.
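A cap definition and its validation rules might look something like the following. The field names, allowed values, and the 30-day maximum window are assumptions made for the sketch; the point is that validation runs before a config is accepted, so an obviously harmful cap never reaches serving.

```python
from dataclasses import dataclass

VALID_LEVELS = {"creative", "ad_group", "campaign", "advertiser"}
VALID_WINDOW_TYPES = {"tumbling", "sliding"}


@dataclass(frozen=True)
class CapDefinition:
    tenant_id: str
    level: str
    entity_id: str
    max_impressions: int
    window_seconds: int
    window_type: str = "tumbling"


def validate(cap: CapDefinition, max_window_seconds: int = 30 * 86_400) -> list:
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    if cap.level not in VALID_LEVELS:
        errors.append(f"unknown level: {cap.level!r}")
    if cap.window_type not in VALID_WINDOW_TYPES:
        errors.append(f"unknown window_type: {cap.window_type!r}")
    if cap.max_impressions < 1:
        errors.append("max_impressions must be >= 1 (0 would block all serving)")
    if not 0 < cap.window_seconds <= max_window_seconds:
        errors.append(f"window_seconds must be in (0, {max_window_seconds}]")
    return errors


bad = CapDefinition("adv1", "campaign", "cmp1", max_impressions=0, window_seconds=86_400)
print(validate(bad))
```

Returning all errors at once, rather than failing on the first, tends to make for a friendlier onboarding API.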
Part 8 — Config propagation, versioning, and rollback
Describe how config moves from the source of truth to the hot path: canaries, pub/sub to caches, and freeze/rollback plans.
Part 9 — Backfill and migrations
Cover enabling new caps mid-flight, identity merges/splits, and data warmup.
Part 10 — Reporting needs
Cover aggregates, rejection reasons, near-real-time dashboards, and retention.
Part 11 — Privacy and compliance
Address consent, data minimization, retention, deletion, and cross-device linking safeguards.
Part 12 — Failure handling
Decide the policy for store outages, partial availability, and stale config: fail-open vs. fail-closed.
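The fail-open versus fail-closed choice can be expressed as a small wrapper around the cap check. This sketch assumes the policy is chosen per tenant (soft-cap tenants fail open, hard-cap tenants fail closed) and that store failures surface as `TimeoutError` or `ConnectionError`; both are assumptions, and `decide` is a hypothetical name.

```python
from typing import Callable, Tuple


def decide(check: Callable[[], bool], hard_cap: bool) -> Tuple[str, str]:
    """Run the cap check and apply the tenant's failure policy.
    Returns (decision, reason) so the fallback path is visible in reporting."""
    try:
        return ("allow" if check() else "deny", "ok")
    except (TimeoutError, ConnectionError):
        if hard_cap:
            return ("deny", "fail_closed")  # protect the guarantee, lose revenue
        return ("allow", "fail_open")       # keep serving, accept bounded overshoot


def store_down() -> bool:
    raise TimeoutError("counter store unreachable")


print(decide(store_down, hard_cap=False))  # prints ('allow', 'fail_open')
print(decide(store_down, hard_cap=True))   # prints ('deny', 'fail_closed')
```

Emitting the reason alongside the decision lets reporting quantify how many impressions were served without a real cap check during an outage.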
Follow-up Questions
Expect the interviewer to push on these after the main design:
What breaks first at 100× scale?
Where is the hottest shard (think household-level keys behind a viral placement), and how do you keep one user's traffic from melting a node?
A traveling user crosses regions mid-session.
Walk through exactly what happens to their counters and the resulting overshoot — and how a hard-cap tenant changes your answer.
An advertiser sets a cap that accidentally blocks ~all of their serving.
How does your config plane prevent the change from shipping, and how fast can you roll it back?
Reconcile the books.
How do you detect and quantify drift between live decisions and ground truth, and what do you do when they diverge?