Idempotency And Concurrency Control
Asked of: Software Engineer
Last updated

What's being tested
Interviewers are probing whether you can design systems that behave correctly under retries, duplicate requests, concurrent writes, and partial failures. For a Software Engineer, the key skill is turning vague reliability requirements into concrete API contracts, storage invariants, and concurrency-control mechanisms. OpenAI cares because user-facing and internal systems often run in distributed environments where clients retry, services crash mid-operation, and multiple workers may mutate the same logical state. A strong answer shows you can reason about correctness first, then choose practical mechanisms like idempotency keys, compare-and-swap, transactions, locks, MVCC, or vector clocks based on the failure model.
Core knowledge
-
Idempotency means applying the same logical operation multiple times has the same externally visible effect as applying it once.
PUT /resource/{id}is naturally idempotent;POST /chargeis not unless you add an idempotency key and persist the first result. -
Deduplication requires durable state, not just in-memory caches, if the effect being protected is durable. A common pattern, used by systems like
Stripe, is storing(client_id, idempotency_key) -> request_hash, status, response_body, expires_atand replaying the original response on retry. -
Request identity and operation identity are different. A retry should reuse the same idempotency key; a new user action should not. To prevent accidental key reuse, store a request fingerprint such as
SHA256(method, path, canonical_body)and reject mismatches with409 Conflict. -
Exactly-once execution is usually not achievable end-to-end in distributed systems; practical systems provide at-least-once delivery plus idempotent side effects. The goal is “exactly-once observable effect,” achieved with atomic writes, dedupe tables, conditional updates, or transactional outbox patterns.
-
Atomicity boundaries matter. If you write
ordersand then writeidempotency_keys, a crash between them can create duplicates. Prefer one database transaction: insert dedupe record, perform mutation, store response, then commit. InPostgres, useINSERT ... ON CONFLICTplus row-level locking. -
Concurrency control choices trade throughput for simplicity. Pessimistic locking serializes conflicting operations with mutexes or
SELECT ... FOR UPDATE; optimistic concurrency control reads a version, computes changes, then commits only ifversion = old_version, retrying on conflict. -
Compare-and-swap is the core primitive behind many safe updates:
UPDATE accounts SET balance = balance - 10, version = version + 1 WHERE id = ? AND version = ?. If affected rows = 0, another writer won; retry or surface a conflict. -
Isolation levels determine which anomalies can occur.
READ COMMITTEDmay allow lost updates unless guarded by conditional updates;REPEATABLE READavoids non-repeatable reads;SERIALIZABLEis safest but can reduce throughput via aborts. Know anomalies: lost update, write skew, phantom read. -
MVCC stores multiple versions so readers do not block writers. In an in-memory database design, each record can carry
(value, version/timestamp); transactions read from a snapshot and validate write sets at commit. This supports high read concurrency but needs garbage collection of old versions. -
Lock granularity is a major design lever. A global lock is simple but caps throughput; per-key locks scale for independent keys; range locks are needed for predicates like “all keys with prefix X.” For hot keys, consider batching, sharding by sub-key, or single-writer queues.
-
Vector clocks capture partial ordering in distributed updates. A vector clock
VdominatesWif for every nodei,V[i] >= W[i]and for somej,V[j] > W[j]; otherwise the versions are concurrent. They are useful when multiple replicas accept writes and conflicts must be detected, not overwritten silently. -
TTL and retention are correctness parameters, not cleanup details. Idempotency records must live at least as long as client retry windows and network uncertainty, often 24 hours to several days. Too short creates duplicate effects; too long increases storage and may block legitimate key reuse.
Worked example
For Prevent Duplicate Request Processing, a strong candidate starts by clarifying: “Are duplicate requests caused by client retries, load balancer retries, worker crashes, or all of them? What side effect are we protecting: payment, account creation, job enqueue, or state mutation? Do clients supply an idempotency key, or must the server generate operation identity?” Then they would state an assumption: the operation is non-idempotent, such as creating a charge or consuming credits, and clients retry on timeout.
The answer skeleton should have four pillars. First, define the API contract: clients send Idempotency-Key, scoped by user or tenant, and must reuse it for retries of the same logical action. Second, persist a dedupe record in a durable store with fields like key, request_hash, status, response, and expires_at. Third, make the dedupe check and business mutation atomic using a database transaction, conditional insert, or row lock. Fourth, specify retry behavior: if the key is complete, return the stored response; if in progress, return 409, 202, or block briefly; if the request hash differs, reject.
A key tradeoff to flag is whether to store the full response or only the resulting resource ID. Full response replay gives clients stable behavior across retries, but consumes more storage and may expose stale formatting if response schemas change. Storing only the resource ID is lighter but requires reconstructing the response and handling cases where downstream state changed.
A good close would be: “If I had more time, I’d discuss TTL sizing, multi-region behavior, observability for duplicate suppression rate, and how to test crash points between dedupe insert, side effect, and commit.”
A second angle
For Design an in-memory database, the same ideas appear inside the storage engine rather than at the API edge. Instead of deduplicating HTTP requests, you must prevent inconsistent reads and writes when many clients call GET, SET, DELETE, or transaction APIs concurrently. The design might begin with a thread-safe hash map plus per-key locks, then evolve toward snapshot isolation using MVCC if readers need consistent views without blocking writers. The important constraint shift is that latency may be microseconds to low milliseconds, so a simple global mutex is often unacceptable beyond small workloads. You should explicitly discuss whether the database supports single-key atomic operations only, multi-key transactions, or serializable transactions, because each choice changes the concurrency-control design.
Common pitfalls
Pitfall: Treating idempotency as “just retry safely.”
The tempting answer is “make the endpoint idempotent and retry with exponential backoff,” but that skips the hard part: where operation identity is stored and how it is atomically tied to the side effect. A better answer names the dedupe table, the unique constraint, the transaction boundary, and what happens after a crash or timeout.
Pitfall: Overusing global locks.
A global mutex is easy to explain for an in-memory key-value store, but it usually fails the scalability discussion. It is acceptable as a baseline, but you should quickly move to per-key locks, lock striping, optimistic concurrency, or MVCC, and explain which operations still require broader coordination.
Pitfall: Ignoring ambiguous outcomes.
If a client times out after sending a request, it may not know whether the server committed the mutation. The wrong answer is to let the client “try again and hope”; the stronger answer is to make retries query or reuse the same idempotency record so the system can return the original outcome deterministically.
Connections
Interviewers may pivot from here into distributed transactions, consensus with Raft, database isolation levels, replication conflict resolution, or rate limiting for retry storms. For in-memory database variants, expect follow-ups on persistence via write-ahead logging, sharding, and hot-key mitigation.
Further reading
-
Designing Data-Intensive Applications — Martin Kleppmann’s chapters on replication, transactions, and consistency are directly relevant to these designs.
-
Stripe API Idempotent Requests — practical example of idempotency keys, response replay, and request-parameter validation.
-
Time, Clocks, and the Ordering of Events in a Distributed System — foundational paper for reasoning about ordering, causality, and concurrent distributed events.
Featured in interview prep guides
Practice questions
- Design a Distributed Rate LimiterOpenAI · Software Engineer · Technical Screen · none
- Prevent Duplicate Request ProcessingOpenAI · Software Engineer · HR Screen · hard
- Design a scalable payment systemOpenAI · Software Engineer · Technical Screen · medium
- Design credit balance with vector-clock expirationsOpenAI · Software Engineer · Technical Screen · hard
- Implement GPU credit ledgerOpenAI · Software Engineer · Technical Screen · Medium
- Optimize C++ Performance with Provided ConcurrencyOpenAI · Software Engineer · Technical Screen · Medium
- Implement a GPU credit managerOpenAI · Software Engineer · Technical Screen · Medium
- Implement an expiring GPU credits ledgerOpenAI · Software Engineer · Technical Screen · Medium
- Design a scalable payment processorOpenAI · Software Engineer · Technical Screen · hard
Related concepts
- Distributed Systems Correctness And IdempotencySystem Design
- Fault Tolerance, Idempotency, And Concurrency ControlSystem Design
- Concurrency ControlSystem Design
- Idempotent API DesignSystem Design
- Distributed Systems Consistency And Low-Latency DesignSystem Design
- Concurrency, Deadlocks, And SynchronizationSoftware Engineering Fundamentals