Amazon Software Engineer Interview Prep Guide
Everything Amazon actually asks Software Engineer candidates — concept walkthroughs, worked examples, and the real interview questions, drawn from candidate reports. Free to read.
Last updated

Technical Screen
Behavioral & Leadership
- Leadership Principles And STAR Stories — covered in depth under Take-home Project below.
Coding & Algorithms
-
Arrays, Strings, Hash Maps, And Frequency Counting — covered in depth under Take-home Project below.
-
Interval Merging And Range Manipulation — covered in depth under Onsite below.
-
DFS/BFS Tree, Graph, And Grid Traversal — covered in depth under Onsite below.
-
Dynamic Programming And Memoization — covered in depth under Take-home Project below.
-
Greedy, Heaps, And Scheduling Optimization — covered in depth under Take-home Project below.
-
LRU Cache And O(1) Data Structures — covered in depth under Onsite below.
System Design
- Real-Time Top-K And Streaming Analytics — covered in depth under Onsite below.
Software Engineering Fundamentals
- Object-Oriented Design And Concurrency-Safe LLD — covered in depth under Onsite below.
Onsite
Behavioral & Leadership
- Leadership Principles And STAR Stories — covered in depth under Take-home Project below.
Coding & Algorithms
- Arrays, Strings, Hash Maps, And Frequency Counting — covered in depth under Take-home Project below.

What's being tested
This tests the interval merging pattern: sort ranges, scan once, and maintain the last merged range while deciding overlap, adjacency, or separation. Interviewers probe boundary reasoning, especially closed vs half-open intervals, contiguous ranges, nested intervals, empty input, and O(n log n) vs O(n) tradeoffs when input is already sorted.
Patterns & templates
-
Sort-and-scan merge — sort by
start, then extend currentend;O(n log n)time,O(n)output space. -
Insertion into sorted intervals — copy intervals before
newInterval, merge overlaps, then append the rest;O(n)when already sorted. -
Adjacency rule — for closed intervals, merge if
next.start <= cur.end + 1; for half-open intervals, merge ifnext.start <= cur.end. -
Overlap predicate — two ranges overlap when
a.start <= b.end && b.start <= a.end; adjust carefully for half-open[start, end). -
Canonical output invariant — maintain sorted, non-overlapping, non-adjacent merged intervals after every append or update.
-
Deletion / subtraction — split an interval into left and right remainders around the removed range; handle full coverage and no-overlap cases.
-
Complexity clarity — if input is unsorted, sorting dominates; if pre-sorted, merging is linear and usually constant auxiliary space excluding output.
Common pitfalls
Pitfall: Treating adjacent intervals inconsistently, such as merging
[1,3]and[4,5]without confirming whether adjacency should count.
Pitfall: Mutating the input interval objects directly when the interviewer expects a fresh result or when aliasing can corrupt later comparisons.
Pitfall: Forgetting edge cases: empty list, one interval, nested intervals, duplicate starts, negative bounds, and integer overflow from
end + 1.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
Graph traversal skill across trees, grids, and network-like graphs: choosing DFS, BFS, or Dijkstra based on reachability, shortest unweighted distance, or weighted cost. Interviewers probe whether you maintain correct state, avoid revisits, handle branching/obstacles/errors, and explain time/space complexity clearly.
Patterns & templates
-
BFS shortest path on unweighted graphs — use
queue,visited, and level counting;O(V + E)time,O(V)space. -
Multi-source BFS for spreading processes — enqueue all initial rotten/spoiled cells first; each BFS layer represents one time unit.
-
DFS path matching in an N-ary tree — recurse with
(node, index)state; backtracking is implicit, and duplicate values require sequence-position tracking. -
Grid traversal template — iterate four directions via
dirs = [(1,0),(-1,0),(0,1),(0,-1)]; validate bounds, obstacles, and visited cells. -
Dijkstra-style traversal for weighted cells — use
heapqwith(cost, r, c); mark finalized distances, not merely first-seen nodes. -
Web crawler BFS — normalize URLs, enforce same-domain scope, dedupe with
visited, and separate fetch failures/retries from traversal correctness. -
Complexity framing — trees are
O(n), grids areO(rows * cols), crawlers areO(pages + links)bounded by crawl limit and dedupe set.
Common pitfalls
Pitfall: Using DFS for shortest path in an unweighted grid without tracking all alternatives; BFS is the canonical shortest-step solution.
Pitfall: Marking weighted grid nodes visited when pushed into the heap; with Dijkstra, finalize when popped with the minimum known distance.
Pitfall: Matching N-ary tree paths by value only; repeated values mean your state must include the current sequence index.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
-
Dynamic Programming And Memoization — covered in depth under Take-home Project below.
-
Greedy, Heaps, And Scheduling Optimization — covered in depth under Take-home Project below.

What's being tested
This tests O(1) mutable data structure design, usually combining a hash map with a doubly linked list to implement cache lookup, recency updates, insertion, and eviction. Interviewers are probing whether you can preserve invariants under get, put, update-existing, capacity overflow, and missing-key cases.
Patterns & templates
-
Hash map + doubly linked list — map
key -> node; list stores recency order;getandputareO(1)average time. -
Sentinel head/tail nodes simplify list mutations —
addToFront(node),remove(node),moveToFront(node),popTail()avoid null-heavy edge cases. -
LRU semantics — successful
get(key)updates recency;put(existingKey, value)updates value and recency; eviction removes least-recently-used node. -
Capacity handling — after inserting a new key, if
size > capacity, evict tail node and delete its key from the map. -
Cache optimization framing — define cache key, cached value, invalidation/TTL, memory bound, expected hit rate, and worst-case behavior before coding.
-
Complexity contract — state
O(1)average time due to hash map,O(capacity)space; note hash collision worst case if interviewer asks. -
Thread-safety extension — single mutex is simple and correct; finer-grained locking improves throughput but complicates recency-list consistency.
Common pitfalls
Pitfall: Using only a hash map gives fast lookup but no
O(1)way to find and remove the least-recently-used item.
Pitfall: Forgetting that
getmust update recency causes subtle failures on eviction-order tests.
Pitfall: Evicting from the list but not deleting from the map leaves stale nodes and incorrect future lookups.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
System Design

What's being tested
These interviews test whether you can design low-latency streaming analytics that maintain accurate or approximate rankings while events arrive continuously, out of order, and at high volume. The interviewer is probing for practical distributed-systems judgment: how you partition work, maintain state, bound memory, recover from failures, and serve fresh answers under `p99` latency targets. Amazon cares because many systems need near-real-time counters and rankings: best-selling products, ad clicks, abuse signals, log histograms, popular searches, and operational dashboards. A strong Software Engineer answer balances correctness, scalability, simplicity, and operational failure modes instead of just naming `Kafka` plus a database.
Core knowledge
-
Top-K frequency tracking usually means maintaining the K most frequent keys over a stream: product IDs, search prefixes, ad IDs, log fields, or URLs. Exact global top-K requires counting every key, which is feasible for bounded cardinality but expensive when unique keys reach millions or billions.
-
Exact counting uses a hash map from key to count plus a min-heap of size K for candidates. Updates are for the counter and when heap membership changes; memory is for N distinct keys, which may be acceptable up to ~10M keys depending on key size and retention.
-
Approximate heavy-hitter algorithms trade precision for bounded memory. Count-Min Sketch estimates frequency with error bounded by using width and depth ; Space-Saving keeps M counters and works well when you only need likely top items.
-
Window semantics are central. A tumbling window such as “top-K per minute” is simpler because events belong to one bucket; a sliding window such as “last 15 minutes updated every second” requires bucketed subwindows, incremental expiration, or time-decayed counts.
-
Event time differs from processing time. If ranking by when a click actually happened, use event timestamps and accept late arrivals up to a watermark delay, for example “finalize window after 2 minutes of lateness.” If ranking by ingestion time, results are simpler but less faithful during client buffering or retries.
-
Partitioning strategy decides scalability. Hash partition by item key when aggregating counts for each key, then periodically emit local top-K candidates to a global reducer. Partitioning by user or session helps deduplication, but final top-K by item then requires a second aggregation stage.
-
Hot keys are a real failure mode. If one product, log category, or query prefix receives 30% of traffic, a single partition can overload. Mitigations include key salting, local pre-aggregation, adaptive partitioning, or separating ultra-hot keys into dedicated paths.
-
State management needs a clear recovery story. Stream processors like
`Apache Flink`,`Kafka Streams`, or`Spark Structured Streaming`keep local state backed by checkpoints or changelogs. Without checkpointing, a worker crash can reset counts or double-apply replayed events. -
Idempotency and deduplication matter for click and log systems. If producers retry, events may arrive twice; use an event ID, request ID, or composite key like
(user_id, item_id, timestamp, nonce)with bounded dedup state. Exact dedup across an infinite stream is impossible without unbounded memory. -
Serving path should be separated from computation. The streaming job writes materialized results like
(window, dimension, rank, item, count)to a low-latency store such as`Redis`,`DynamoDB`,`Cassandra`, or`OpenSearch`; read APIs should not scan raw events to answer top-K. -
Freshness versus correctness is a first-class tradeoff. For example, serving provisional top-K every second with watermark-corrected final values later may be better than waiting two minutes for perfect completeness. Make the SLA explicit: “results within 5 seconds, tolerate 0.5% error, late events included for 10 minutes.”
-
Autocomplete top-K combines streaming counts with prefix indexing. A common approach stores a trie or prefix table where each prefix maps to top suggestions; updates increment query/item counts and refresh affected prefixes. This is fast for reads but expensive for writes because one term touches prefixes.
Worked example
For Design rolling-window top-K click tracker, start by clarifying the API and SLA: “Are we returning top-K items globally or per advertiser/category? What is the window length, expected QPS, acceptable staleness, and do clicks have unique IDs?” Then declare assumptions, such as 1M clicks/sec, 15-minute sliding window, K=100, results refreshed every second, and late events accepted for 2 minutes.
A strong answer can be organized around four pillars: ingestion, aggregation, state/windowing, and serving. For ingestion, describe clients or edge services publishing click events into `Kinesis` or `Kafka`, with event ID, item ID, timestamp, and optional dimensions. For aggregation, use a stream processor such as `Flink` that partitions by item ID and maintains per-item counts in small time buckets, for example 900 one-second buckets for a 15-minute window. For top-K, each partition computes local candidates, then a second-stage reducer merges candidates and publishes the global ranked list.
The key design decision is exact versus approximate. Exact rolling counts require maintaining all active item counts and expiring old buckets, which is straightforward if active item cardinality is manageable; for very high cardinality, use Space-Saving or Count-Min Sketch to bound memory, while clearly explaining possible ranking errors near the cutoff. Serving should use a materialized store like `Redis` sorted sets or a `DynamoDB` table keyed by window and dimension, not recompute from raw clicks per request. Close by saying that with more time you would cover multi-region failover, backfill/reconciliation from durable logs, and observability metrics like ingestion lag, watermark lag, dropped duplicates, and `p99` query latency.
A second angle
For Implement autocomplete with top-K suggestions, the same core idea appears, but the read path becomes more latency-sensitive and prefix-specific. Instead of computing one global top-K, you maintain top suggestions for many prefixes, often with a trie, finite-state transducer, or prefix-to-candidates index. Updates are more expensive because a query like “headphones” affects h, he, hea, and every longer prefix, so many systems batch updates or separate real-time trending boosts from a slower offline rebuild. The tradeoff shifts from stream-window correctness to memory layout, Unicode normalization, case folding, deletion handling, and sub-10ms lookup latency.
Common pitfalls
Pitfall: Treating top-K as “just sort all events.”
Sorting raw events or scanning all counts on every query is the tempting simple answer, but it fails immediately at streaming scale. A better answer maintains incremental state: counters, heaps, sketches, or precomputed materialized rankings that make reads cheap.
Pitfall: Ignoring window and time semantics.
Saying “keep a counter per item” is incomplete if the question asks for a rolling 5-minute or 24-hour view. You need to explain how counts expire, whether timestamps use event time or processing time, and what happens to late events after a watermark.
Pitfall: Naming technologies without showing data flow.
An answer like “use `Kafka`, `Flink`, and `Redis`” sounds plausible but shallow. Interviewers want to hear what each component does, what state it owns, how data is partitioned, how failures are recovered, and where the top-K is actually computed.
Connections
Interviewers often pivot from streaming top-K into rate limiting, distributed counters, leaderboards, log analytics, or search/autocomplete indexing. They may also ask about consistency models, checkpointing, backpressure, or approximate data structures such as Bloom filters, HyperLogLog, Count-Min Sketch, and reservoir sampling.
Further reading
-
Streaming Systems by Tyler Akidau, Slava Chernyak, and Reuven Lax — the best practical treatment of event time, watermarks, windows, and triggers.
-
Count-Min Sketch paper: “An Improved Data Stream Summary” — foundational algorithm for approximate frequency estimation in large streams.
-
Space-Saving algorithm: “Efficient Computation of Frequent and Top-k Elements in Data Streams” — directly relevant to memory-bounded heavy-hitter tracking.
Practice questions
What's being tested
Interviewers are probing whether you can design a reliable distributed scheduler that turns future-time requests into actual execution while handling scale, failures, duplicate delivery, and clock/time-zone edge cases. For Amazon-style systems, this matters because many services depend on deferred work: retries, reminders, payments, fulfillment workflows, cleanup, batch fanout, and SLA-driven automation. A strong Software Engineer answer shows you can separate job definition, schedule computation, durable persistence, dispatch, and worker execution, then reason about the failure modes between each boundary. The interviewer is not looking for a single “perfect” architecture; they want crisp tradeoffs around correctness, latency, throughput, and operability.
Core knowledge
-
One-off jobs and recurring jobs should be modeled differently. A one-off job stores
run_at; a recurring job stores a schedule expression such ascron,interval, orrate, plustimezone,next_run_at, and recurrence metadata. Store future occurrences lazily rather than materializing infinite schedules. -
Durable storage is the source of truth. Use
DynamoDB,Postgres,MySQL, or similar to persistjob_id,tenant_id,run_at,payload_ref,status,attempt_count,dedupe_key, and timestamps. A scheduler that only keeps jobs in memory loses work on restart and usually fails the reliability bar. -
Scheduling algorithms depend on scale and latency targets. A single-node min-heap works for small systems, with
O(log n)insert and pop. At larger scale, use time-bucketed partitions, database range scans onrun_at, or a hierarchical timing wheel for high-throughput near-future timers. -
Polling plus claiming is a common durable pattern. Scheduler nodes query due rows where
run_at <= now()andstatus = SCHEDULED, then atomically transition toCLAIMEDusing compare-and-swap,SELECT ... FOR UPDATE SKIP LOCKED, or conditional writes. Claiming prevents many schedulers from dispatching the same job simultaneously. -
Sharding keeps due-job scans bounded. Partition by time bucket and hash, for example
bucket = floor(run_at / 60s)andshard = hash(job_id) % N. This gives roughly and avoids one hot partition for a popular timestamp like midnight. -
Delivery semantics are usually at-least-once, not exactly-once. Network failures make “did the worker execute?” ambiguous. Design for duplicate dispatch using idempotency keys, dedupe tables, conditional state transitions, and business-level safeguards rather than promising exactly-once execution.
-
Idempotency belongs at the job effect boundary. If the job sends an email, charges a card, or updates an order, pass a stable
idempotency_keyto the downstream service. Stripe-style idempotency keys are a useful mental model: repeated requests with the same key should produce one logical effect. -
Retries need bounded policy, not infinite loops. Store
attempt_count,max_attempts,last_error, andnext_retry_at. Use exponential backoff with jitter, e.g.delay = min(base * 2^attempt, max_delay) + random_jitter, to avoid thundering herds after regional or dependency outages. -
Leases handle scheduler and worker crashes. A claimed job should have
lease_until; if a node dies, another node can reclaim it after the lease expires. Use fencing tokens or monotonically increasingclaim_versionto prevent stale workers from committing after their lease is no longer valid. -
Recurring schedules require careful time handling. Store timestamps in UTC, but preserve the user’s
timezonefor computing the next local occurrence. Daylight Saving Time creates nonexistent times, duplicated times, and “last day of month” ambiguity; state your policy explicitly. -
Queues decouple dispatch from execution. The scheduler can enqueue due work into
SQS,Kafka,RabbitMQ, or an internal work queue, while workers consume and execute. Note limits:SQSdelay queues are useful for short delays but max delay is 15 minutes, so long-term scheduling still needs durable storage. -
Observability should include correctness and latency signals. Track
schedule_lag = dispatch_time - run_at, queue depth, due jobs per shard, claim conflicts, retry rate, dead-letter count, worker success rate, andp99dispatch latency. These metrics reveal hot shards, stuck schedulers, and downstream dependency failures.
Worked example
For “Design a scalable job scheduler”, a strong candidate would start by clarifying the scale and semantics: “Are jobs one-off, recurring, or both? What is acceptable scheduling latency, seconds or minutes? Is execution at-least-once acceptable if workers are idempotent? What job volume and payload size should I assume?” Then they might declare assumptions: 100M stored jobs, 100K due per minute at peak, second-to-minute precision, and at-least-once delivery.
The answer should be organized around four pillars: API and data model, durable scheduling store, scheduler/dispatcher architecture, and worker execution with retries and idempotency. For the API, propose CreateJob, CancelJob, GetJob, and maybe CreateRecurringSchedule, with stable job_id and optional client-provided dedupe_key. For storage, describe a table indexed by run_at or time buckets, plus sharding by hash to avoid a single partition scanning all due work.
For dispatch, explain that multiple scheduler nodes poll assigned shards, atomically claim due jobs, enqueue them to a work queue, and mark them dispatched only after successful enqueue. For workers, describe idempotent execution, retry with backoff, and a dead-letter state after max_attempts. One tradeoff to flag explicitly: scanning durable storage every second is simple but expensive at very high scale; time buckets or a timing wheel reduce scans but add complexity in rebalancing and recovery. Close by saying that with more time you would drill into multi-region behavior, recurring-job DST rules, operational dashboards, and backpressure during dependency outages.
A second angle
For “Design delayed job scheduler (LLD)”, the same concepts apply, but the interviewer is likely pushing for lower-level class design and concurrency behavior. Instead of starting with multiple regions and millions of tenants, focus on interfaces like JobStore, DelayQueue, SchedulerLoop, WorkerPool, and RetryPolicy. You should be ready to discuss whether the in-memory structure is a priority queue, delay queue, or timing wheel, and how it reloads from durable storage after restart. The key distinction is that an LLD answer needs concrete state transitions, locking or atomic update behavior, and testable components rather than only boxes on an architecture diagram. You can still mention distributed concerns, but anchor them in methods such as claimDueJobs(now, limit), extendLease(jobId, token), and completeJob(jobId, token).
Common pitfalls
Pitfall: Promising exactly-once execution.
A tempting answer is “the scheduler will ensure each job runs exactly once.” In distributed systems, a scheduler can crash after dispatching but before recording success, or a worker can complete while the acknowledgment is lost. A better answer is: “The scheduler provides at-least-once delivery, and downstream effects are protected with idempotency keys and conditional writes.”
Pitfall: Treating time as simple.
Many candidates say “store a cron expression and run it at the right time” without addressing time zones, DST, leap days, or clock skew. Stronger candidates store execution timestamps in UTC, preserve the user’s intended timezone for recurrence computation, use NTP-synchronized clocks, and state a policy for nonexistent or duplicated local times.
Pitfall: Drawing components without explaining state transitions.
A box diagram with API -> DB -> Queue -> Workers is not enough. The interviewer needs to hear how a job moves from SCHEDULED to CLAIMED to ENQUEUED to RUNNING to SUCCEEDED or FAILED, and what happens if a process dies between any two steps.
Connections
This topic often pivots into distributed queues, leader election, workflow orchestration, rate limiting, and idempotent API design. Be prepared to compare a custom scheduler with managed systems such as SQS, EventBridge Scheduler, Quartz, Airflow, or Temporal, especially around durability, retries, dependencies, and operational complexity.
Further reading
-
Designing Data-Intensive Applications — excellent background on replication, partitions, logs, consistency, and failure handling in distributed systems.
-
The Google File System paper — useful for understanding leases, master coordination, and failure-tolerant distributed design patterns.
-
Amazon Builders’ Library: Timeouts, retries, and backoff with jitter — directly relevant to retry policy, overload control, and avoiding retry storms.
Practice questions
Software Engineering Fundamentals
What's being tested
Interviewers are probing whether you can turn an ambiguous product workflow into a clean object-oriented design, then make it safe under concurrent access. For a Software Engineer, this means more than naming classes: you must define responsibilities, state transitions, invariants, APIs, persistence boundaries, and failure behavior. Amazon cares because many production systems look “simple” at the feature level—tasks, carts, orders, reservations—but fail when multiple users, services, retries, or devices act on the same entity at once. A strong answer shows practical judgment: simple in-memory design when appropriate, database-backed consistency when needed, and extensibility without over-engineering.
Core knowledge
-
Domain modeling starts with nouns, verbs, and invariants. For a task system:
Task,User,Project,Comment,Assignee; for a restaurant:Table,Reservation,Order,MenuItem,Bill. The important part is not the nouns themselves, but who owns state and which transitions are legal. -
Single Responsibility Principle means each class should have one reason to change.
Ordercan own order-line state, but pricing rules often belong inPricingServiceorPriceCalculator; payment authorization belongs behind aPaymentProcessorinterface, not insideOrder. -
State machines are essential for workflows. Model states explicitly:
CREATED -> CONFIRMED -> PREPARING -> READY -> COMPLETED, orOPEN -> ASSIGNED -> IN_PROGRESS -> DONE. Define invalid transitions and enforce them centrally, preferably through methods likeorder.markReady()rather than arbitrary field mutation. -
Invariants are the rules that must always hold, even under concurrency. Examples: inventory cannot go below zero, one table cannot have two active reservations for the same time slot, a task cannot be assigned to a deleted user, and an order cannot be paid twice. State these aloud before choosing locks or transactions.
-
API design should expose use cases, not database tables. Prefer operations like
POST /orders/{id}/items,POST /tasks/{id}/assign, orPOST /reservations/{id}/cancelover genericPATCHfor every field when transitions have business rules. Include idempotency for retryable operations such as checkout. -
Composition over inheritance is usually better for extensibility. A pizza order should compose
Crust,Size,Topping, andPricingRule; avoid a class explosion likeLargeThinCrustPepperoniPizza. Use Strategy pattern for pricing, discounts, tax, or payment providers. -
Optimistic concurrency control works well when conflicts are rare. Add a
versioncolumn and update withWHERE id = ? AND version = ?; if zero rows are updated, reload and retry or return a conflict. This is common for task edits, cart updates, and profile-like entities. -
Pessimistic locking is appropriate when conflicts are likely or consequences are severe. In
Postgres,SELECT ... FOR UPDATEcan lock an inventory row during purchase. Keep transactions short; never hold a database lock while calling an external payment provider. -
Idempotency keys protect against duplicate client retries. For checkout, accept an
Idempotency-Keyheader and store the request key with the resultingOrderor payment attempt. If the same key is retried, return the original result instead of charging twice. -
Consistency boundaries should be explicit. A single aggregate, such as
OrderplusOrderItems, can often be updated transactionally. Cross-aggregate workflows—inventory reservation, payment, fulfillment—may need sagas, status fields, compensating actions, and reconciliation jobs rather than one giant transaction. -
Thread safety matters for in-memory components. Shared mutable structures like
Map<TaskId, Task>require synchronization,ConcurrentHashMap, immutable value objects, or actor-style ownership. Remember thatConcurrentHashMapmakes individual map operations safe, not multi-step invariants like “check then insert if capacity remains.” -
Event ordering matters in input systems and workflow systems. Keyboard/mouse events need monotonic sequence numbers, timestamps, and deterministic replay; order systems need append-only history for debugging. A log like
Event(seq, deviceId, type, payload, timestamp)helps reproduce bugs and reason about causality.
Worked example
For Design an OOD restaurant management system, a strong candidate would first clarify scope: “Are we designing for dine-in only, or also delivery? Do we need reservations, table assignment, ordering, kitchen status, billing, and payment? Is this single restaurant or multi-location?” Then they would declare assumptions: single location, multiple staff using terminals concurrently, and core flows covering reservation, seating, order placement, kitchen fulfillment, bill splitting, and payment.
The answer skeleton should have four pillars: domain model, state transitions, service/API layer, and concurrency controls. The domain model might include Restaurant, Table, Reservation, Party, Order, OrderItem, MenuItem, KitchenTicket, Bill, and Payment. State transitions matter: a Reservation moves from REQUESTED to CONFIRMED to SEATED or CANCELLED; an OrderItem moves from PLACED to IN_PREP to SERVED.
The candidate should explicitly separate responsibilities: TableManager assigns tables, OrderService mutates orders, KitchenService manages tickets, BillingService computes totals, and PaymentService integrates with a payment processor. A concrete tradeoff to flag is reservation concurrency: two hosts might try to assign the same table for overlapping times, so table assignment should happen inside a transaction with a uniqueness constraint or lock on the table/time slot. Another useful tradeoff is bill splitting: you can support equal split first, then item-level split later by modeling BillLineItem ownership.
A strong close would be: “If I had more time, I’d add audit logs for staff actions, offline-terminal sync, menu availability windows, and failure handling around partial payment authorization.”
A second angle
For Design a keyboard and mouse input system, the same design skill applies, but the domain is lower-level and more event-driven. Instead of modeling business aggregates like Order and Reservation, you model InputEvent, KeyboardEvent, PointerEvent, FocusTarget, TextComposition, and EventDispatcher. The key invariant is not “do not oversell inventory,” but “deliver events in deterministic order to the correct focused component while preserving composition and pointer semantics.”
Concurrency shows up differently: hardware events, UI thread dispatch, accessibility hooks, and replay logs may operate on different threads. A strong design would use an event queue with sequence numbers, immutable event objects, and a single-threaded dispatch loop or carefully synchronized handoff. The transferable concept is the same: define ownership of mutable state, legal transitions, ordering guarantees, and the boundary where external inputs become internal state changes.
Common pitfalls
Pitfall: Starting with a class diagram before clarifying workflows.
A tempting weak answer is to list classes like Book, User, Cart, Order, or Task immediately. That misses what interviewers care about: behavior under real operations. Start with 3–5 core flows, then derive classes and methods from those flows.
Pitfall: Treating concurrency as “add a lock” without naming the invariant.
Saying “I’ll synchronize the method” is too vague. A better answer is: “The invariant is that inventory cannot go negative; I’ll enforce it with a transaction and conditional update, UPDATE inventory SET quantity = quantity - 1 WHERE sku = ? AND quantity > 0, then check affected rows.”
Pitfall: Overusing inheritance and patterns.
Designs like VegPizza extends Pizza, CheesePizza extends VegPizza, and LargeCheesePizza extends CheesePizza become brittle fast. Prefer small interfaces, composition, and strategies: Pizza has Size, Crust, List<Topping>, and pricing rules that can evolve independently.
Connections
Interviewers often pivot from this topic into system design, especially when an object model becomes a service with persistence, caching, and APIs. They may also pivot into database transactions, distributed consistency, idempotent API design, or design patterns such as Factory, Strategy, Observer, Repository, and State.
Further reading
-
Design Patterns: Elements of Reusable Object-Oriented Software — the classic source for Strategy, Observer, Factory, and State patterns.
-
Martin Fowler, Patterns of Enterprise Application Architecture — practical patterns for service layers, repositories, transactions, and domain models.
-
Stripe API Idempotent Requests — clear real-world explanation of retry-safe API design for payment-like workflows.
Practice questions
Take-home Project
Behavioral & Leadership

What's being tested
Amazon behavioral interviews test whether you can demonstrate Leadership Principles through concrete engineering judgment, not whether you can recite values. For a Software Engineer, the interviewer is probing how you debug ambiguous systems, make tradeoffs under pressure, own production outcomes, raise quality bars, and work across teams without authority. Strong answers show specific technical context, measurable impact, and personal accountability: what you did, why it mattered, what alternatives you considered, and what you learned. Expect deep follow-ups on details like failure modes, metrics, incident timelines, code-review decisions, and whether your actions would scale beyond one heroic fix.
Core knowledge
-
STAR means Situation, Task, Action, Result. For Amazon, the Action section should be the longest: roughly 10% context, 10% goal, 60% your decisions/actions, 20% measurable results and lessons.
-
Ownership stories should show you acted beyond your narrow ticket. Good SWE examples include fixing a flaky
`CI`pipeline, improving an unowned service’s`p99`latency, writing a runbook after an incident, or taking responsibility for a bad deployment even when another team contributed. -
Dive Deep requires technical specificity. Be ready to explain logs, dashboards, traces, database queries, thread dumps, memory profiles, cache behavior, retries, timeouts, race conditions, and why the first plausible root cause was wrong.
-
Customer Obsession for engineers usually maps to reliability, latency, correctness, accessibility, security, or developer experience. Tie work to concrete outcomes: reduced checkout errors from
`1.2%`to`0.3%`, cut`p95`API latency from`900ms`to`220ms`, or reduced support tickets by`35%`. -
Bias for Action is not “move fast and break things.” Strong answers show bounded risk: feature flags, canary deployments, rollback plans, read-only migrations, dark launches, rate limits, or temporary mitigations while a permanent fix is built.
-
Insist on the Highest Standards should include quality mechanisms, not perfectionism. Mention code reviews, test coverage, load testing, static analysis,
`SLO`alerts, backward-compatible APIs, migration validation, or eliminating an entire class of bugs through tooling. -
Invent and Simplify is especially relevant when you reduce operational complexity. Examples: replacing manual deployments with one-click pipelines, consolidating duplicate services, simplifying an API contract, removing a brittle dependency, or turning tribal knowledge into automation.
-
Have Backbone; Disagree and Commit should show respectful escalation using evidence. For SWE, evidence might include load-test results, error-budget burn, security risk, operational burden, or a prototype comparing two designs. Once a decision is made, show you supported execution.
-
Deliver Results stories need constraints. State the deadline, dependency, ambiguity, or resource limit. Then explain prioritization: what you cut, deferred, automated, delegated, or simplified to ship safely without hiding quality risks.
-
Earn Trust is demonstrated through transparency and follow-through. In engineering stories, this can mean posting an incident update, admitting a regression you caused, documenting tradeoffs, giving credit, asking for review early, or making a decision visible in an
`ADR`. -
Measured impact matters even when the story is behavioral. Use engineering metrics such as
`MTTR`,`MTTD`, deployment frequency, rollback rate,`p95`/`p99`latency, error rate, availability, test flakiness, build time, memory usage, CPU utilization, or defect escape rate. -
Follow-up readiness is critical. Interviewers often ask: “What exactly did you do?”, “What data did you inspect?”, “Who disagreed?”, “What would you do differently?”, “How did you know it worked?”, and “Was this really your contribution?”
Worked example
For “Answer Dive Deep and Ownership in LP interview”, a strong candidate should frame the story in the first 30 seconds as a production or near-production problem with measurable customer or operational impact. For example: “I’ll use a story where our order-status API had intermittent `5xx` spikes after a deployment; I was not the original owner, but I was on call and drove root cause, mitigation, and prevention.” Clarify the scale briefly: request volume, severity, affected customers, and whether there was an `SLA` or `SLO` at risk.
Organize the answer around four pillars: first, detection and triage; second, technical investigation; third, mitigation and communication; fourth, long-term prevention. In the investigation pillar, go beyond “I checked logs” and name the actual reasoning path: comparing deploy timestamps with error spikes, isolating one dependency, finding retry amplification, verifying with traces, and reproducing under load. A good tradeoff to flag is mitigation versus root cause: “I rolled back first to stop customer impact, then used the failed build in staging to preserve evidence and continue debugging.”
The result should include numbers: `5xx` rate dropped from `8%` to baseline, `MTTR` was `37 minutes`, and a follow-up change reduced similar incidents by adding timeout budgets, circuit breakers, and an alert on dependency saturation. Close with learning: “If I had more time, I would have added pre-production load tests for retry storms earlier and documented a dependency-failure checklist for the on-call rotation.” This shows you owned both the immediate fix and the systemic improvement.
A second angle
For “Answer Amazon-style leadership deep dives”, the same storytelling discipline applies, but the interviewer may focus less on one incident and more on decision-making, conflict, or influence. Suppose the story is about disagreeing with a proposed synchronous service call in a checkout path. The framing should emphasize the engineering tradeoff: synchronous simplicity versus availability risk, latency budget, and blast radius. Instead of centering the narrative on logs and root cause, center it on evidence, stakeholder alignment, and commitment after the decision. A strong answer would mention a prototype, `p99` latency estimates, failure-mode analysis, and how you either persuaded the team or committed to the chosen design while adding safeguards.
Common pitfalls
Pitfall: Giving a generic values answer instead of an engineering story.
A weak answer says, “I always take ownership and communicate well.” A stronger answer names the service, the bug, the metric, the decision you made, and the result. Amazon interviewers will keep drilling until they can separate your actual contribution from the team’s general effort.
Pitfall: Over-indexing on heroics and under-indexing on mechanisms.
“I stayed up all night and fixed production” may sound committed, but it can also signal fragile operations. Pair urgency with scalable mechanisms: alerts, runbooks, tests, deployment guards, code ownership, post-incident reviews, and prevention of repeat failures.
Pitfall: Hiding conflict, mistakes, or tradeoffs.
Perfect stories often sound rehearsed and shallow. It is usually better to say, “My first hypothesis was wrong,” “I shipped a mitigation with known limitations,” or “I disagreed with the design because of `p99` latency risk.” Then show how you used data, communicated clearly, and improved the system.
Connections
Interviewers often pivot from behavioral stories into system design, especially around reliability, scalability, and operational excellence. They may also ask for debugging deep dives, code quality tradeoffs, incident management, or cross-team design negotiation. Prepare each story so it can support both a leadership principle discussion and a technical follow-up.
Further reading
-
Amazon Leadership Principles — official source for the principles and the language interviewers use.
-
The Site Reliability Workbook — practical mechanisms for incident response, reliability, alerting, and operational ownership.
-
How Complex Systems Fail — Richard I. Cook — useful mental model for explaining incidents without oversimplifying root cause.
Practice questions
Coding & Algorithms

What's being tested
Array/string counting questions test whether you can turn brute-force comparisons into linear or near-linear passes using prefix/suffix state, two pointers, hash maps, and frequency vectors. Interviewers are probing correctness under duplicates, zeros, negative values, collisions, and boundary cases—not just happy-path O(n) code.
Patterns & templates
-
Prefix/suffix accumulation — compute left-to-right and right-to-left products/counts in
O(n)time; avoid division and handle zeros explicitly. -
Two-pointer scan on sorted arrays — move
l/rbased on sum comparison; skip duplicate values when returning unique pairs. -
Frequency map counting — use
dict,HashMap, orCounterfor character/item counts; compare vectors inO(k)wherekis alphabet size. -
Fixed alphabet arrays — prefer
int[26]orint[128]for lowercase/ASCII strings; faster and simpler than hash maps when domain is bounded. -
Partition state tracking — maintain left/right distinct-character counts as a split moves; update counts carefully when a frequency reaches zero.
-
Modulo reasoning — maximize distinct remainders by tracking used residues in a set; remember residues range from
0tok - 1. -
Hash map internals — know hashing, buckets, collisions, load factor, resizing, and worst-case
O(n)lookup; mention concurrency concerns when relevant.
Common pitfalls
Pitfall: Using division in product-except-self fails when zeros appear and may violate the stated constraint.
Pitfall: Returning duplicate pairs from a sorted array because you advance pointers but do not skip repeated values.
Pitfall: Treating hash map operations as always
O(1)without acknowledging collisions, resizing cost, and adversarial keys.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
Dynamic programming here means recognizing overlapping subproblems, defining a compact state, and proving the recurrence before coding. Interviewers are probing whether you can move between top-down memoization, bottom-up tabulation, graph ordering, and string segmentation without double-counting or exponential blowups.
Patterns & templates
-
Top-down memoization with
`dfs(i, remaining)`or`dfs(index)`— cache states in`dict`; usually reduces exponential recursion toO(states * transition_cost). -
Subset-sum/count DP — transform target-sign problems into
sum(P) = (total + target) / 2; handle parity, negative targets, and zeros carefully. -
Word break DP —
dp[i] = any(dp[j] and s[j:i] in words);O(n^2)substrings, often improved with max word length. -
Trie-guided segmentation — walk forward from each valid
`i`through a`Trie`; avoids checking impossible prefixes and reduces wasted substring hashing. -
DAG dynamic programming — process nodes in topological order; for bounded path length use
dp[steps][node]or rolling arrays forO(V)space. -
Concatenated words template — sort by length, build a
`set`incrementally, and run word-break while preventing the whole word from counting as itself. -
Counting vs existence —
sum(...)for number of ways,any(...)for feasibility,parent/backtracking arrays for producing actual segmentations.
Common pitfalls
Pitfall: Treating zeros like normal numbers in target-sum counting; each zero doubles the number of valid expressions.
Pitfall: Using plain recursion for word segmentation or DAG paths; without memoization, repeated suffixes or subpaths explode exponentially.
Pitfall: Returning
Truefor a concatenated word because it exists in the dictionary; require at least two smaller component words.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
These problems test greedy choice, priority-queue invariants, and efficient array reasoning under ordering constraints. Interviewers are probing whether you can replace brute-force scans or pairings with O(n log n) or O(n) structures, justify correctness, and handle duplicates, empty inputs, and boundary cases cleanly.
Patterns & templates
-
Monotonic stack for nearest-smaller relationships —
O(n)time,O(n)space; decide strict<versus<=before coding duplicates. -
Fenwick tree or segment tree for rightmost-smaller queries — coordinate-compress values, store max index, query values
< a[i]inO(log n). -
Two heaps for streaming median — max-heap lower half, min-heap upper half; rebalance after every
addNum()to keep sizes within one. -
Greedy resource allocation — sort counts/capacities, consume smallest feasible requirement first; prove exchange argument, not just “seems optimal.”
-
Heap scheduling template — sort events by start time, push active candidates into
heapq, pop expired or highest-priority item; usuallyO(n log n). -
Pairing optimization — sort both sides or sort by constraint then use a heap; watch whether objective is max sum, min operations, or feasibility.
-
Range-update minimization — reason on adjacent differences, not full arrays; difference arrays often turn repeated interval operations into linear accounting.
Common pitfalls
Pitfall: Treating “nearest smaller” and “rightmost smaller” as the same problem; the former is usually stack-based, the latter often needs indexed value queries.
Pitfall: Using Python
heapqas a max-heap without negating priorities, or forgetting tie-breaking when equal priorities affect deterministic output.
Pitfall: Giving a greedy algorithm without a correctness argument; state the invariant or exchange proof before moving to implementation details.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions