Airbnb Software Engineer Interview Prep Guide
Everything Airbnb actually asks Software Engineer candidates — concept walkthroughs, worked examples, and the real interview questions, drawn from candidate reports. Free to read.
Last updated

Technical Screen
System Design

What's being tested
This tests whether you can design a low-latency search API that combines correctness-heavy interval logic with scalable backend architecture. Airbnb cares because split stays expand bookable inventory: if no single listing is available for an entire trip, two listings may satisfy the trip with one checkout/check-in boundary. The interviewer is probing for API clarity, availability data modeling, efficient pair generation, deterministic ordering, pagination, deduplication, caching, and edge-case correctness. A strong Software Engineer answer should connect algorithmic interval reasoning to production constraints like `p99` latency, index freshness, and stable results across pages.
Core knowledge
-
Split stay means a guest’s requested trip
[start_date, end_date)is divided at a boundarysplit_date, producing two contiguous stays: listing A covers[start_date, split_date)and listing B covers[split_date, end_date). Treat dates as half-open intervals to avoid off-by-one checkout/check-in bugs. -
Availability intervals should be modeled as normalized half-open ranges, for example
{listing_id, available_start, available_end}whereavailable_start < available_end. A listing can satisfy a segment ifavailable_start <= segment_startandavailable_end >= segment_end. -
Minimum-stay and maximum-stay rules must be applied per listing segment, not just to the full trip. For a split at
s, nights are and ; validatemin_nights <= n_i <= max_nightsfor each listing. -
Candidate generation usually starts by finding listings available for the left segment and listings available for the right segment, then joining them. If there are left candidates and right candidates, naive pairing is , so ranking cutoffs and filters matter.
-
Interval indexing can use sorted range indexes, segment trees, interval trees, or search-engine filters. In a relational store like
`Postgres`, range types plus`GiST`indexes can support overlap/containment queries; in a search service, precomputed availability bitsets by date can make containment checks fast. -
Bitset availability is powerful for bounded booking horizons, such as 365–540 days. Represent each listing’s available nights as a bit vector; checking whether a segment is fully available becomes a masked comparison. This trades storage and update complexity for fast query-time filtering.
-
Deduplication depends on whether pair order matters. For a fixed split,
(A first, B second)is not equivalent to(B first, A second)because the stay order differs. But if the API aggregates across multiple split dates, dedupe by canonical keys like(first_listing_id, second_listing_id, split_date)and excludeA == Bunless single-listing fallback is allowed. -
Deterministic pagination requires a stable sort key, not offset over a changing result set. Use cursor pagination with a tuple such as
(rank_score, first_listing_id, second_listing_id, split_date)and encode the last seen tuple. Results should be reproducible for the same request and index snapshot. -
Ranking should be separated from correctness. First generate valid pairs, then rank by business-neutral technical factors such as price, distance between listings, review quality, availability confidence, or listing score. In a SWE interview, focus on where ranking plugs in, not deep ML model design.
-
Caching works best at multiple levels: availability segment lookups, listing metadata fetches, and final query responses for popular destinations/date windows. Cache keys must include location, dates, guests, filters, currency, and versioned availability index state; stale availability must be bounded because booking correctness is user-visible.
-
Freshness and consistency are central tradeoffs. Search can use an eventually consistent availability index for speed, but the booking flow must revalidate both listing segments against the source of truth before checkout. Phrase this as “fast search, authoritative confirmation.”
-
API validation should reject invalid ranges, impossible trip lengths, unsupported guest counts, too-wide date windows, malformed cursors, and filters that exceed service limits. Return structured errors with stable codes like
`INVALID_DATE_RANGE`,`CURSOR_EXPIRED`, or`NO_VALID_SPLIT_BOUNDARY`.
Worked example
For Design API for split-stay combinations, a strong candidate should first clarify the contract: “Are we returning pairs for one explicit split date or should the service consider all valid split dates between check-in and checkout? Do listings need to be geographically close? Should the same listing be excluded? What freshness guarantee is expected before booking?” Then declare assumptions: trip dates are half-open, the search horizon is bounded to about one year, and final booking revalidates availability.
The answer can be organized around four pillars. First, define the HTTP interface: `GET /v1/split-stays/search?location_id=...&checkin=...&checkout=...&guests=...&limit=...&cursor=...`, returning pairs with first_listing, second_listing, split_date, segment prices, and a next_cursor. Second, describe the data model: listing metadata, normalized availability intervals or bitsets, rule constraints, and denormalized search documents. Third, explain the query flow: enumerate valid split dates, fetch candidates for left and right segments, join under constraints, score, dedupe, and paginate. Fourth, cover production concerns: caching, snapshot-aware cursors, index freshness, and authoritative revalidation.
One explicit tradeoff to flag is precompute versus query-time generation. Precomputing all possible pairs explodes roughly as listings squared per market and per date window, while query-time generation may be expensive for large markets. A balanced design precomputes per-listing availability/search indexes, then generates top pair candidates at query time with cutoffs. Close by saying: “If I had more time, I’d discuss geo-distance constraints, price consistency, and how to degrade gracefully when candidate counts are huge.”
A second angle
For Generate split-stay pairs efficiently, the same concept shifts from service design to algorithm design. Instead of starting with endpoints and caches, start with sorted availability intervals and ask how to avoid scans over all listings. One approach is to iterate possible split dates, build left-eligible and right-eligible candidate lists using interval containment checks, then generate only top-ranked or filtered cross-products. If the input is sparse intervals, an interval tree or sorted sweep over starts/ends can reduce repeated scans. The key transfer is the same: correctness comes from half-open interval containment, while scalability comes from bounding candidate generation before pair enumeration.
Common pitfalls
Pitfall: Treating availability as simple date overlap.
Overlap is insufficient: a listing available for [Jan 1, Jan 5) does not satisfy a requested segment [Jan 3, Jan 8). Say “containment” explicitly and use half-open interval notation; it signals you understand checkout-date semantics.
Pitfall: Designing pagination with
`offset`after ranking dynamic results.
Offset pagination can skip or duplicate pairs when availability changes or ranking scores shift. A better answer uses cursor pagination over a deterministic composite sort key and, ideally, ties the cursor to an index snapshot or version.
Pitfall: Jumping straight to “use
`Elasticsearch`” without explaining pair generation.
A search engine can filter listings, but it does not automatically solve split-boundary enumeration, cross-product explosion, deduplication, or booking-time revalidation. Lead with the algorithm and data model, then mention search infrastructure as an implementation choice.
Connections
Interviewers may pivot from here to calendar availability systems, range indexing, cursor-based pagination, search ranking architecture, or booking consistency. Be ready to discuss how final reservation creation prevents double booking, how availability updates invalidate cached search results, and how to cap combinatorial explosion in large markets.
Further reading
-
Designing Data-Intensive Applications — Excellent background on indexes, caching, consistency, and tradeoffs in read-heavy distributed systems.
-
PostgreSQL Range Types Documentation — Useful reference for thinking precisely about interval containment and indexed range queries.
-
Elasticsearch Search After — Practical model for stable cursor-style pagination over ranked search results.
Practice questions
-
Multi-Channel Notifications And Watchlists — covered in depth under Onsite below.
-
Wallets, Payments, And Refund Ledgers — covered in depth under Onsite below.
-
Fault Tolerance, Idempotency, And Concurrency Control — covered in depth under Onsite below.
Coding & Algorithms
-
Connect Four And Grid Game Engines — covered in depth under Take-home Project below.
-
Graph Search, State Space, And Path Optimization — covered in depth under Onsite below.
-
Dynamic Programming, Scheduling, And Set Cover — covered in depth under Onsite below.
-
String Processing, Parsing, And Output Formatting — covered in depth under Onsite below.
Behavioral & Leadership
- Product Sense, Behavioral Leadership, And Values — covered in depth under Onsite below.
Onsite
Coding & Algorithms
- Connect Four And Grid Game Engines — covered in depth under Take-home Project below.

What's being tested
These problems test graph traversal, state-space modeling, and path optimization under constraints. You need to choose between DFS + memoization, BFS over encoded states, and DAG dynamic programming, then justify correctness, reconstruction, and O(V + E)-style complexity.
Patterns & templates
-
DFS + memoization for longest decreasing grid paths —
dfs(r,c)returns best suffix;O(R*C)states, avoid revisiting recomputation. -
Topological DP on DAGs — process nodes in topo order; update
dp[v] = max(dp[v], dp[u] + reward - cost). -
BFS over state space — encode
(row, col, keysMask)for shortest key collection; first time reaching full mask is optimal. -
Bitmask state encoding — map keys to bits using
1 << k; visited becomesvisited[r][c][mask], oftenO(R*C*2^K). -
Path reconstruction — store
parent[state] = prevStateornext[node]; reconstruct after DP/BFS without polluting scoring logic. -
Grid graph idiom — use
dirs = [(1,0),(-1,0),(0,1),(0,-1)]; bounds-check before height, wall, lock, or visited checks. -
Score vs distance distinction — BFS minimizes unweighted steps; weighted DAG maximization needs DP, not greedy local best or plain BFS.
Common pitfalls
Pitfall: Treating
(r, c)as visited in key-collection problems is wrong; the same cell with different keys is a different state.
Pitfall: Using DFS without memoization on decreasing-path grids can explode exponentially despite the grid having only
R*Cmeaningful states.
Pitfall: Forgetting to prove acyclicity before DAG DP; if cycles exist, longest/max-score path may be undefined or require different algorithms.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
These problems test dynamic programming for combinatorial optimization, especially when brute force over subsets or schedules is too large. Expect to recognize weighted interval scheduling, knapsack/subset-cover search, and memoized multidimensional state while explaining correctness, tie-breaking, and complexity.
Patterns & templates
-
Weighted interval scheduling — sort jobs by end time; use
bisect_rightto find previous compatible job; recurrencedp[i] = max(dp[i-1], profit[i] + dp[p(i)]). -
Inclusive/exclusive interval handling — clarify whether a job ending at
tcan precede one starting att; this changesbisect_rightvsbisect_left. -
Set cover via bitmask DP — map required services to bits; update
dp[mask | providerMask] = minCost; works well when services 20–25. -
Backtracking with pruning — enumerate combinations for small
n; sort by cost, prune when current cost exceeds best, and use deterministic tie-breaking. -
Knapsack-style capacity DP — for capacity-constrained property selection, track reachable sums/counts; reconstruct minimal set and handle exact vs at-least constraints.
-
Memoization on multidimensional state — use
@lru_cacheover(idx, remainingA, remainingB, remainingC)or normalized tuples; prune dominated states aggressively. -
Cost/profit overflow awareness — use 64-bit integers conceptually; Python is safe, but mention
long long/int64in typed languages.
Common pitfalls
Pitfall: Treating interval scheduling as greedy by highest reward or shortest duration; weighted scheduling needs DP because local choices can block better combinations.
Pitfall: Forgetting deterministic tie-breaking when multiple minimal covers or property sets exist; define order before coding.
Pitfall: Using exponential subset enumeration when the intended constraint supports bitmask DP, binary search, or memoization.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
This tests string parsing, canonicalization, and exact output construction under edge cases, plus efficient lookup-based matching for palindrome pair variants. Interviewers look for clean decomposition into helpers like isPalindrome, splitQuery, formatRow, and for reasoning about correctness, complexity, and malformed inputs.
Patterns & templates
-
Palindrome pair lookup — store reversed words in a
HashMap; test every split withisPalindrome(prefix/suffix); targetO(total_chars * word_len). -
Trie-based palindrome matching — insert reversed words into a trie, carrying palindrome-suffix indices; improves repeated prefix matching but increases implementation risk.
-
Delimiter-aware parsing — scan character-by-character instead of blind
split('&')when quotes, empty values, or escaped delimiters may appear. -
Type inference parser — normalize values in order: quoted string, boolean flag, integer/float, raw string; document ambiguous cases like
001. -
Duplicate-key accumulation — map
key -> valuefor first occurrence, then promote tokey -> list; keep insertion order if output is tested. -
Fixed-width formatting — compute column widths first, then render with
ljust/padding; verify borders, trailing spaces, and empty-cell behavior exactly. -
State encoding — for key/path variants, represent visited state as
(position, keyMask); BFS gives shortest path inO(V * 2^K + E * 2^K).
Common pitfalls
Pitfall: Treating palindrome pairs as only
word + reverse(word)misses split cases like"lls"+"s"and empty-string combinations.
Pitfall: Using naive
split('=')orsplit('&')breaks quoted values, missing values, boolean flags, and duplicate parameters.
Pitfall: Output-formatting problems often fail on whitespace, not logic; compare exact strings and test single row, empty input, and uneven columns.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
System Design

What's being tested
These prompts test whether you can design a high-scale distributed notification workflow that turns product events into reliable, personalized deliveries across email, push, SMS, and in-app surfaces. For Airbnb, this matters because notifications often sit on the critical path of booking, host response, fraud, cancellations, price drops, and availability alerts; missed, duplicated, or late messages directly harm trust. The interviewer is probing for system decomposition, event-driven architecture, data modeling, idempotency, rate limiting, preference handling, retry semantics, and operational thinking around `p99` latency and provider failures. Strong answers balance correctness and reliability without over-engineering every component into exactly-once fantasy.
Core knowledge
-
Event-driven architecture is the default shape: product services emit events such as
`ReservationConfirmed`,`ListingAvailable`, or`MessageReceived`to a durable log like`Kafka`,`Pulsar`, or`Kinesis`; notification workers consume, fan out, template, and deliver. This decouples product write paths from slow external providers. -
Notification intent should be modeled separately from delivery attempt. An intent says “user
`U`should receive notification`N`for event`E`”; attempts track channel-specific sends like email, push, and SMS. This separation makes retries, auditing, and multi-channel fallback much easier. -
Idempotency is non-negotiable. Use a deterministic key such as
`hash(user_id, event_id, notification_type, channel)`and enforce uniqueness in`Postgres`,`DynamoDB`, or`Redis`before sending. Stripe-style`Idempotency-Key`semantics prevent duplicate messages when consumers retry after timeouts. -
At-least-once delivery is realistic; exactly-once delivery is usually a presentation-layer illusion. Message brokers and workers may redeliver, so design consumers to be idempotent. If the broker guarantees ordered partitions, ordering only holds per partition key, not globally.
-
Fanout strategy depends on scale. For small audiences, perform fanout-on-write: create delivery rows immediately. For huge campaigns or broad watchlist matches, use batch workers or topic-based fanout. A rough capacity estimate is where peak factor is often
`5x`to`20x`. -
User preferences and policy checks should be central, not scattered across product teams. Store per-user, per-channel, per-notification-type settings:
`marketing_email=false`,`booking_sms=true`,`quiet_hours=22:00-08:00`,`locale=en-US`. Transactional notifications may bypass some marketing preferences but still need channel validity and legal constraints. -
Rate limiting applies at multiple levels: per user, per notification type, per channel, and per provider. Use token buckets in
`Redis`for “no more than 3 promotional pushes per day” and provider-level throttles for`SendGrid`,`Twilio`,`APNs`, or`FCM`quotas. Decide whether excess traffic is dropped, delayed, or downgraded. -
Retry policy should distinguish transient from permanent failures. HTTP
`429`,`500`, and network timeouts get exponential backoff with jitter; invalid phone numbers, hard email bounces, and revoked push tokens should mark the channel unavailable. Send retries to a dead-letter queue after bounded attempts. -
Template rendering needs versioning and localization. Store templates by
`template_id`,`version`,`locale`, and channel, with required variables validated before enqueueing. Rendering can happen before queueing for auditability or at send time for fresher data; the tradeoff is reproducibility versus freshness. -
Watchlist matching often involves date ranges and availability intervals. A rental watchlist might store
(user_id, location_id, start_date, end_date, guests, price_max). Matching can be implemented with inverted indexes by location/date bucket,`Postgres``GiST`indexes over range types,`Elasticsearch`filters, or precomputed daily buckets when query patterns are simple. -
Time zones and DST are common traps. Store instants in UTC, but interpret user-facing dates and quiet hours in the listing or user time zone. A “check-in date” is not the same as a UTC timestamp; date-range availability should use local calendar semantics to avoid off-by-one errors near DST transitions.
-
Observability should cover the full funnel:
`events_received`,`intents_created`,`eligible_after_preferences`,`queued`,`sent_to_provider`,`provider_accepted`,`delivered`,`opened`,`failed`, and`deduped`. Track`p50/p95/p99`enqueue-to-send latency, DLQ size, retry counts, and provider error rates by channel.
Worked example
For Design rental watchlist and notification system, a strong candidate starts by asking clarifying questions: are users watching exact listings or flexible criteria, how fresh must alerts be, how many active watchlists exist, and are notifications transactional or promotional? Then they declare assumptions, for example: 50 million active watchlists, availability changes are event-driven, users can specify location, dates, guests, price, and preferred channels, and alerts should usually arrive within a few minutes.
The answer can be organized around four pillars: data model, matching pipeline, notification pipeline, and correctness/operations. The data model includes `Watchlist(user_id, criteria, date_range, timezone, status)` and an availability source emitting `ListingAvailabilityChanged` events. The matching pipeline consumes availability changes, looks up candidate watchlists using location and date indexes, filters by guests/price/rules, and emits `WatchlistMatched` intents. The notification pipeline checks preferences, deduplicates by (watchlist_id, listing_id, available_date_range, notification_type), renders templates, and sends through push/email/SMS workers.
A key tradeoff to flag is event-driven matching versus periodic batch scans. Event-driven matching gives lower latency and lower average work, but it is more sensitive to missed events and requires replay/backfill from the event log; batch scans are simpler and safer for reconciliation but can be expensive and less timely. A polished close would say: “If I had more time, I’d add replay tooling, DLQ reprocessing, multi-region failover, and a reconciliation job that verifies watchlists against current availability to catch missed events.”
A second angle
For Design a multi-channel notification system, the watchlist-specific matching layer becomes less important, and the generic delivery platform becomes the center of the design. The core abstraction is a `NotificationRequest` API used by many product services, with fields such as `recipient_id`, `notification_type`, `event_id`, `priority`, `template_variables`, and `allowed_channels`. The interviewer is likely to push harder on channel fallback, provider integrations, preference enforcement, and global rate limits. You can reuse the same principles: durable queues, idempotent consumers, centralized preferences, per-channel delivery attempts, retries with DLQs, and full-funnel observability. The main constraint shift is from “how do we find who should be notified?” to “how do we reliably deliver many different notification types without spamming or duplicating users?”
Common pitfalls
Pitfall: Treating notifications as a synchronous API call from the product service.
A tempting answer is “booking service calls email/SMS providers directly after a booking.” That couples user-facing latency and availability to third-party providers, makes retries unsafe, and spreads preference logic across services. A better answer persists an intent, publishes to a durable queue, and lets dedicated workers handle delivery asynchronously.
Pitfall: Claiming exactly-once delivery without explaining idempotency.
Interviewers often probe this by asking what happens if a worker sends an SMS but crashes before committing the offset. The correct framing is that external side effects are not truly exactly-once; you approximate correctness with idempotency keys, delivery records, provider request IDs when supported, and dedupe windows.
Pitfall: Skipping edge cases around user experience and correctness.
A shallow design may cover `Kafka` and workers but ignore quiet hours, unsubscribes, invalid push tokens, DST, duplicate watchlist matches, provider throttling, and delayed retries. The stronger answer explicitly separates eligibility, dedupe, send attempts, and user-visible audit trails, then names the failure modes each layer handles.
Connections
Interviewers may pivot from here into rate limiter design, feed fanout, event sourcing, distributed job queues, or calendar availability modeling. For Airbnb-style systems, expect follow-ups on multi-region reliability, `Kafka` partitioning strategy, API idempotency, and how to debug a spike in duplicate or delayed notifications.
Further reading
-
Designing Data-Intensive Applications — Martin Kleppmann’s chapters on logs, replication, and stream processing are directly relevant to durable notification pipelines.
-
Kafka: a Distributed Messaging System for Log Processing — useful background on append-only logs, partitions, and consumer groups.
-
Stripe API idempotency docs — practical reference for request-level idempotency patterns that apply well to notification sends.
Practice questions

What's being tested
This area tests whether you can design and implement financially correct transaction systems: ledgers that preserve money movement history, refund allocation logic that is deterministic and auditable, and payment workflows that survive retries, partial failures, and currency edge cases. Airbnb cares because bookings involve guest charges, host payouts, service fees, taxes, cancellations, disputes, coupons, credits, and refunds across many payment methods and currencies. The interviewer is probing for disciplined engineering: append-only records instead of mutable balances, explicit invariants, idempotency, ordering rules, and a clear separation between payment processor state and Airbnb’s internal source of truth. Strong answers combine system design with concrete algorithms for allocating refundable amounts across prior payments.
Core knowledge
-
Append-only ledger design is the foundation. Never model money by overwriting a single balance column as the source of truth; store immutable ledger entries with
account_id,currency,amount,direction,transaction_id,created_at, and metadata. Balances are derived as . -
Double-entry accounting means every movement has equal and opposite entries, so total system balance remains conserved. For example, guest payment capture debits
GuestReceivableand creditsAirbnbCash; a refund reverses cash out and guest liability. Invariant: total debits must equal total credits per transaction. -
Idempotency keys prevent duplicate charges, refunds, and ledger postings during retries. Use a client- or workflow-generated key such as
refund_request_id, enforce uniqueness inPostgres, and return the original result on replay.Stripe’s common pattern is: same idempotency key plus same parameters equals same response. -
Refund allocation is usually a deterministic greedy algorithm over refundable payment records. Filter payments with remaining refundable balance, group by method or processor constraints, sort by priority and recency, then allocate
min(remaining_refund, payment_remaining). Complexity is typicallyO(n log n)for sorting orO(n)if inputs are already ordered. -
Refundable balance should be derived from ledger history, not trusted from stale cached fields. For a payment :
Cache it for performance only if you can rebuild and reconcile it from immutable entries. -
Prioritization rules must be explicit and stable. Common rules: refund to original payment method before wallet credit, refund promotional credits last or never as cash, prefer most recent payment first due to processor limits, and respect card-network refund windows. Add a stable tie-breaker like
payment_idto avoid nondeterministic outputs. -
Partial and multiple refunds require transaction-level state machines. A booking might be
PAID -> PARTIALLY_REFUNDED -> REFUNDED, while each payment can independently beCAPTURED,PARTIALLY_REFUNDED,REFUND_PENDING,REFUND_FAILED, orREFUNDED. Do not conflate booking state with payment-instrument state. -
Concurrency control matters when two refund requests race. Use database transactions, row-level locks such as
SELECT ... FOR UPDATE, optimistic version checks, or a serialized workflow engine. The key invariant is that total allocated refunds cannot exceed total refundable amount, even under duplicate requests and retries. -
External payment processors are not the internal ledger. Processor calls can timeout after succeeding, webhooks can arrive out of order, and refunds may settle days later. Record internal intent, call the processor idempotently, then reconcile asynchronous events into ledger entries without double-posting.
-
Holds, authorizations, captures, and payouts are different money states. An authorization reserves funds but is not cash captured; a capture moves money; a hold may reserve wallet balance; a payout sends funds to a host. Model each as separate ledger events or account transitions rather than boolean flags.
-
Multi-currency systems need currency-aware arithmetic. Store amounts in minor units, such as cents, with
currencyon every ledger row. Never sum across currencies without an FX conversion event. If converting, storesource_amount,source_currency,fx_rate,target_amount,target_currency, rounding mode, and rate timestamp. -
Rounding and allocation need deterministic residual handling. If splitting $100.00 across three parties, integer minor-unit math may leave a one-cent remainder. Assign residuals by a documented rule, such as largest remainder method or deterministic ordering, so repeated computations produce identical ledger entries.
Worked example
For Design an Airbnb wallet with holds and payouts, a strong candidate first clarifies scope: “Are we designing the internal wallet ledger, payment processor integrations, or both? Do we need multi-currency? Are wallet balances spendable by guests, payable to hosts, or both? What consistency guarantees are required for double-spend prevention?” Then declare assumptions: one wallet per user per currency, append-only ledger as the source of truth, and Postgres transactions for the core write path.
The answer can be organized around four pillars. First, define the data model: wallet_account, ledger_transaction, ledger_entry, hold, payout, and idempotency_key. Second, describe core flows: add funds, place hold for booking, capture hold, release hold, refund, and host payout. Third, explain correctness mechanisms: double-entry postings, row locks on wallet accounts or holds, idempotent APIs, and reconciliation against processor webhooks. Fourth, cover operational concerns: balance materialization, audit trails, failure recovery, and alerts for ledger imbalance.
A key tradeoff is whether to calculate balances on demand from ledger rows or maintain a materialized available_balance and held_balance. On-demand reads are simpler and safer but can become expensive as ledger rows grow; materialized balances are faster but require strict transactional updates and periodic rebuild jobs. A good answer would choose materialized balances for hot reads while treating the append-only ledger as canonical.
Close by saying: “If I had more time, I’d go deeper on multi-currency FX locking, payout failure handling, and reconciliation dashboards for finance operations.” That shows you understand the boundary between core SWE design and production-grade financial reliability.
A second angle
For Allocate refund across payments, the same concept becomes more algorithmic than architectural. Instead of designing the whole wallet, focus on representing prior payments and refunds, deriving remaining refundable balances, and applying a deterministic ordering rule. A clean implementation would normalize each payment into {payment_id, method_type, captured_amount, refunded_amount, created_at, priority} and produce allocation rows {payment_id, refund_amount}. The main design constraint is correctness under partial refunds: the sum of allocations must equal the requested refund unless funds are insufficient, and no payment can be over-refunded. The interviewer may then extend it into system design by asking how this allocation is persisted, retried, and reconciled with processor responses.
Common pitfalls
Pitfall: Treating balances as mutable fields instead of ledger-derived facts.
A tempting design is a wallet_balance table that increments and decrements directly. That may pass a toy design, but it fails auditability and makes bug recovery difficult. A stronger answer says cached balances are allowed, but immutable ledger entries are canonical and can rebuild every balance.
Pitfall: Hand-waving refund ordering rules as “business logic.”
Refund prioritization is not just a product detail; it affects deterministic execution, testability, and financial correctness. Instead of saying “apply the company’s policy,” define a rule engine or ordered comparator: payment method priority, processor eligibility, recency, remaining balance, and stable tie-breaker.
Pitfall: Ignoring asynchronous processor behavior.
Many candidates assume refund() either succeeds or fails synchronously. In real payment systems, timeout does not imply failure, webhooks can be delayed, and processor state can diverge from internal state. Better answers introduce idempotent processor calls, pending states, reconciliation jobs, and exactly-once ledger posting at the application level using uniqueness constraints.
Connections
Interviewers may pivot from this topic into idempotent API design, distributed transactions and sagas, database isolation levels, event-driven reconciliation, or multi-currency pricing. They may also ask for a coding version of the same domain: implement refund allocation using sorting, grouping, and remaining-balance aggregation.
Further reading
-
Stripe API Idempotent Requests — practical reference for retry-safe financial APIs and request de-duplication.
-
Martin Fowler, Accounting Patterns — useful conceptual grounding for accounts, entries, and posting rules.
-
Designing Data-Intensive Applications — strong background on transactions, consistency, replication, and failure modes relevant to ledgers.
Practice questions

What's being tested
Airbnb-style systems need to stay correct when requests are duplicated, services fail mid-operation, users race for scarce inventory, and background jobs retry hours later. Interviewers are probing whether you can design fault-tolerant APIs, idempotent workflows, and concurrency control mechanisms without hand-waving “the database handles it.” For a Software Engineer, the key skill is choosing where correctness must be strict, where eventual consistency is acceptable, and how to make failure modes observable and recoverable. Expect follow-ups around double-booking, duplicate notifications, retry storms, partial writes, message ordering, and hot-key contention.
Core knowledge
-
Idempotency means the same logical request can be executed multiple times with the same externally visible result. Common API pattern: require an
Idempotency-Key, store(caller_id, key, request_hash, response, status), and return the cached response for safe duplicates. -
At-least-once delivery is the default assumption for queues, webhooks, and retrying clients. If a worker reads from
SQS,Kafka, or a task queue and crashes after side effects but before acking, the job may run again; consumers must deduplicate. -
Exactly-once semantics are usually built from weaker primitives: atomic database writes, unique constraints, idempotency records, and deterministic retry behavior. Avoid promising “exactly once” globally; instead say “effectively once for this side effect under this key.”
-
Optimistic concurrency control uses a version column, compare-and-swap, or conditional update:
UPDATE rows SET version = version + 1 WHERE id = ? AND version = ?. If zero rows update, reload and retry. This works well when conflicts are rare. -
Pessimistic locking uses row locks like
SELECT ... FOR UPDATEinPostgresor distributed leases inRedis/ZooKeeper. It is simpler for scarce inventory but can reduce throughput, create lock waits, and fail under hot-key traffic. -
Unique constraints are one of the strongest correctness tools. For a waitlist, a unique index on
(listing_id, date, user_id)prevents duplicate joins; for notifications,(event_id, user_id, channel)prevents duplicate sends from concurrent workers. -
Retry policy should classify errors: retry
5xx, timeouts,429, transient network resets; do not blindly retry400, validation failures, auth errors, or non-idempotent payment-like side effects. Use exponential backoff with jitter:delay = random(0, min(cap, base * 2^attempt)). -
Deadlines and cancellation prevent retry wrappers from exceeding user-visible latency budgets. Track both per-attempt timeout and total deadline: if
p95service latency is200msand caller budget is1s, three attempts with backoff may already be too expensive. -
Transactional outbox solves the “database commit succeeded but message publish failed” problem. Write business state and an
outbox_eventsrow in onePostgrestransaction; a relay publishes toKafka/queue and marks the row sent idempotently. -
Ordering guarantees are scoped, not global. A group chat may need per-conversation ordering using a monotonically increasing
message_seq; notifications may only need per-user de-dupe, while waitlists need FIFO per(listing_id, date). -
Hot partitions happen when many users target the same listing, chat, or notification campaign. Mitigations include sharding by composite key, queueing per entity, admission control, rate limits, caching reads, and moving expensive fan-out to async workers.
-
Observability must expose correctness and resilience signals: retry rate, duplicate suppression count, lock wait time, conflict retry count, queue lag,
DLQdepth, notification delivery status, idempotency cache hit rate, andp99latency under contention.
Worked example
For Design a hot-listing waitlist API, a strong first 30 seconds would clarify whether users can join multiple dates, whether ordering must be strict FIFO, whether a user can cancel, and what happens when inventory opens. They should declare an assumption like: “Correctness matters more than raw throughput for a single listing-date; I’ll design per (listing_id, date) ordering and prevent duplicate joins.” The answer can be organized around four pillars: API shape, data model, concurrency control, and async notification/retry handling. The API might expose POST /listings/{id}/dates/{date}/waitlist with an Idempotency-Key, returning the existing waitlist position if the same user retries. The data model should include a waitlist_entries table with unique (listing_id, date, user_id), a created_at or sequence for ordering, and status values like WAITING, OFFERED, EXPIRED, BOOKED, CANCELLED. For concurrency, use a database transaction and unique constraints for joins; when inventory opens, select the next eligible entries with FOR UPDATE SKIP LOCKED or a per-listing-date worker to avoid two users receiving the same offer. A tradeoff to flag explicitly: strict FIFO with database locks is simple and correct, but hot listings may need partitioned queues or single-flight processing per listing-date to reduce lock contention. Close by saying that with more time, you would cover expiration timers, notification retries, abuse/rate limits, and operational metrics like duplicate-join attempts and offer acceptance latency.
A second angle
For Design a multi-channel notification system, the same ideas apply, but scarce inventory is replaced by side-effect control across email, push, and SMS vendors. Here the main danger is duplicate or out-of-order sends: a retrying worker might send two SMS messages unless the system records (event_id, user_id, channel) before or atomically with dispatch. You would use an idempotent notification request API, persist a notification intent, fan out to channel-specific jobs, and make each sender idempotent against provider retries and internal retries. The consistency requirement is usually weaker than waitlist FIFO: eventual delivery is acceptable, but duplicate suppression, rate limiting, user preferences, and auditability are critical. A good tradeoff discussion compares synchronous sends for low latency against async queue-based sends for reliability and backpressure.
Common pitfalls
Pitfall: Treating retries as a universal fix.
A weak answer says “if the request fails, retry three times” without defining retryable errors, deadlines, jitter, or idempotency. A stronger answer separates transient failures from permanent failures, applies exponential backoff with jitter, and explains how duplicate attempts are made safe.
Pitfall: Assuming database transactions solve cross-system side effects.
A transaction can atomically update Postgres, but it cannot atomically send an email, push to a vendor, and commit local state unless you add a pattern like transactional outbox or idempotent send records. Say exactly where atomicity ends and how recovery jobs reconcile incomplete work.
Pitfall: Over-optimizing scalability before proving correctness.
For hot waitlists or chat ordering, jumping immediately to sharded distributed queues can obscure the core invariant. Start with the invariant: “one offer per inventory unit,” “one message sequence per conversation,” or “one notification per event-channel-user,” then scale the implementation while preserving it.
Connections
Interviewers may pivot from this topic into API design, database schema design, distributed queues, rate limiting, or real-time fan-out. They may also ask for failure-mode analysis: what happens if the worker crashes after sending, if the database commit succeeds but the queue publish fails, or if two clients submit the same action concurrently.
Further reading
-
Designing Data-Intensive Applications — Martin Kleppmann’s book is the best practical foundation for transactions, replication, ordering, and exactly-once tradeoffs.
-
Exponential Backoff and Jitter — AWS’s canonical explanation of why jitter prevents synchronized retry storms.
-
Stripe Idempotent Requests — clear production example of idempotency keys, cached responses, and duplicate request handling.
Practice questions
Behavioral & Leadership
What's being tested
Airbnb is probing whether a Software Engineer can connect engineering work to user outcomes while still owning technical quality: architecture, reliability, rollout safety, debugging, and cross-functional execution. Strong answers show you can navigate ambiguity, align stakeholders, make explicit tradeoffs, and deliver measurable impact without overclaiming product strategy ownership. Interviewers are listening for how you reason under constraints: ambiguous requirements, slipping ETAs, conflicting priorities, legacy systems, global teams, and consumer-facing risk. The best responses combine behavioral structure with technical depth: what you built, why the design was appropriate, what failed, how you communicated, and what changed for users or the business.
Core knowledge
-
Use a structured story format, usually
STARorCARL: Situation, Task, Action, Result; or Context, Action, Result, Learning. For senior-leaning answers, emphasize decision-making: alternatives considered, who disagreed, what data changed your mind, and what you would do differently. -
Tie engineering impact to user-facing metrics, not vague “improved experience” claims. Good SWE metrics include
p95latency,p99latency, error rate, conversion completion rate, crash-free sessions, support ticket volume, rollback count, deploy frequency, and operational load. A simple impact framing is: -
Separate product intent from engineering responsibility. You can say, “The PM owned prioritization and success criteria; I translated those into API contracts, rollout plan, reliability targets, and technical tradeoffs.” This avoids sounding like you usurped PM work while still showing product sense.
-
Describe architecture at the right altitude: clients, edge/API layer, services, data stores, async jobs, caches, observability, and rollout mechanisms. For example:
iOS/Androidclient →GraphQLor REST API → service layer →Rediscache →MySQL/Postgres→ async worker queue. Then zoom into the one design choice that mattered. -
Make tradeoffs explicit using dimensions like latency, consistency, correctness, complexity, migration risk, privacy, cost, and developer velocity. A strong phrase is: “We chose eventual consistency because booking status could tolerate a 1–2 second delay, but payment authorization could not.”
-
Know common consumer-product engineering constraints: mobile backward compatibility, partial rollouts, localization, accessibility, network retries, idempotency, cache invalidation, and degraded modes. Airbnb-like products are global and transactional; failures may affect bookings, trust, safety, payments, host income, or guest travel plans.
-
Use safe-launch mechanics: feature flags, dark launches, canaries, staged rollout by percentage, kill switches, shadow reads, dual writes, and monitoring dashboards. For high-risk changes, mention rollback criteria such as “rollback if
5xxrate exceeds 0.5% for 10 minutes orp99latency regresses by 20%.” -
Handle cross-team delivery with contracts and risk registers. When multiple teams own services, align on API schemas, ownership boundaries, dependency dates, test environments, escalation paths, and fallback behavior. A lightweight written plan in
RFCform prevents “I thought your team owned that” failures. -
When ETAs slip, communicate early with options, not excuses. Give the new risk, cause, impact, mitigations, and choices: reduce scope, extend timeline, add staffing, accept risk, or sequence launch. Engineering credibility comes from surfacing uncertainty before it becomes a surprise.
-
For miscommunication, show a closed-loop process: detect mismatch, restate assumptions, write down decisions, confirm owners, and verify with a small artifact such as an API mock, sequence diagram, design doc, or demo. “I clarified” is weaker than “I changed the process so ambiguity did not recur.”
-
Balance leadership with humility. Strong behavioral answers include conflict, mistakes, and learning. Avoid making every stakeholder look unreasonable; instead say, “We optimized for different constraints: the design team prioritized consistency, while my concern was client performance on older devices.”
-
Prepare one flagship project deeply. You should be able to explain the problem, users affected, architecture, scale, failure modes, metrics, rollout, conflict, and lessons. If you cannot draw the system from memory or explain why an alternative was rejected, the story is not ready.
Worked example
For “Describe a flagship project’s architecture and tradeoffs”, start by framing the answer in the first 30 seconds: “I’ll describe a consumer-facing availability service I led, focusing on the user problem, architecture, two major tradeoffs, and measured impact.” Clarify assumptions briefly: scale, latency target, consistency requirements, and what you personally owned. Then organize the answer around four pillars: problem and success metrics, system architecture, tradeoff decisions, and rollout/impact.
A strong skeleton might say: the old flow made guests wait several seconds for accurate availability because the API synchronously queried multiple downstream systems; you redesigned it with a read-optimized availability service backed by Redis plus async invalidation from booking updates. The main tradeoff was freshness versus latency: caching reduced p95 latency from 900 ms to 180 ms, but created risk of stale availability, so you used short TTLs, explicit invalidation on booking events, and a final synchronous validation before checkout. You would mention alternatives, such as querying the source-of-truth every time or precomputing all availability, and explain why they failed due to latency or storage/update complexity.
For behavioral leadership, include the cross-functional piece: PM defined the user pain, data science helped measure funnel impact, mobile teams coordinated API adoption, and your role was the service design, migration plan, and production readiness. Close with outcomes: latency improved, checkout errors did not regress, rollout completed behind a feature flag, and the team created a reusable caching pattern. End with: “If I had more time, I’d add better chaos testing for invalidation failures and a dashboard segmenting stale-cache incidents by market and platform.”
A second angle
For “Address miscommunication and alignment”, the same core skill appears, but the emphasis shifts from architecture depth to ambiguity handling. Instead of leading with system diagrams, lead with the mismatch: for example, design expected a real-time update, backend assumed eventual consistency, and mobile built around a third interpretation. A strong answer shows how you detected the issue early through a prototype, converted verbal assumptions into an RFC, and aligned on an explicit contract: freshness target, loading state, error behavior, and owner for each dependency. The tradeoff still matters, but it should be framed as an alignment tool: “Once we agreed that 5-second freshness was acceptable, we avoided a complex websocket design and shipped a simpler polling approach.” Close by showing a durable process improvement, such as adding API contract reviews before sprint commitment.
Common pitfalls
Pitfall: Giving a PM-style answer with no engineering substance.
A tempting answer is, “I identified a customer pain point, aligned stakeholders, and improved conversion.” That is not enough for a Software Engineer interview. Add the technical mechanism: API design, data model, latency bottleneck, rollback plan, observability, migration, and the tradeoff you personally decided.
Pitfall: Describing conflict as a personality problem instead of a systems problem.
Saying “the other team didn’t communicate” sounds immature and low-agency. A stronger version is: “We lacked a shared source of truth for the API contract, so I wrote a one-page design doc, added examples, and scheduled a 15-minute async review window for teams in different time zones.”
Pitfall: Reporting impact without credible measurement or guardrails.
Claims like “users loved it” or “performance was much better” are weak unless paired with numbers and failure checks. Say what moved, what did not regress, and how you knew: p95 latency, conversion, 5xx rate, crash-free sessions, support tickets, rollout percentage, or on-call pages.
Connections
Interviewers may pivot from this area into system design, especially if your flagship project involves caching, consistency, APIs, migrations, or reliability. They may also probe debugging and incident response, engineering execution, or values-based leadership, asking how you handled disagreement, incomplete data, or a launch risk that affected guests and hosts.
Further reading
-
Designing Data-Intensive Applications — Deepens vocabulary for consistency, caching, replication, and tradeoffs you may reference in flagship architecture stories.
-
The Staff Engineer’s Path — Useful for framing technical leadership, influence without authority, and cross-team execution.
-
Accelerate — Gives concrete language around delivery performance, reliability, deployment frequency, and engineering effectiveness.
Practice questions
Take-home Project
Coding & Algorithms

What's being tested
Grid game engines test clean state modeling, turn validation, and efficient winner detection after each move. The core pattern is checking only lines crossing the last placed token, not rescanning the whole board.
Patterns & templates
-
Board representation — use
grid[rows][cols]plusheights[col];drop(col, player)becomesO(1)placement with bounds checks. -
Last-move win check — after
(r, c), scan four direction pairs: horizontal, vertical, diagonal down, diagonal up. -
Connect-N counting —
count(dir) + count(opposite) + 1 >= n; each move costsO(n)orO(rows + cols)worst case. -
API design — expose
move(col),getWinner(),isDraw(),currentPlayer; reject moves after win or into full columns. -
Generalization template — parameterize
rows,cols, andtarget; validate impossible configs liketarget > max(rows, cols)carefully. -
Testing strategy — cover vertical, horizontal, both diagonals, full-column rejection, draw state, invalid player turns, and win-on-last-move cases.
Common pitfalls
Pitfall: Rescanning the entire board after every move is simple but often misses the intended optimization; anchor detection on the last token.
Pitfall: Diagonal logic is error-prone; define direction vectors once, such as
(1,1)and(1,-1), then reuse symmetric counting.
Pitfall: Forgetting terminal-state behavior leads to bugs where players keep moving after a win or overwrite draw/winner state.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions