Google Software Engineer Interview Prep Guide
Everything Google actually asks Software Engineer candidates — concept walkthroughs, worked examples, and the real interview questions, drawn from candidate reports. Free to read.
Last updated

Technical Screen
Coding & Algorithms
-
String And Sliding Window Algorithms — covered in depth under Take-home Project below.
-
Hash Map Counting And Frequency Analysis — covered in depth under Take-home Project below.
-
Top-K Selection And Order Statistics — covered in depth under Onsite below.
-
Intervals, Line Sweep, And Range Updates — covered in depth under Onsite below.
-
Graph Traversal And Shortest Paths — covered in depth under Onsite below.
-
Trees And Hierarchical Structures — covered in depth under Onsite below.
-
Dynamic Programming, Backtracking, And Combinatorial Search — covered in depth under Take-home Project below.
-
Mutable Data Structure Design — covered in depth under Take-home Project below.
Software Engineering Fundamentals
- Streaming, Large Inputs, And External Memory — covered in depth under Onsite below.
System Design
-
Distributed Systems Consistency, Reliability, And Observability — covered in depth under Onsite below.
-
Storage, Indexing, APIs, And Secure Execution — covered in depth under Onsite below.
Behavioral & Leadership
- Behavioral Leadership, Collaboration, And Ambiguity — covered in depth under Take-home Project below.
Onsite
Coding & Algorithms
-
String And Sliding Window Algorithms — covered in depth under Take-home Project below.
-
Hash Map Counting And Frequency Analysis — covered in depth under Take-home Project below.

What's being tested
These problems test order statistics and top-K selection: finding the smallest, largest, median, or highest-ranked items without fully sorting everything. Interviewers expect you to choose between heap, quickselect, two-pointer, Trie augmentation, or streaming median based on input size, update pattern, and ordering rules.
Patterns & templates
-
Min-heap frontier expansion for sorted combinations — push
(sum, i, j), pop K times; typicalO(k log k)with visited-pair deduping. -
Fixed-size max/min heap for top-K — maintain K best items in
O(n log k)time; define comparator carefully for ties. -
Quickselect for unordered arrays — average
O(n)time, worstO(n^2)unless randomized; returns partitioned top-K, not sorted top-K. -
Two-heaps streaming median — max-heap lower half, min-heap upper half; rebalance sizes so median query is
O(1), insertO(log n). -
Augmented Trie top-K — store per-node top suggestions or frequency maps; prefix lookup is
O(len(prefix)), but updates may costO(len(word) * log k). -
Multi-key ranking comparator — order by frequency, distance, timestamp, lexicographic key; make comparator transitive and match required ascending/descending semantics.
-
Approximate quantiles under memory limits — use bucket histograms, reservoir sampling, or sketches when exact median storage is impossible.
Common pitfalls
Pitfall: Fully sorting
nitems for every query givesO(n log n)whenO(n log k),O(k log n), or cached metadata is expected.
Pitfall: Ignoring duplicate states in k-smallest-pairs can push
(i, j)through multiple paths and inflate runtime or output duplicates.
Pitfall: Treating median as a batch problem when the interviewer asks for streaming updates misses the required online data-structure invariant.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
These problems test interval reasoning: detecting overlap, merging ranges, assigning resources, and applying many updates without touching every element each time. Interviewers are probing whether you can convert ranges into events, sort boundaries correctly, and choose between heap, line sweep, difference array, or reverse processing based on constraints.
Patterns & templates
-
Merge intervals — sort by start, maintain current
[lo, hi]; merge whennext.start <= hi, otherwise emit;O(n log n)time. -
Meeting rooms with min-heap — sort by start, pop rooms whose
end <= start, push current end; heap size is answer;O(n log n). -
Line sweep events — convert intervals to
(time, delta)events, sort with correct tie-breaking; prefix sum gives active count or resource demand. -
Skyline sweep — process building start/end events, track active heights using
max-heapplus lazy deletion orTreeMap; emit only height changes. -
Difference array / range add — for update
[l, r] += x, dodiff[l] += x,diff[r+1] -= x, then prefix;O(n + q). -
Range overwrite queries — process updates in reverse with
DSU next-unassignedor interval skipping so each index is assigned once; nearO((n+q) α(n)). -
Boundary convention — decide early whether intervals are closed
[s,e]or half-open[s,e); meeting rooms usually allow reuse whenend <= start.
Common pitfalls
Pitfall: Sorting starts before ends at the same coordinate can overcount overlaps for half-open intervals like
[1,3)and[3,5).
Pitfall: For skyline, emitting every event creates duplicate points; only append when the current maximum height actually changes.
Pitfall: Applying each range update directly is
O(nq)and will time out; look for prefix sums, lazy structures, or reverse assignment.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
These problems test graph modeling, BFS/DFS traversal, shortest path under constraints, and connected-component reasoning. Interviewers are checking whether you can convert messy relationships—blocked stops, reporting chains, shared digits, word paths—into nodes, edges, state, and traversal rules.
Patterns & templates
-
BFS for unweighted shortest path — use
queue,visited, and distance levels;O(V + E)time,O(V)space. -
Dijkstra for weighted or penalty paths — use
heapqwith(cost, node); needed when dangerous nodes add non-uniform costs. -
Lexicographic shortest path state — optimize
(danger_count, steps)instead of one scalar; compare tuples directly in Python. -
DFS/backtracking for path existence — recursively match characters or constraints; mark/unmark
visitedto avoid cycles without blocking other branches. -
Connected components — build adjacency from shared attributes, then BFS/DFS each unvisited node; useful for grouping two-digit numbers.
-
Transitive relationship queries — model manager chains as a directed graph or parent map; use ancestor traversal, caching, or union-like grouping for peers.
-
Grid/graph neighbor generation — centralize
neighbors(node)logic; filter blocked nodes before enqueueing to avoid invalid states.
Common pitfalls
Pitfall: Treating weighted penalties like ordinary BFS; if costs differ, BFS no longer guarantees the optimal route.
Pitfall: Using one global
visitedset in backtracking; paths need branch-local state with unmarking on return.
Pitfall: Forgetting disconnected components, duplicate inputs, self-edges, cycles, or “start/end is blocked” edge cases.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
Tree traversal and hierarchical invariant reasoning: you need to move confidently between recursive DFS, iterative BFS, parent-pointer arrays, and search-tree variants. Interviewers are probing whether you can derive the traversal order, maintain per-level/per-node state, prove correctness, and give tight O(n) time / space bounds.
Patterns & templates
-
Level-order BFS with
queue— processlevel_sizenodes per depth; supports right-side view, zigzag traversal, and shortest-by-depth reasoning inO(n)time. -
DFS by depth using
dfs(node, depth)— record first or last node seen per depth; preorder right-first solves right-side view cleanly. -
Alternating level output — for zigzag, append normally then reverse, or use
deque.appendleft; both areO(n), but avoid repeated front inserts into arrays. -
Search-tree traversal — exploit
BST/trinary ordering when useful, but mode finding still needs frequency tracking; handle duplicate keys and "middle/equal" child conventions explicitly. -
Parent-array validation — a valid tree has exactly one root, no cycles, all nodes connected, and exactly
n - 1parent edges forn > 0. -
Cycle detection — use DFS colors (
WHITE/GRAY/BLACK) or Union-Find; parent-pointer graphs often fail via self-parent, multi-root, or disconnected components. -
Complexity discipline — most solutions should be
O(n)time; auxiliary space isO(w)for BFS width,O(h)recursion depth, orO(n)for visited/state arrays.
Common pitfalls
-
Pitfall: Treating level-order traversal as “visit until queue empty” without freezing
level_size, which mixes depths and breaks right-side or zigzag output. -
Pitfall: Assuming “one root” is enough for a parent array; you must also prove no cycles and full connectivity.
-
Pitfall: Ignoring recursion depth on skewed trees; mention iterative traversal or stack limits when
h ≈ n.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
-
Dynamic Programming, Backtracking, And Combinatorial Search — covered in depth under Take-home Project below.
-
Mutable Data Structure Design — covered in depth under Take-home Project below.
Software Engineering Fundamentals
What's being tested
Interviewers are probing whether you can design memory-bounded algorithms when the input is too large to fit in RAM, a common reality in search, logs, indexing, storage systems, and distributed services. Strong answers distinguish between streaming, batch chunking, external-memory, and distributed approaches instead of defaulting to “load everything into a hash map.” Google cares because many SWE problems are constrained by I/O, latency, fault tolerance, and correctness under scale, not just asymptotic CPU time. The interviewer is looking for clear assumptions, correct complexity analysis, and the ability to trade accuracy, memory, disk, and parallelism deliberately.
Core knowledge
-
Streaming algorithms process each record once or a small number of times, usually with , , or bounded memory. They are ideal for max/min, counts, checksums, online averages, reservoir sampling, and approximate frequency or quantile summaries.
-
Chunked processing is the default upgrade when input exceeds memory but can be scanned sequentially. Read fixed-size buffers, parse records across chunk boundaries carefully, compute partial results, then merge. For max value, each worker can compute a local max and reduce to a global max.
-
External sorting handles problems that need global ordering but exceed RAM. Sort chunks in memory, spill sorted runs to disk, then perform a k-way merge using a min-heap of size . Runtime is dominated by sequential disk I/O, not CPU comparison cost.
-
Hash partitioning is a powerful pattern for set intersection across huge files. Partition both files by
hash(object_id) % P, write corresponding partitions to disk, then process each pair independently in memory. Correctness depends on using the same deterministic hash and preserving full records or IDs. -
Bloom filters provide approximate set membership with false positives but no false negatives. For bits, inserted items, and hash functions, false-positive probability is approximately . They are useful to prefilter candidates before exact verification.
-
Two-heaps median maintains an exact online median when all values fit conceptually in memory: max-heap for lower half, min-heap for upper half, rebalance so sizes differ by at most one. Updates cost and space is , which fails for unbounded streams.
-
Approximate quantile summaries solve memory-constrained median at scale. Algorithms like Greenwald-Khanna, t-digest, and KLL sketches trade accuracy for bounded memory. A typical guarantee is rank error: returned value has rank within of the true median.
-
Reservoir sampling selects a uniform sample of size from an unknown-length stream using memory. The th item replaces a random existing sample with probability . It can estimate medians but gives probabilistic error rather than deterministic rank bounds.
-
I/O complexity matters for huge files. A theoretically algorithm with random disk seeks can be slower than an algorithm using sequential reads. Prefer sequential scans, buffered reads, compression-aware parsing, and minimizing intermediate materialization.
-
Parallel reduction works when the operation is associative and commutative, such as max, min, sum, count, bitwise OR, or set union. Median is not directly reducible from local medians; you need mergeable summaries, histograms, selection algorithms, or distributed quantile sketches.
-
Record boundary handling is a frequent edge case. Huge text files may split a number, JSON object, or log line across buffers. A robust solution keeps a carryover buffer, validates malformed records, defines overflow behavior, and handles empty files explicitly.
-
Exact versus approximate must be stated early. Exact shared-object detection may require disk partitioning or sorting; approximate detection can use Bloom filters. Exact median may require multiple passes or external storage; approximate median can be one-pass with bounded memory.
Worked example
For Find maximum value in huge text file, a strong candidate first clarifies the format: one integer per line or arbitrary delimiters, signed values, possible malformed records, file size, local disk versus distributed storage, and whether the answer must be exact. They would then declare the key assumption: the file is far larger than memory, but a sequential scan is possible. The answer skeleton is: stream the file in large buffers, parse numbers safely across chunk boundaries, maintain a running maximum, and optionally parallelize by splitting the file into byte ranges aligned to record boundaries. If using multiple workers, each worker computes a local maximum, and a final reducer computes the global maximum, which is valid because max is associative and commutative. Complexity is time over bytes or records, memory per worker excluding the input buffer, and one sequential pass over the file. A good tradeoff to flag is buffer size: larger buffers reduce syscall overhead but increase memory footprint and may complicate latency or worker scheduling. They should also mention edge cases: empty file, all negative numbers, integer overflow, partial final line, corrupted input, and encoding. A strong close would be: “If I had more time, I’d add checksums or byte-offset logging for restartability, benchmark buffer sizes, and compare single-machine scan versus distributed scan based on storage bandwidth.”
A second angle
For Find shared objects across two log files, the same large-input principles apply, but a single running variable is not enough because the problem is about set membership. If one file fits in memory, build a hash set of its object IDs and stream the other file, outputting matches in time. If neither fits, use hash partitioning: split both files into the same partitions by object ID, then load and intersect one partition pair at a time. If duplicate IDs matter, clarify whether the output is unique shared objects or multiplicities; that changes the data structure from a set to counts. If approximation is allowed, build a Bloom filter from one file and stream the other, then optionally verify candidates to remove false positives.
Common pitfalls
Pitfall: Treating “huge input” as only a Big-O problem.
Saying “use a hash map, it’s ” misses the central constraint: memory. A better answer explicitly calculates whether the structure fits, then proposes streaming, partitioning, external sorting, or approximate sketches when it does not.
Pitfall: Ignoring exactness requirements.
For median or shared IDs, approximate methods like sampling, Bloom filters, and sketches may be excellent, but only if false positives or rank error are acceptable. A strong candidate states the guarantee: exact output, no false negatives, bounded rank error, or probabilistic confidence.
Pitfall: Skipping parsing and boundary correctness.
Many candidates describe the high-level algorithm but forget that chunks can split records. Interviewers often push on this because production failures come from partial lines, malformed records, overflow, duplicate handling, and inconsistent delimiters.
Connections
This topic often pivots into distributed systems, especially map-reduce style aggregation, sharding, retries, and idempotent partial results. It also connects to probabilistic data structures such as Bloom filters, Count-Min Sketch, HyperLogLog, and quantile sketches, plus classic systems performance topics like buffering, disk seeks, compression, and backpressure.
Further reading
-
Jeffrey Dean and Sanjay Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters” — canonical paper for chunking, local computation, shuffle, and reduce patterns.
-
Greenwald and Khanna, “Space-Efficient Online Computation of Quantile Summaries” — foundational exact-style rank-error approach for streaming quantiles.
-
Martin Kleppmann, Designing Data-Intensive Applications — practical treatment of storage, batching, streams, indexes, and large-scale data processing tradeoffs.
Practice questions
System Design
What's being tested
Interviewers are probing whether you can design distributed systems that remain correct and debuggable under concurrency, partial failure, retries, replication lag, and uneven load. For Google SWE interviews, the bar is not “draw boxes,” but reasoning clearly about consistency guarantees, failure semantics, operational reliability, and observability tradeoffs at large scale. Strong answers name the guarantee they provide, the guarantee they intentionally do not provide, and how they would detect when the system violates expectations. Expect follow-ups around hot keys, duplicate delivery, clock skew, regional outages, `p99` latency, and what happens when dependencies degrade.
Core knowledge
-
Consistency models define what clients may observe after reads and writes. Know the difference between linearizability, sequential consistency, read-your-writes, eventual consistency, and monotonic reads. In design interviews, explicitly choose the weakest model that satisfies product correctness.
-
CAP theorem is useful only if made concrete: during a network partition, a replicated service must choose between availability and strong consistency. For example, a distributed rate limiter may prefer availability with bounded over-admission, while account balance updates usually require consistency.
-
Quorums are the standard replication tradeoff: with replicas, write quorum , and read quorum , overlapping reads/writes require . For
`N=3`,`W=2`,`R=2`tolerates one replica failure while preserving quorum reads. -
Consensus algorithms like Raft and Paxos provide a replicated log for leader election, configuration changes, and strongly consistent metadata. Use them for small, critical control-plane state, not for every high-throughput data-plane operation.
-
Idempotency is mandatory for retries. A client-supplied
`Idempotency-Key`, dedupe table, or operation ID lets the server convert “maybe succeeded” into safe retry behavior. Store idempotency records with request hash, response, status, and TTL. -
At-least-once delivery means duplicates are expected; exactly-once semantics usually means “effectively once” through idempotent consumers plus atomic state transitions. A queue like
`Kafka`or`Pub/Sub`can redeliver after consumer crashes, offset races, or ack timeouts. -
Rate limiting algorithms have different consistency and burst properties. Token bucket allows bursts up to bucket size; leaky bucket smooths traffic; sliding window log is accurate but memory-heavy; sliding window counter approximates with lower memory. Distributed limiters often accept bounded error.
-
Partitioning controls scalability and hot spots. Consistent hashing reduces remapping when nodes change, but does not eliminate hot keys; use virtual nodes, key salting, local aggregation, or special-case sharding for tenants with extreme QPS.
-
Replication strategy shapes availability and latency. Leader-follower replication simplifies writes but can bottleneck and create stale reads; multi-leader improves regional writes but introduces conflicts; leaderless replication uses quorums and repair but complicates conflict resolution.
-
Failure handling should be explicit: timeouts, retries with exponential backoff and jitter, circuit breakers, dead-letter queues, and graceful degradation. Avoid retry storms by bounding retry budgets and propagating backpressure instead of letting every layer independently retry.
-
Observability requires three complementary signals: metrics for aggregate health, logs for discrete events, and distributed traces for request paths. Useful metrics include
`availability`,`error_rate`,`p50/p95/p99_latency`, queue depth, retry rate, dedupe hit rate, and replica lag. -
SLOs turn reliability into engineering decisions. If a service targets
`99.9%`monthly availability, the error budget is about 43.2 minutes/month. Tie design choices to SLOs: synchronous quorum writes may protect correctness but consume latency budget.
Worked example
For Design at-least-once notification delivery, a strong candidate starts by clarifying notification types, scale, ordering needs, acceptable duplicate rate, retry horizon, and whether “delivered” means accepted by an external provider or actually seen by a user. They might declare assumptions: `100M` notifications/day, multi-channel delivery, user-level idempotency required, and no strict global ordering. The answer should be organized around four pillars: ingestion API, durable queueing, delivery workers, and idempotent state tracking.
The ingestion API accepts a `notification_id` or generates one, validates tenant/user/channel, persists a send request, and enqueues work only after durable storage succeeds. Workers consume from `Kafka`, `Pub/Sub`, or a sharded internal queue, call providers with timeouts, and ack only after recording the outcome. Retries use exponential backoff with jitter and move poison messages to a dead-letter queue after a bounded policy such as 24 hours or 10 attempts. The key design decision is that the platform guarantees at-least-once processing, while downstream side effects are made effectively once using provider idempotency keys, a dedupe table keyed by `(tenant_id, notification_id, channel)`, and atomic status transitions.
A strong candidate explicitly calls out that ordering is expensive: per-user FIFO partitions reduce concurrency and can be broken by retries, so they would only guarantee ordering where the product requires it. They would close by adding observability: dashboards for send latency, provider error rate, retry backlog, dead-letter volume, duplicate suppression rate, and traces from API request to provider call. If they had more time, they would discuss regional failover, tenant isolation, abuse controls, and replay tooling.
A second angle
For Design a distributed rate limiter, the same consistency and reliability ideas apply, but the correctness envelope is different. Instead of ensuring every message is eventually processed, the system must make fast admission decisions under concurrent requests from many frontend servers. A strongly consistent centralized counter gives accurate limits but may add latency and become a hot spot; local token buckets or sharded counters are faster but can over-admit during synchronization windows. A good answer quantifies the tolerated error, for example allowing up to 1–2% overshoot for availability, while using `Redis`, in-memory cells, or regional aggregators depending on latency and failure requirements.
Common pitfalls
Pitfall: Treating consistency as binary.
A tempting answer is “use strong consistency everywhere” or “eventual consistency is fine.” Better answers tie guarantees to specific operations: rate-limit checks may tolerate bounded staleness, idempotency records need strong uniqueness, and key-value metadata may need quorum reads while analytics counters can be eventually consistent.
Pitfall: Drawing components without failure semantics.
Many candidates sketch `API -> queue -> workers -> database` and stop there. Interviewers want to hear what happens when the API times out after a successful write, when the queue redelivers, when a worker crashes after calling an external provider, or when a replica is stale.
Pitfall: Mentioning observability as an afterthought.
“Add logs and monitoring” is too shallow. Name concrete signals and alerts: `p99` admission latency for a rate limiter, replication lag for a key-value store, dead-letter queue growth for notifications, remediation success rate for monitoring automation, and trace sampling for slow cross-service calls.
Connections
Interviewers may pivot into storage engine internals such as LSM trees versus B-trees, load balancing and hot-key mitigation, schema/API design for idempotent operations, or incident response using SLOs and error budgets. They may also ask for a deeper comparison of `Redis`, `Spanner`, `Bigtable`, `DynamoDB`, `Kafka`, or `Pub/Sub` depending on whether the design is latency-critical, storage-heavy, or queue-centric.
Further reading
-
Dynamo: Amazon’s Highly Available Key-value Store — foundational paper on consistent hashing, quorums, vector clocks, and availability-first storage.
-
The Raft Consensus Algorithm — approachable consensus paper for replicated logs, leader election, and strong metadata consistency.
-
Google SRE Book — practical treatment of SLOs, error budgets, monitoring, alerting, and reliable production operations.
Practice questions
What's being tested
This cluster tests whether you can design stateful backend systems where storage layout, indexing, API semantics, authorization, replication, and execution safety all interact. Google cares because many production services are not just CRUD APIs: they require global consistency tradeoffs, high-throughput storage, secure multi-tenancy, and clear failure behavior under partial outages. The interviewer is probing for your ability to turn requirements into data models, choose the right consistency guarantees, expose APIs that survive retries and concurrency, and defend the system against abuse or corruption. Strong answers make tradeoffs explicit instead of assuming one database, one cache, or one queue solves everything.
Core knowledge
-
Storage model selection should follow access patterns, not preference. Use
`Postgres`/`Spanner`for transactional metadata, object storage such as`GCS`/`S3`for blobs, log-structured storage for append-heavy workloads, and search indexes such as`Elasticsearch`/`Lucene`only when query shape justifies index maintenance cost. -
Content-addressable storage stores data by hash, for example
`sha256(content) -> blob`, enabling deduplication across files or tenants. Collision risk is negligible with cryptographic hashes, but authorization must not be based on “knowing the hash”; metadata ACLs still gate access to each logical file. -
Reference counting for deduplicated blobs works when increments/decrements are transactional with file metadata. The hard case is garbage collection: avoid deleting blobs immediately after count reaches zero unless you handle races with in-flight writers, usually via mark-and-sweep, tombstones, or delayed deletion windows.
-
Append-only logs are typically partitioned by key or stream, with offsets assigned monotonically per partition. Throughput scales roughly with number of partitions: but ordering is only guaranteed within one partition, not globally.
-
Replication choices define durability and latency. Leader-follower replication with quorum writes can require for strong reads in quorum systems, while asynchronous replication improves write latency but risks data loss or stale reads during failover.
-
Indexing is a write/read tradeoff. A primary key index supports point lookups, secondary indexes support alternate access paths, inverted indexes support text search, and sparse/time indexes reduce cost for logs. Every index increases write amplification and backfill complexity.
-
API idempotency is mandatory for unreliable networks. For mutating endpoints like
`POST /submissions`or`POST /notes/{id}/edits`, accept an`Idempotency-Key`and store request fingerprints so client retries do not create duplicate executions, files, or log segments. -
Concurrency control can be pessimistic locks, optimistic version checks, or operation-based merges. Collaborative editing usually needs CRDTs or operational transformation when multiple clients edit offline; simpler
`version`/`etag`compare-and-swap is acceptable for low-contention metadata. -
Consistency semantics must be stated per operation. A restaurant menu update may need read-your-writes for admins but eventual consistency to devices; a coding judge submission needs durable enqueue before returning success; a collaborative note may need low-latency local echo with later convergence.
-
Secure execution for untrusted code requires sandboxing, not just process separation. Use containers,
`seccomp`, namespaces, cgroups, read-only filesystems, network isolation, CPU/memory/time limits, and per-run scratch directories; assume submitted code is malicious, fork-bombs, exfiltrates, or exploits syscalls. -
Multi-tenant isolation spans data, compute, and quotas. Store tenant ownership in metadata rows, enforce authorization at every API boundary, rate-limit expensive operations, and avoid cross-tenant dedupe leaks such as revealing that another tenant already uploaded a file.
-
Observability and recovery need first-class design. Track
`p50`/`p99`latency, error rate, queue depth, replication lag, judge sandbox failures, index freshness, and garbage-collection backlog. Design replay paths from durable logs and use checksums to detect storage corruption.
Worked example
For Design an Online Coding Judge Platform, a strong candidate starts by clarifying the core flow: users submit code for a problem, the system compiles/runs it against test cases, returns verdicts, and stores submission history. In the first 30 seconds, ask about expected submissions per second, languages supported, maximum runtime, whether test cases are public/private, and whether verdict latency or throughput is more important. Declare assumptions such as “I’ll optimize for bursty contest traffic, untrusted code, and per-submission results within seconds.”
The answer skeleton should have four pillars: an API layer, durable submission storage, asynchronous execution workers, and sandboxed judging infrastructure. The API might expose `POST /submissions`, `GET /submissions/{id}`, and `GET /problems/{id}`, with idempotency keys on submission creation. Metadata can live in `Spanner` or `Postgres`, large test artifacts in object storage, and a queue such as `Pub/Sub` or `Kafka` can decouple request traffic from judge capacity.
The most important design decision to flag is that execution workers must not trust user code. Each run should execute inside an isolated sandbox with `cgroups` limits, `seccomp` syscall filtering, disabled outbound network, bounded filesystem, and strict wall-clock timeout. Another explicit tradeoff: pre-warming language runtimes improves latency but increases worker cost and isolation complexity. Close by saying that, with more time, you would cover contest-scale fairness, plagiarism detection hooks, regional capacity, and detailed replay/debug tooling for disputed verdicts.
A second angle
For Design deduplicated file storage on filesystem, the same storage and API thinking applies, but the center of gravity shifts from compute isolation to blob identity, metadata correctness, and lifecycle management. Instead of sandboxing untrusted programs, the risk is unauthorized data inference, cross-tenant leakage, and corrupted reference counts. A good design separates logical file metadata from physical chunks: `file_id -> owner, path, chunk_hashes` and `chunk_hash -> location, size, ref_count`. The API should make upload retries safe, perhaps by using resumable upload sessions and chunk-level checksums. The key tradeoff is dedupe granularity: whole-file dedupe is simpler and cheaper, while chunk-level dedupe saves more storage but increases metadata size, random reads, and garbage-collection complexity.
Common pitfalls
Pitfall: Treating storage as “just use a database.”
A tempting weak answer is to put notes, logs, files, menus, and submissions all in one relational table model. What lands better is mapping each object to its dominant access pattern: append logs need segment files and offset indexes, files need blob storage plus metadata, menus need versioned configs and rollout windows, and submissions need durable state transitions.
Pitfall: Hand-waving consistency instead of naming guarantees.
Saying “eventual consistency is fine” or “we need strong consistency” globally is usually too vague. Specify where guarantees matter: exactly-once user-visible submission creation via idempotency, at-least-once worker processing with idempotent state transitions, read-your-writes for admin menu edits, and convergence for collaborative note replicas.
Pitfall: Mentioning security only at the end.
For a coding judge or multi-tenant storage service, security is not an add-on. A better answer introduces threat models early: malicious code execution, tenant boundary violations, hash oracle attacks, quota abuse, and unauthorized reads through caches or indexes.
Connections
Interviewers may pivot from here into distributed consensus, cache invalidation, schema evolution, rate limiting, or disaster recovery. They may also ask for deeper treatment of CRDTs for collaborative editing, LSM trees and compaction for log storage, or capability-based authorization for secure APIs.
Further reading
-
The Google File System — classic paper on chunked distributed storage, replication, and operational tradeoffs.
-
Bigtable: A Distributed Storage System for Structured Data — useful for understanding tablets, sparse indexes, and large-scale storage design.
-
Operating Systems: Three Easy Pieces — strong foundation for processes, isolation, filesystems, and resource management relevant to secure execution.
Practice questions
Behavioral & Leadership
- Behavioral Leadership, Collaboration, And Ambiguity — covered in depth under Take-home Project below.
Take-home Project
Coding & Algorithms

What's being tested
These problems test string scanning, substring enumeration, and sliding-window invariants under character-count, dictionary, or movement constraints. Interviewers look for whether you can reduce brute-force substring checks to linear or near-linear algorithms using counts, hashes, tries, or two pointers.
Patterns & templates
-
Sliding window with frequency counts — maintain
need,have, andformed; expand right, contract left; typicalO(n)time,O(Σ)space. -
Anagram detection via multisets — compare character counts instead of sorting each substring;
O(n * alphabet)or optimizedO(n)rolling deltas. -
Substring membership checks — use
HashSet<String>for direct lookup; for many prefixes, use a Trie or dynamic programming like word-break. -
Binary search on answer length — for longest duplicate substring, check existence with rolling hash or suffix-array-style logic; target
O(n log n). -
Movement/token invariants — scan both strings while skipping blanks; verify token order, wall positions, and directional constraints in
O(n)time. -
Multiplicity-aware coverage — minimum covering substring requires exact counts, not just presence; decrement carefully when shrinking the window.
Common pitfalls
Pitfall: Sorting every candidate substring for anagram checks often turns an intended
O(n)orO(nk)solution intoO(nk log k).
Pitfall: Treating dictionary-word checks as greedy prefix matching fails; use DP or backtracking with memoization when segmentation is ambiguous.
Pitfall: Rolling hash solutions need collision discussion; mention double hashing or final substring verification.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
These problems test frequency analysis: converting raw inputs into counts, then using those counts to answer grouping, mode, subset, and top-K queries efficiently. Interviewers are probing whether you can choose the right hash map, avoid double-counting, and reason about ordering/tie-breaking under realistic constraints.
Patterns & templates
-
Hash map counting with
`dict`/`HashMap`— buildvalue -> countinO(n)time; initialize with`defaultdict(int)`or`getOrDefault`. -
Feature extraction before counting — map each item to digits, letters, coordinates, or keys; count features, not necessarily original values.
-
Set deduplication per item — for digit-sharing subsets, count each digit once per number;
112contributes once to digit1, not twice. -
Mode tracking during traversal — update
maxFreqwhile visiting tree nodes; avoid a second pass unless simpler and memory allows. -
Top-K selection using heap or bucket sort —
O(n log k)with heap; bucket works when frequencies are bounded byn. -
Composite ordering — implement comparator carefully for
(frequency desc, distance asc, id asc); tie-breaking bugs are common in top-K queries. -
Grid/component counting — use
visitedplus DFS/BFS; frequency maps may combine with traversal for island sizes or target-word multiset counts.
Common pitfalls
Pitfall: Counting repeated features within the same element, e.g. treating digit
7twice in707when the subset condition only needs membership.
Pitfall: Sorting the entire frequency table for top-K when
kis small; a size-`k`heap is usually cleaner and faster.
Pitfall: Ignoring ties for mode or top-K; Google interviewers often expect deterministic output or an explicit tie policy.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
![]()
What's being tested
These problems test state-space modeling, dynamic programming, and backtracking over constrained combinatorial choices. Interviewers look for whether you can convert messy rules into compact states, prune impossible branches, prove optimal substructure, and implement without exponential blowups where avoidable.
Patterns & templates
-
Net-state reduction — collapse inputs into balances, counts, or masks before search; smaller canonical state often changes brute force from impossible to tractable.
-
Backtracking with pruning via
dfs(i, state)— try valid choices, undo mutations, skip duplicates; worst-case exponential, but acceptable for smalln. -
Memoization with
@lru_cacheorMap<State, Ans>— cache tuple/count-vector/bitmask states; watch mutable arrays as cache keys. -
Bitmask DP for subsets — use
dp[mask]or recursivesolve(mask); typical complexityO(n * 2^n)orO(3^n)depending transitions. -
Multiset partitioning — represent tiles/items as frequency counts; recursively remove groups, restore counts, and terminate when all counts are zero.
-
Minimax / candidate filtering — for feedback-driven guessing, maintain feasible candidates and choose guesses minimizing worst-case remaining set.
-
Kadane-style DP on arrays — track best subarray ending here, prefix minima/maxima, or boundary-conditioned states in
O(n)time.
Common pitfalls
Pitfall: Starting with raw permutations instead of compressed counts or balances creates factorial search and usually times out.
Pitfall: Forgetting to restore mutated state after a recursive call causes silent corruption across branches.
Pitfall: Caching by non-canonical state, such as unsorted balances, misses equivalent subproblems and destroys memoization effectiveness.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions

What's being tested
Mutable data structure design tests whether you can maintain changing state with clear invariants, efficient updates, and correct tie-breaking. Expect queues, heaps, linked lists, hash maps, ordered sets, and capacity accounting under operations like insert, delete, move, expire, and evict.
Patterns & templates
-
Hash map + doubly linked list for
O(1)insert/delete/move; use sentinels to simplify edge cases around head/tail updates. -
Heap-backed scheduler for timers or eviction —
O(log n)add/cancel/pop; handle stale heap entries with lazy deletion viaid -> active. -
Two queues simulation for turnstiles/waitlists — process time monotonically, preserve FIFO within classes, encode priority rules explicitly.
-
Ordered waitlist selection often needs
TreeMap/balanced BST by party size or timestamp; tradeO(log n)updates for fast capacity queries. -
Capacity accounting invariants — maintain global bytes/items and per-owner totals; every mutation must update all counters atomically.
-
Marker/cursor sequence design — use linked list, gap buffer, rope, or indexed tree depending on whether moves, inserts, deletes, or random access dominate.
-
API-first reasoning — define operation semantics, return values, duplicate handling, cancellation behavior, and tie-breaking before choosing data structures.
Common pitfalls
Pitfall: Picking an array/list first, then discovering middle deletion or cursor movement makes operations
O(n)under heavy mutation.
Pitfall: Forgetting stale heap entries after cancellation or priority changes; always validate popped entries against current authoritative state.
Pitfall: Leaving tie-breakers implicit, especially same timestamp, same priority, empty-state behavior, or simultaneous capacity constraints.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
Behavioral & Leadership
What's being tested
Google behavioral interviews for Software Engineers test whether you can turn ambiguous, human-heavy engineering situations into clear action without losing technical rigor. The interviewer is probing for ownership, collaboration, conflict resolution, prioritization, and learning velocity in realistic software team scenarios: disagreeing on a design, handling a slipping project, responding to an incident, or influencing without authority. Google cares because most SWE impact comes through shared codebases, cross-team dependencies, design reviews, on-call rotations, and long-lived systems where communication failures become reliability, maintainability, and delivery failures. A strong answer shows not just “I was nice to teammates,” but how you used engineering judgment, evidence, and structured communication to move the team toward a better technical outcome.
Core knowledge
-
STAR is the default structure: Situation, Task, Action, Result. Keep Situation/Task under 30% of the answer; spend most time on specific actions you personally took and measurable outcomes such as reduced
p99latency, fewer rollbacks, faster reviews, or improved launch confidence. -
Ownership means acting beyond your assigned ticket while staying inside engineering responsibility. Good examples include identifying a hidden reliability risk, writing a migration plan, improving test coverage, escalating a dependency blocker, or creating a design doc—not taking over product strategy or making unilateral business decisions.
-
Collaboration is evaluated through mechanisms, not personality claims. Mention concrete behaviors: design review comments, RFCs, pairing sessions, incident channels, meeting notes, decision logs, API contracts, rollout checklists, and post-launch monitoring using dashboards in tools like
Prometheus,Grafana, or internal equivalents. -
Conflict resolution should be evidence-driven. A strong SWE disagreement answer compares tradeoffs like consistency versus availability, short-term delivery versus maintainability, synchronous RPC versus async queueing, or SQL simplicity versus denormalized read performance. The best answers show you changed your mind when data or constraints justified it.
-
Ambiguity handling starts by separating knowns, unknowns, and assumptions. In engineering, this often means clarifying scale, latency budget, failure modes, data ownership, API compatibility, migration constraints, and launch criteria before coding. Say what you validated through prototypes, logs, traces, or stakeholder review.
-
Prioritization should connect engineering work to risk and user impact. A practical model is: prioritize items with high blast radius, irreversible consequences, or dependency-unblocking value. For example, a flaky test blocking releases may outrank a nice refactor; an auth bug outranks a UI polish issue.
-
Leadership without authority means creating alignment through clarity. For new grads, this may be organizing debugging notes, proposing a small design, or unblocking a teammate. For senior/staff candidates, it includes cross-team technical direction, migration sequencing, risk registers, and aligning owners across services.
-
Incident response examples should show calm triage. A strong structure is: detect via alert or symptom, establish severity, mitigate first, communicate status, identify root cause, roll forward or roll back, then write a blameless postmortem with action items. Avoid spending the story only on hero debugging.
-
Blameless postmortems focus on system improvements rather than individual fault. Strong follow-ups include better alerts, circuit breakers, runbooks, integration tests, load tests, canary deploys, feature flags, or safer defaults. Tie the lesson to a future behavior change, not just “we communicated better.”
-
Measurable impact makes behavioral answers credible. Use numbers where honest: “cut build time from 18 minutes to 9,” “reduced duplicate support escalations by 40%,” “migrated 12 services,” “raised test coverage around the parser from 55% to 82%,” or “kept error budget burn below threshold.”
-
Googleyness is not performative niceness; it is intellectual humility plus high standards. Show that you listened, asked clarifying questions, gave credit, made disagreement safe, and still pushed for the technically correct outcome when reliability, security, or maintainability were at risk.
-
Level calibration matters. A new grad story can be scoped to a class project, internship, or small feature if it shows reflection and learning. A staff-level story should include ambiguous ownership, multiple teams, architectural tradeoffs, risk management, and durable process or technical improvements.
Worked example
For “Answer core teamwork and conflict stories,” a strong candidate would frame the first 30 seconds by saying: “I’ll use a story where my teammate and I disagreed on whether to ship a quick patch or do a safer refactor; the key issue was balancing release timing against maintainability and regression risk.” They would clarify the context: team size, system surface area, deadline, and what they personally owned. The answer skeleton should have four pillars: first, describe the technical disagreement; second, explain how they gathered evidence; third, show how they aligned the team; fourth, give the outcome and lesson.
For example, the candidate might say the team was building a backend API where one engineer wanted to add another conditional branch into a legacy handler, while the candidate believed the code path needed extraction and tests because the handler already had several production bugs. A strong action section would include reviewing recent incidents, measuring test gaps, sketching both options in a short design note, and proposing a compromise: ship a minimal safe fix behind a feature flag, then immediately follow with a targeted refactor. The explicit tradeoff is delivery speed versus long-term maintainability; the mature answer does not pretend one side was obviously wrong. The result should include concrete impact, such as no rollback during launch, added regression tests, and a clearer owner for the refactor. The close should be reflective: “If I had more time, I would have raised the maintainability risk earlier in planning rather than during implementation, because late conflict narrows the solution space.”
A second angle
For “Answer Staff-level leadership scenarios using STAR,” the same concept applies, but the scale and ambiguity are larger. Instead of resolving one teammate disagreement, the candidate may need to align three teams on a migration from a legacy service to a new API while preserving backward compatibility and availability. The framing should include cross-team ownership, technical risk, migration sequencing, and communication cadence. A staff-level answer should show influence mechanisms: design docs, review councils, rollout phases, SLO impact analysis, and explicit decision records. The interviewer will expect more than “I coordinated meetings”; they want evidence that your leadership reduced technical uncertainty and created a path others could execute.
Common pitfalls
Pitfall: Giving a generic teamwork story with no technical substance.
A weak answer says, “We had different opinions, so I listened and compromised.” That sounds pleasant but not diagnostic for a SWE role. A stronger answer names the actual engineering disagreement—API design, schema compatibility, testing strategy, rollout safety, ownership boundaries—and explains how evidence changed the decision.
Pitfall: Over-indexing on heroics.
“I stayed up all night and fixed production alone” can backfire if it implies poor collaboration or brittle process. Better is: “I mitigated the issue, kept the incident channel updated, asked another engineer to verify the rollback, and later added an alert and runbook so the team would not depend on one person.”
Pitfall: Hiding the lesson or making yourself flawless.
Behavioral interviews reward self-awareness. Avoid stories where every other person was wrong and you saved the day. A more credible close is: “I was right about the technical risk but wrong in how late I raised it, so now I surface design concerns earlier with a short RFC.”
Connections
Interviewers may pivot from behavioral leadership into system design tradeoffs, incident management, code review culture, or technical project execution. Be ready to connect your story to concrete SWE practices: design docs, testing strategy, staged rollouts, observability, and maintainable ownership boundaries.
Further reading
-
Google SRE Book: Postmortem Culture — useful model for blameless incident reflection and durable follow-up actions.
-
Google Engineering Practices Documentation — practical guidance on code reviews, readability, and collaborative engineering standards.
-
Crucial Conversations by Patterson, Grenny, McMillan, and Switzler — strong framework for handling high-stakes disagreement without becoming vague or adversarial.
Practice questions