API Design, Data Modeling, and Indexing

What's being tested

These prompts test whether you can turn ambiguous product requirements into API contracts, data models, and indexes that support real access patterns at scale. Google interviewers are probing for more than “pick a database”: they want to see whether you can define entities, relationships, consistency boundaries, query shapes, write paths, and failure modes before drawing boxes. Strong answers show how data layout changes when the dominant operation is a lexicographic range scan, a scheduled global content update, deterministic experiment assignment, or dependency-aware task execution. For a Software Engineer, the core signal is whether your design makes the important operations correct, efficient, evolvable, and debuggable.

Core knowledge

APIs should be shaped around use cases, not tables. Start with operations such as `CreateExperiment`, `AssignVariant`, `UpdateMenuItem`, `GetWordsInRange`, or `SubmitTaskDAG`. Specify request fields, idempotency behavior, pagination, authorization boundaries, and error semantics before choosing storage.
Data modeling begins with access patterns. A task scheduler needs lookups by `task_id`, runnable tasks by dependency state, and attempts by worker. A menu system needs reads by `restaurant_id`, locale, effective time, and version. A word store needs sorted traversal, prefix/range lookup, and maybe point updates.
Primary keys optimize identity; secondary indexes optimize questions. In `Spanner`, `Bigtable`, `DynamoDB`, or `Postgres`, the primary key is not just uniqueness; it determines locality, ordering, and hot-spot risk. A bad key like monotonically increasing `created_at` can create write hotspots under high QPS.
Composite indexes must match filter and sort order. For a query like WHERE restaurant_id = ? AND locale = ? AND effective_at <= now ORDER BY effective_at DESC LIMIT 1, an index on (restaurant_id, locale, effective_at DESC) is materially better than separate indexes on each column. Left-prefix rules matter in B-trees.
B-trees are the default for ordered storage and range queries. They support point lookup and range scans in $O(\log N + K)$ , where $K$ is the number of returned records. For lexicographic strings, a B-tree, LSM-tree with sorted-string tables, trie, or finite-state transducer can all work depending on update rate and compression needs.
Tries are strong for prefix queries but not always for arbitrary lexicographic ranges. A trie can enumerate terms between `start` and `end`, but traversal and memory overhead can be high for large vocabularies. Compressed tries, radix trees, or immutable FSTs reduce memory for read-heavy word dictionaries.
Versioning is a first-class data modeling tool. Global menu updates and experiment configs should avoid in-place mutation of live state. Store immutable versions like `menu_version`, `experiment_revision`, or `config_etag`, then atomically publish pointers. This enables rollback, audit, cache invalidation, and reproducible reads.
Effective-time modeling separates write time from visibility time. A scheduled update needs fields like `created_at`, `effective_at`, `expires_at`, and `published_at`. The read path should answer “what is active at time T?” without scanning all historical records, usually via composite indexes or precomputed active snapshots.
Idempotency protects write APIs from retries. For operations like `CreateTask`, `PublishMenuVersion`, or `LogExposure`, accept an idempotency key and store request fingerprints. A retried request should return the original result, not create duplicate tasks, versions, or telemetry events.
Consistency choices should be explicit per operation. Experiment assignment usually needs deterministic local computation from `hash(user_id, experiment_id, salt)`, while config publication may need strongly consistent reads. Menu display can often tolerate bounded staleness; elevator control cannot tolerate stale local state for safety-critical commands.
Pagination must be stable under mutation. Offset pagination becomes expensive and inconsistent for large changing datasets. Prefer cursor pagination using the last seen sort key, e.g. (word, doc_id) or (effective_at, version_id), and define whether results are snapshot-consistent or best-effort.
Hot keys and skew are common interviewer pivots. A celebrity restaurant, huge experiment, or popular prefix like "a" can overload a partition. Mitigations include sharded keys, read-through caches, materialized views, prefix bucketing, adaptive splitting, and moving heavy fanout work off the request path.

Worked example

For Design a global restaurant menu update system, a strong first 30 seconds would clarify scale, freshness, and correctness: “How many restaurants and menu items? Are updates scheduled in advance? Do clients need offline sync? Are prices legally required to be correct at display time? Do restaurants have regional overrides and localization?” Then declare assumptions: millions of restaurants globally, high read-to-write ratio, menus versioned, updates may be scheduled, and clients fetch deltas.

The answer can be organized around four pillars. First, define APIs: `CreateMenuDraft`, `UpdateMenuItem`, `PublishMenuVersion`, `GetActiveMenu`, and `SyncMenuChanges(since_token)`. Second, model data around immutable versions: `Restaurant`, `MenuVersion`, `MenuItem`, `LocalizedText`, `Price`, `AvailabilityWindow`, and `Override`, with `effective_at` and `expires_at` fields. Third, design indexes for the main reads: (restaurant_id, locale, effective_at DESC), (restaurant_id, version_id), and change-log ordering by (restaurant_id, sequence_number). Fourth, discuss propagation: a write path that validates and publishes a new version, plus caches/CDN or regional replicas for low-latency reads.

One tradeoff to flag explicitly is active snapshot versus compute-on-read. Computing “current menu” from base items, overrides, localization, and time windows gives flexibility but can make every read expensive and error-prone; materializing an active menu per restaurant/locale/version improves latency and offline sync at the cost of more write-time work and storage. A good close would say: “If I had more time, I’d cover conflict resolution for concurrent edits, schema evolution for new item attributes, and observability around stale-menu rates and publish latency.”

A second angle

For Design lexicographic range query word store, the same API/data/indexing discipline applies, but the dominant operation is an ordered scan instead of versioned entity retrieval. The API might expose `PutWord`, `DeleteWord`, and `RangeQuery(start, end, limit, cursor)`, with precise inclusivity rules for `start` and `end`. The data model may be as simple as `word` -> `metadata`, but the storage engine matters: a B-tree or LSM-backed sorted table supports $O(\log N + K)$ range reads, while a trie may be better for prefix-heavy workloads. The key design decision is whether the system is mutable and write-heavy, which favors `RocksDB`/LSM-style storage, or mostly static and memory-sensitive, which favors compressed tries or FSTs.

Common pitfalls

Pitfall: Starting with infrastructure boxes before defining query patterns.

A tempting but weak answer is “use `Postgres`, add indexes, put `Redis` in front.” That skips the interviewer’s real test: which keys, which indexes, which consistency guarantees, and which APIs make the system correct. Land better by listing the top reads/writes first, then deriving storage and cache choices from them.

Pitfall: Treating indexes as magic instead of physical design.

Candidates often say “index every field we filter on,” which increases write amplification and may not help composite queries. Show understanding of ordered composite indexes, left-prefix behavior, cardinality, selectivity, and covering indexes. For example, (restaurant_id, locale, effective_at DESC) answers the active-menu lookup; independent indexes on `restaurant_id` and `effective_at` may still require large scans.

Pitfall: Ignoring evolution and operational correctness.

A design that only handles the happy path often misses retries, duplicate writes, rollback, schema changes, and pagination stability. For systems like experiment platforms, menu publishing, and task schedulers, immutable versions, idempotency keys, audit logs, and explicit state transitions are not extras; they are what keep production behavior explainable.

Connections

Interviewers often pivot from this area into distributed consistency, caching and invalidation, state-machine design, or workflow orchestration. For example, an experiment platform can lead to deterministic hashing and telemetry ingestion, while a task scheduler can lead to DAG algorithms, leases, retries, and worker heartbeats. Be ready to connect your data model to latency, correctness, and failure recovery.

What's being tested

Core knowledge

Worked example

A second angle

Common pitfalls

Connections

Further reading

Practice questions

Related concepts