Adobe Search Indexing And Autocomplete
Asked of: Software Engineer
Last updated
What's being tested
Candidates must show practical mastery of building low-latency autocomplete and search indexing components: choosing the right data structures, indexing strategy, update/refresh semantics, and tradeoffs between memory, CPU, and latency under realistic scale. Interviewers probe for correct algorithmic complexity, sharding/caching designs to meet p99 latency targets, and safe incremental-update patterns that avoid stale or inconsistent suggestions.
Core knowledge
-
Inverted index: primary structure for full-text search mapping token → posting list; good for scoring/ranking but not ideal for prefix-autocomplete unless you index edge n-grams or use specialized suggesters.
-
Trie / Prefix tree: O(L) lookup for a prefix of length L, supports lexicographic top-k if nodes keep top-k aggregates; memory grows with character branching—compress using compressed trie or FST.
-
FST (Finite State Transducer): compact, disk-friendly automaton used by
Lucenefor memory-efficient prefix/suffix lookup and weighted suggestions; lookup is O(L) and storage can be an order of magnitude smaller than naive tries. -
Edge n-gram vs tokenization: edge n-gram generates prefixes at index time (e.g., "adobe" → "a","ad","ado"...), increasing index size by ~Σ(length) but makes queries simple
termlookups; on-the-fly prefix scanning keeps index smaller but needs specialized structures. -
Completion index vs inverted index: maintain a separate completion index (trie/FST) optimized for latency and top-k, while using inverted index for full-text ranking and fallback results.
-
Real-time updates / refresh: choose between batched refresh (lower write amplification, stale suggestions) and near-real-time per-document update (higher CPU/memory); tune refresh interval (e.g., 1s–5s) and use background merges to amortize cost.
-
Sharding & routing: shard by hash (balanced load) or by prefix (localized hot-prefixes) — hashing simplifies writes; prefix routing can reduce query fanout but risks load skew on hot prefixes. Total QPS ~ shards ×
qps_per_shard; plan for hot-shard mitigation. -
Caching & TTLs: cache top suggestions per prefix in
Redis/in-process LRU for sub-ms hits; use short TTLs to avoid strong staleness; preheat caches for known peak prefixes. -
Fuzzy / typo-tolerant autocomplete: use edit-distance automata, BK-trees, or fuzzy FSTs; these increase CPU and memory complexity (exponential in edit radius) and usually limited to small radii (1–2).
-
Memory & compression: techniques like front-coding, delta-encoded posting lists, and packed ints reduce memory; an FST often reduces storage by sharing common suffixes/prefixes dramatically.
-
Latency budgets & SLOs: aim for strict
p99(e.g., <50ms) for autocomplete; design for tail latency: prioritize single-shard, in-memory lookups, avoid blocking I/O during queries. -
Consistency model: eventual consistency is acceptable for suggestions; for strong consistency (e.g., recent admin removals), add tombstones or immediate invalidation hooks for caches.
Worked example — "Design an autocomplete service that returns top-k suggestions with sub-50ms p99 latency and supports near-real-time updates"
First 30s: clarify QPS, dataset size (N), average suggestion length, update rate, and SLOs for p50/p99 and freshness. State assumptions: N≈50M phrases, QPS≈5k, updates ≈1000/day, and p99 target 50ms.
Skeleton answer pillars: (1) offline build + compact runtime index (FST) for memory-efficient prefix lookup; (2) write path with a short refresh window (1–2s) that applies new entries to an in-memory delta trie; (3) query path that checks the in-memory delta, then the persisted FST, merging top-k by weight; (4) sharding by hash to balance QPS with per-shard in-memory caches; (5) engineering for tail-latency (thread pools, local caches).
Design decision to flag: using an FST + in-memory delta trades slightly stale but cheap main index for immediate visibility via deltas — costs extra merge complexity and memory but keeps queries single-shard and in-memory. Also call out fallback: if top-k source exhausted, fallback to inverted-index search with scoring.
Close: say how to validate (load tests targeting p99, memprofs showing FST size), and add "if more time" items: implement typo-tolerance via edit-distance automata, and design automated index compaction/merge policies.
A second angle — "Scale autocomplete to support millions of entries and tight memory budget on each node"
Same core concept but different constraints: emphasize memory-first choices and approximate structures. Use a compact FST or a disk-backed compressed trie; store only top-k weights per node to prune size. Use prefix hashing + bloom filters to quickly reject misses (O(1) checks) and avoid expensive disk seeks. For heavy-write environments, buffer updates into a small in-memory delta and periodically merge offline to the compact store to avoid rebuilding large compressed structures frequently. Call out tradeoffs: aggressive compression reduces RAM but may increase p99 due to decompression/IO; explain fallback to a remote read for cold prefixes and prewarming for hot prefixes.
Common pitfalls
Pitfall: Treating the inverted index as a drop-in autocomplete solution — you may hit high-latency fanout and heavy aggregation costs; a specialized completion index or precomputed top-k per-prefix is often required.
Pitfall: Ignoring hot-prefix skew — shard-by-hash without hot-key mitigation leads to overloaded nodes; propose per-prefix throttling, replica placement, or hot-key caches.
Pitfall: Over-optimizing for average latency — interviewers expect tail-latency reasoning (thread pools, non-blocking IO, bounded queues) and instrumentation plans for
p99measurement, not just average-case numbers.
Connections
Interviewers will often pivot to adjacent areas like query ranking (how suggestions are scored and re-ranked server-side) and logging/telemetry (how to measure suggestion CTR, latency percentiles, and freshness). They may also ask about index rebuild/upgrade and safer deployment strategies (canary merges, versioned indices).
Further reading
-
Lucene FSTs and Completion Suggester — concrete implementation details and tradeoffs behind
Lucene's suggester. -
Practical Autocomplete (blog posts by engineering teams) — real-world accounts of hybrid
FST+ delta approaches and cache strategies.
Related concepts
- Adobe Asset Search Indexing And Autocomplete
- Adobe Document Cloud Search Indexing And Autocomplete
- Adobe Creative Cloud asset search, indexing, autocomplete, and sharding
- Search, Autocomplete And Restaurant DiscoverySystem Design
- Google-Scale Search Indexing and Ranking
- Adobe Multi-Tenant Sharding And Access Control