Adobe Asset Search Indexing And Autocomplete

What's being tested

Interviewers probe your ability to design a low-latency, scalable autocomplete and search indexing subsystem for asset metadata: data structures, update/refresh semantics, consistency vs latency tradeoffs, and operational costs. They want clear choices between approaches (tries/FSTs vs n-grams vs inverted indexes), how to shard and serve suggestions under p95/p99 latency budgets, and how to handle real-time updates and memory constraints. You should expose tradeoffs, failure modes, and an engineering plan (API, indexing cadence, observability) rather than a polished ML ranking model.

Core knowledge

Inverted index: Primary structure for full-text search mapping tokens → posting lists; great for arbitrary text queries, scales to tens/hundreds of millions of docs when sharded and compressed (delta + varint).
Finite State Transducer (FST) / Trie: Efficient prefix-completion structure mapping prefixes → completions with compact memory; Lucene uses FSTs for fast suggestion lookup and low heap overhead.
Edge n-grams vs token prefixing: Edge n-grams ("asse"→"asset") allow substringless prefix completion with simpler indexing but larger index size; time O(k) per lookup (k = prefix length) versus FST O(k) with better memory.
Autocomplete vs Suggest/Did-You-Mean: Autocomplete expects low-latency prefix completions from indexed tokens; syntactic suggestions (fuzzy/typo) use Levenshtein or phonetic algorithms and cost more CPU and index structures.
Real-time vs batch indexing: Near-real-time (NRT) uses frequent refresh/commit to surface new docs; NRT gives visible results within seconds but increases commit cost and resource churn—use small refresh intervals or incremental writers.
Sharding and routing: Shard by asset-id or hash to balance index size; autocomplete often needs cross-shard fanout for prefix queries—reduce fanout by routing suggestions to a global lightweight completion index or replicated hot shards.
Memory and compression: Use FST or front-coding + delta-encoding for postings; estimate memory: FSTs handle millions of terms within tens of GB for realistic vocabularies; beyond that use distributed shards or disk-backed structures.
Scoring and deduplication: Attach lightweight metadata (popularity, recency, category) in suggestion payload; score at query time with pre-computed boosts to keep latency low and avoid expensive joins.
Latency targets and resource sizing: Aim for interactive budgets (p50 < 10ms, p95 < 50ms, p99 < 100–200ms) for autosuggest; measure tail latency, tokenization cost, and network hop counts when sizing.
Consistency model: Prefer eventual consistency for search freshness; document SLA for update-to-visible window (e.g., 1–5s NRT or 1–15m batch) and expose it in API SLAs.
Failure modes & backpressure: Protect with circuit-breakers, degrade to static popular completions when indexers lag; avoid serving stale or incomplete suggestions during rolling upgrades.
Operational visibility: Instrument qps, latency (p50/p95/p99), index-lag, memory pressure, garbage-collection, and suggestion coverage; run periodic QA sampling to detect regressions.

Worked example — Design an autocomplete system for Adobe Asset Search with real-time updates and `p99` < 100ms

Frame quickly: clarify scale (assets, distinct tokens, QPS), freshness SLA (seconds vs minutes), and whether suggestions must be personalized. Pillars: (1) a compact FST-based global completion index for low-latency prefix lookup, (2) an append-only ingestion stream that updates a write-ahead Kafka topic and incremental indexer, (3) per-shard in-memory caches for hot prefixes plus fallback ranked results from the inverted index. Key design decisions: choose FST for memory and speed but accept heavier rebuilds for bulk updates; handle single-asset updates by applying them to a small delta index merged periodically to the main FST to avoid full rebuilds. Tradeoff to call out: frequency of merging delta-indexes — more frequent merges improve freshness but increase CPU and transient GC. Close with incremental plan: implement core API + metrics, deploy with canary and traffic split; if more time, add adaptive caching and user-personalization layer.

A second angle — Indexing pipeline for large-scale asset metadata to support full-text search and faceted autocomplete

This variation emphasizes the ingestion pipeline and faceting. Start by defining the metadata schema (title, tags, contributor, MIME, usage counts) and normalize tokenization rules (language analyzers, stopwords, stemming). Use a hybrid approach: use an inverted index with per-field analyzers for full-text and a parallel completion index (FST or edge n-gram) for prefix suggestions; keep facet counts in a separate denormalized store for fast aggregation. For updates, stream events to a small write buffer producing delta segments merged into the main index on a schedule; keep a fast KV layer (e.g., Redis) for hot facet counts. This framing highlights how index design must satisfy both free-text relevance and deterministic facet/autocomplete constraints.

Common pitfalls

Pitfall: Picking edge n-grams for everything because they're easy to implement.
Edge n-grams explode index size and increase CPU for scoring; prefer FSTs or delta-index strategies when memory and tail latency matter.

Pitfall: Designing for average latency only.
Showing p50 = 10ms while p99 = 800ms will fail production; always optimize and measure tail latency, network hops, and GC-induced pauses.

Pitfall: Ignoring update cost and rebuild cadence.
Assuming instant visibility for all updates without a delta-index/merge plan leads to huge CPU spikes or stale results; state your refresh/merge strategy and its cost.

Connections

Autocomplete and indexing naturally pivot to adjacent topics: distributed query routing and shard selection, ranking signal integration (personalization or ML-based re-ranking), and observability for production search (SLOs, synthetic queries, alerting on index-lag).

What's being tested

Core knowledge

Worked example — Design an autocomplete system for Adobe Asset Search with real-time updates and `p99` < 100ms

A second angle — Indexing pipeline for large-scale asset metadata to support full-text search and faceted autocomplete

Common pitfalls

Connections

Further reading

Related concepts

What's being tested

Core knowledge

Worked example — Design an autocomplete system for Adobe Asset Search with real-time updates and p99 < 100ms

A second angle — Indexing pipeline for large-scale asset metadata to support full-text search and faceted autocomplete

Common pitfalls

Connections

Further reading

Related concepts

Worked example — Design an autocomplete system for Adobe Asset Search with real-time updates and `p99` < 100ms