URL Shortener System Design
Asked of: Software Engineer
Last updated

What's being tested
This tests whether you can design a read-heavy distributed service with low latency, high availability, and clean failure handling. The interviewer is probing for practical decisions around API design, key generation, collision avoidance, storage modeling, caching, TTL semantics, and how the system behaves under traffic spikes or partial failures. ByteDance cares because link resolution resembles many high-QPS backend paths: a tiny request, a strict `p99` latency target, global traffic, abuse risk, and correctness requirements that are simple to state but subtle at scale. A strong SWE answer should move from requirements to capacity estimates to architecture, then defend tradeoffs rather than listing components.
Core knowledge
-
Functional requirements should be explicit: create short URL, redirect short URL, optionally support custom aliases, expiration, deletion, and analytics counters. The core path is
`POST /urls`for creation and`GET /{code}`returning`301`or`302`redirect. -
Non-functional requirements usually dominate: reads may exceed writes by or more, redirects need low
`p99`latency, created links must be durable, and duplicate short codes must not map to different long URLs. Availability is often prioritized over rich analytics on the redirect path. -
Capacity estimation guides design choices. For 100M new URLs/day over 5 years, total records are roughly . With Base62 encoding, code space is ; 7 chars gives about combinations, while 8 chars gives about .
-
Key generation is the central design decision. Common options are auto-increment IDs encoded as Base62, Snowflake-style IDs, random codes with collision retries, or pre-generated key pools. Auto-increment is simple but centralizes allocation; random is easy to distribute but needs collision handling.
-
Base62 uses
`0-9`,`a-z`,`A-Z`to create compact URL-safe strings. If using numeric IDs, encode the integer into Base62. If using random codes, compute collision probability with the birthday effect; as utilization rises, retry rate increases, so keep code space much larger than expected records. -
Collision handling must be deterministic and safe. For random code generation, enforce a unique index on
`short_code`in`MySQL`,`Postgres`,`DynamoDB`, or`Cassandra`-like storage, then retry on conflict. Do not rely only on an in-memory “already used” check across distributed workers. -
Storage schema can start with
`short_code`,`long_url`,`created_at`,`expires_at`,`owner_id`, and`status`. The read path is a point lookup by`short_code`, so the primary key should be`short_code`. Secondary access by user or creation time belongs off the redirect hot path. -
Caching is critical because redirects are read-heavy. Use
`Redis`,`Memcached`, or CDN/edge caching for`short_code -> long_url`. Cache positive results with TTL and consider negative caching for missing or expired codes to protect the database from repeated invalid requests. -
Redirect semantics matter.
`301`is permanent and may be cached aggressively by browsers and intermediaries;`302`or`307`is safer if links can expire, be edited, disabled, or measured. For most interview designs, choose`302`unless immutability is a stated requirement. -
Expiration and deletion require clear semantics. If
`expires_at < now`, return`404`or`410 Gone`; keep the record for audit/abuse handling or asynchronously purge it. If caching is used, cache TTL must not exceed remaining link lifetime:`cache_ttl = min(default_ttl, expires_at - now)`. -
High availability is achieved by stateless API servers behind a load balancer, replicated storage, and multi-AZ deployment. On read, if cache misses and the primary datastore is unavailable, decide whether to fail closed with
`503`or use a stale cache value depending on correctness expectations. -
Hot keys and abuse are realistic edge cases. A celebrity or campaign link can create a hot key; cache replication, CDN edge caching, and request coalescing help. Also validate long URLs, apply rate limits, block malware/phishing domains, and avoid open redirect surprises in internal services.
Worked example
For Design a TinyURL-like short link service, start by clarifying: expected read/write QPS, whether custom aliases are required, whether links expire, whether URLs are mutable, and whether global availability is expected. Then declare assumptions such as “reads are 100x writes, generated links are immutable, expiration is optional, and redirect `p99` should be under 50 ms excluding external network time.” Organize the answer around four pillars: API contract, key generation, storage/cache design, and reliability/edge cases. For APIs, propose `POST /v1/shorten` with `long_url`, optional `expires_at`, and optional `custom_alias`, and `GET /{code}` for redirect. For ID generation, choose either Snowflake-style numeric IDs encoded with Base62 or random 8-character Base62 codes with a unique constraint and retry loop; explain why the chosen approach fits the scale. For storage, use a durable key-value or wide-column store keyed by `short_code`, with `Redis` in front for the hot redirect path. For TTL, check `expires_at` on both cache population and database reads, and cap cache TTL to the remaining lifetime. One tradeoff to flag explicitly is sequential Base62 IDs versus random codes: sequential IDs are collision-free but predictable, while random codes are less guessable but require collision retries. Close by saying that with more time you would cover multi-region routing, abuse detection, rate limiting, observability, and operational playbooks for cache or datastore outages.
A second angle
If the interviewer changes the framing to emphasize custom aliases, the design shifts from pure generation to conflict management and authorization. A custom alias like `/tiktok-sale` must be checked with an atomic conditional write, not with a separate read-then-write race. If the framing emphasizes expiration, then TTL correctness becomes central: cache entries must expire no later than the link, and expired links should consistently return `404` or `410`. If the framing emphasizes global scale, the hardest part becomes ID allocation and replication: Snowflake-style IDs avoid coordination, while a single auto-increment database becomes a bottleneck and a single-region availability risk. The same core design applies, but the “best” choice depends on whether the constraint is uniqueness, latency, customizability, or multi-region resilience.
Common pitfalls
Pitfall: Designing only the happy path.
A common weak answer is “store long URL in database, generate hash, redirect,” with no discussion of collisions, expiration, cache invalidation, or datastore failure. A stronger answer names the exact invariant: one `short_code` must never resolve to two different destinations, and every write path must preserve that invariant under concurrency.
Pitfall: Using a hash of the long URL without explaining consequences.
Hashing the long URL can make duplicate long URLs map to the same code, but collisions still exist and identical URLs from different users may need different expiration, ownership, or analytics. If you propose hashing, add a collision suffix or unique constraint strategy, and clarify whether deduplication is desired.
Pitfall: Over-indexing on exotic infrastructure before requirements.
Jumping immediately to `Kafka`, global consensus, or complex sharding can sound unfocused for a simple redirect service. Start with the read/write ratio, data volume, and latency target; then introduce cache, replication, sharding, or multi-region only when the numbers or availability goal justify them.
Connections
Interviewers may pivot from this design to distributed ID generation, cache consistency, rate limiting, or database sharding. They may also ask about adjacent read-heavy systems such as pastebins, object metadata stores, feature flag services, or CDN-backed redirect services. Be ready to explain the same tradeoffs in terms of correctness invariants, latency, and operational failure modes.
Further reading
-
Designing Data-Intensive Applications — excellent grounding for replication, partitioning, consistency, and storage tradeoffs.
-
Dynamo: Amazon’s Highly Available Key-value Store — useful for understanding highly available key-value storage under failure.
-
Twitter Snowflake — classic reference for distributed, time-sortable unique ID generation.
Featured in interview prep guides
Practice questions
Related concepts
- RESTful API And HTTP Service DesignSoftware Engineering Fundamentals
- Web Crawlers, URL Normalization, And PolitenessSystem Design
- Storage, Indexing, APIs, And Secure ExecutionSystem Design
- Resilient API Aggregation And Operational DebuggingSoftware Engineering Fundamentals
- Production System Design TradeoffsSystem Design
- Scalable Service And Distributed System DesignSystem Design