Design a scalable URL shortener
Company: Microsoft
Role: Data Engineer
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Technical Screen
##### Question
Design a **highly scalable URL shortening service** (like bit.ly / TinyURL) that converts long URLs into short links and supports redirection. Cover the architecture, data model, short-code generation, uniqueness, caching, scaling, and operational concerns below.
1. **Functional requirements**
- Provide an API to **create** a short URL for a given long URL.
- Provide an API to **redirect (resolve)** a short code back to the original long URL.
- The mapping must be **globally unique** (no unintended collisions) and **reversible** (given a short code, you can always retrieve the original URL via lookup).
2. **Non-functional requirements & assumptions**
- High availability and **high QPS**, with reads/redirects typically far outnumbering writes.
- Support **horizontal scaling** across multiple instances/regions.
- Short codes should be as short as possible (e.g., **6–10 characters**) and **URL-safe**.
- Low redirect latency (e.g., **p95 < 50 ms**); the service must tolerate **partial failures**.
- State any other assumptions you need.
3. **APIs**
- Define example endpoints for creating short links and performing redirects (request/response shapes, status codes).
4. **Data model & storage**
- What you store per short link (e.g., `short_code`, `long_url`, `created_at`, `expiry`, `owner_id`).
- Choice of datastore(s): KV store vs relational DB vs both, and why.
5. **Short-code generation strategy**
- **Option A:** hash the long URL.
- **Option B:** generate globally unique IDs (e.g., Snowflake / auto-increment) and **Base62**-encode them.
- How you ensure uniqueness and performance under concurrency.
6. **Guaranteeing global uniqueness** across multiple instances/regions.
7. **Caching & performance**
- Use of in-memory cache (e.g., Redis) and/or CDN for hot links.
- Cache key design, TTL, and cache-miss behavior.
8. **Scalability & reliability**
- Partitioning/sharding strategy.
- Replication, failover, and consistency trade-offs.
9. **Operational concerns**
- Abuse prevention (rate limits), observability/metrics, link expiration, and analytics logging (optional).
10. **Follow-up:** If you use a **hash** of the long URL to generate the short code, **how do you detect and handle hash collisions?** Provide at least two approaches and their trade-offs.
Quick Answer: Design a scalable URL shortener with create and redirect APIs, Base62 short-code generation, global uniqueness, KV storage, caching, CDN optimization, sharding, replication, abuse prevention, observability, expiration, analytics, and hash-collision handling.
Solution
A scalable URL shortener has a write path for link creation and a much hotter read path for redirects. The read path should be simple: short code to long URL lookup, then return a redirect. Anything non-critical, such as analytics aggregation, should be asynchronous.
Example APIs:
```http
POST /api/v1/shorten
Content-Type: application/json
{
"long_url": "https://example.com/some/long/path",
"expiry": "2027-01-01T00:00:00Z",
"custom_alias": "optional"
}
```
Response:
```json
{
"short_url": "https://sho.rt/Ab3x9K",
"short_code": "Ab3x9K",
"long_url": "https://example.com/some/long/path"
}
```
Redirect:
```http
GET /Ab3x9K
302 Found
Location: https://example.com/some/long/path
```
Use `302` if the service wants every redirect to hit the platform for analytics, expiration, or policy checks. Use `301` only when permanent browser and CDN caching is acceptable.
The core data model is:
```text
ShortLink(
short_code primary key,
long_url,
owner_id,
created_at,
expires_at,
status,
created_by_ip
)
```
The access pattern is a point lookup by `short_code`, so a KV store or wide-column store such as DynamoDB or Cassandra is a good source of truth at large scale. A relational database can handle accounts, billing, and owner-facing list views. Shard the mapping table by a hash of `short_code` so traffic spreads evenly. Replicate across availability zones; reads can use replicas because mappings are mostly immutable after creation.
For code generation, I would prefer globally unique ID plus Base62 encoding. Generate a 64-bit ID using a Snowflake-style layout with timestamp, region or machine ID, and sequence number, then Base62 encode it. This is collision-free by construction if machine IDs or ID ranges are unique. A 7-character Base62 code has about `62^7`, or roughly 3.5 trillion, possibilities. The database should still enforce uniqueness with a conditional insert or unique constraint as the final guard.
Hashing the long URL is another option. It is deterministic and can dedupe identical URLs, but truncating the hash introduces collisions. If using hashes, handle collisions with conditional insert. If the generated code already exists for the same long URL, return the existing link. If it exists for a different URL, retry with a salt or nonce, or take more hash characters. The tradeoff is extra reads/writes and potentially longer codes. At high scale, unique ID plus Base62 is usually cleaner.
The redirect path should be optimized with cache-aside:
1. Read `short_code` from Redis or another fast cache.
2. On hit, redirect immediately.
3. On miss, read from the mapping store, backfill cache, and redirect.
4. Cache negative lookups briefly to reduce scanning abuse.
TTL should not exceed link expiry. For very hot links, use CDN or edge caching when analytics requirements allow it. To avoid cache stampedes, use request coalescing or single-flight on hot misses.
For multi-region active-active writes, partition ID space by region or include a region ID in the Snowflake machine component. Avoid globally synchronized counters on the hot path. Replicate mappings asynchronously, accepting that a newly created link may take a short time to become readable everywhere unless the product requires stronger consistency.
Operational concerns:
- Rate limit creation by user, IP, and account.
- Check links for malware, phishing, and policy violations.
- Expire links through TTL or background sweepers.
- Log redirect events asynchronously through a queue or stream.
- Track QPS, p95/p99 redirect latency, cache hit rate, error rate, DB latency, link creation failures, abuse blocks, and redirect success.
- If analytics is down, redirects should still work.
- If cache is down, fall back to the database with rate protection.
The strong design summary is: use globally unique IDs encoded as Base62, store `short_code -> long_url` in a sharded replicated mapping store, make redirects cache-heavy and low latency, keep analytics asynchronous, and use conditional writes, rate limits, observability, and abuse checks to keep the system reliable.
Explanation
Rubric: a strong answer nails the read-heavy nature of the workload (cache/CDN the redirect path), picks an appropriate datastore for point lookups, and — critically — chooses unique-ID + Base62 over URL-hashing to make codes collision-free by construction, while still being able to articulate hash-collision handling for the follow-up. Senior signals: disjoint ID ranges / Snowflake for cross-instance uniqueness without hot-path coordination, conditional inserts as the datastore-level uniqueness guard, 301-vs-302 trade-off, eventual-consistency tolerance on reads, and graceful degradation under partial failure.