Design a scalable URL shortener
Company: Microsoft
Role: Data Engineer
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Technical Screen
## System Design: Scalable URL Shortener
Design a **highly scalable URL shortener** that converts long URLs into short links and supports redirection.
### Core requirements
- **Create** a short URL for a given long URL.
- **Redirect**: when a user visits the short URL, redirect to the original long URL.
- Short codes must be:
- **Globally unique**.
- **Reversible** in the sense that the system can always retrieve the original URL from the short code (via lookup).
- The system should be **high availability** and handle **high QPS** (reads/redirects typically much higher than writes).
### Design topics to cover
1. **APIs**
- Example endpoints for creating short links and performing redirects.
2. **Data model & storage**
- What data you store for each short link (e.g., `short_code`, `long_url`, `created_at`, `expiry`, `owner_id`, etc.).
- Choice of datastore(s): KV store / relational DB / both, and why.
3. **Short code generation strategy**
- Hash-based vs ID-based (e.g., Snowflake/auto-increment) and **Base62** encoding.
- How you ensure **uniqueness** and performance under concurrency.
4. **Caching & performance**
- Use of in-memory cache (e.g., Redis) and/or CDN for hot links.
- Cache key design, TTL, and cache-miss behavior.
5. **Scalability & reliability**
- Partitioning/sharding strategy.
- Handling replication, failover, and consistency trade-offs.
6. **Operational concerns**
- Abuse prevention (rate limits), observability/metrics, link expiration, and analytics logging (optional).
### Follow-up
- If you use a **hash** of the long URL to generate the short code, **how do you handle hash collisions**?
- Provide at least two approaches and their trade-offs.
Quick Answer: This question evaluates skills in designing scalable backend services, including distributed system architecture, data modeling, API design, short-code generation strategies, caching, and operational trade-offs like availability and consistency.