Design a URL shortening service
Company: OpenAI
Role: Software Engineer
Category: System Design
Difficulty: hard
Interview Round: Onsite
Design a scalable backend service for shortening URLs (similar to popular URL shortener products).
Describe a system that supports at least the following core use cases:
- A client sends a long URL and gets back a short URL (e.g., `https://sho.rt/Ab3XyZ`).
- When a user visits the short URL, the service redirects them (HTTP 301/302) to the original long URL.
### Requirements
Clarify and then design for assumptions such as:
**Functional requirements**
- Create a short URL from a given long URL.
- Redirect a short URL to its original long URL.
- (Optional, if time) Support link expiration, custom aliases, and basic click analytics (e.g., click count).
**Non-functional requirements**
- Very low latency for redirection (ideally a few tens of milliseconds at the service layer).
- High availability (e.g., 99.99% for redirection, which is read-heavy).
- Horizontal scalability to handle large traffic, e.g.:
- 100 million new short URLs created per day.
- 1–10 billion redirects per day (reads greatly outnumber writes).
**Other considerations**
- Unique short codes with low collision probability.
- Ability to support billions of total URLs over time.
- Basic security and abuse handling (e.g., rate limiting, spam/malicious URLs).
### Tasks
Explain and justify:
1. **API design**: The main HTTP endpoints (for creating and resolving short URLs) and their request/response formats.
2. **Data model**: How you will store mappings between short codes and long URLs, and any metadata (timestamps, expiration, owner, click count, etc.).
3. **Short code generation**: How to generate compact, unique short codes (e.g., encoding numeric IDs, hashing, or other schemes), and how to deal with collisions.
4. **High-level architecture**: The major components (e.g., load balancers, application servers, databases, caches, etc.) and how requests flow through the system for both creation and redirection.
5. **Scalability**: How to scale storage and traffic:
- Partitioning/sharding strategy for the data store.
- Use of caching (e.g., Redis) to reduce database load for hot URLs.
- Replication and fault tolerance.
6. **Reliability and consistency**: How you ensure that once a short URL is returned to a user, it will reliably work (eventual vs strong consistency, handling partial failures, etc.).
7. **Additional features (if time)**: How you would extend the design to support metrics/analytics, rate limiting, spam detection, link expiration, and custom domains.
Make sure to call out your assumptions, trade-offs (e.g., SQL vs NoSQL, hash-based vs counter-based ID generation), and how the design could evolve as traffic grows.
Quick Answer: This question evaluates a candidate's ability to design scalable, highly available backend services, testing competencies in API design, data modeling, distributed systems, caching, partitioning, and operational trade-offs within the System Design domain of backend engineering.