Design a scalable backend service for shortening URLs (similar to popular URL shortener products).
Describe a system that supports at least the following core use cases:
-
A client sends a long URL and gets back a short URL (e.g.,
https://sho.rt/Ab3XyZ
).
-
When a user visits the short URL, the service redirects them (HTTP 301/302) to the original long URL.
Requirements
Clarify and then design for assumptions such as:
Functional requirements
-
Create a short URL from a given long URL.
-
Redirect a short URL to its original long URL.
-
(Optional, if time) Support link expiration, custom aliases, and basic click analytics (e.g., click count).
Non-functional requirements
-
Very low latency for redirection (ideally a few tens of milliseconds at the service layer).
-
High availability (e.g., 99.99% for redirection, which is read-heavy).
-
Horizontal scalability to handle large traffic, e.g.:
-
100 million new short URLs created per day.
-
1–10 billion redirects per day (reads greatly outnumber writes).
Other considerations
-
Unique short codes with low collision probability.
-
Ability to support billions of total URLs over time.
-
Basic security and abuse handling (e.g., rate limiting, spam/malicious URLs).
Tasks
Explain and justify:
-
API design
: The main HTTP endpoints (for creating and resolving short URLs) and their request/response formats.
-
Data model
: How you will store mappings between short codes and long URLs, and any metadata (timestamps, expiration, owner, click count, etc.).
-
Short code generation
: How to generate compact, unique short codes (e.g., encoding numeric IDs, hashing, or other schemes), and how to deal with collisions.
-
High-level architecture
: The major components (e.g., load balancers, application servers, databases, caches, etc.) and how requests flow through the system for both creation and redirection.
-
Scalability
: How to scale storage and traffic:
-
Partitioning/sharding strategy for the data store.
-
Use of caching (e.g., Redis) to reduce database load for hot URLs.
-
Replication and fault tolerance.
-
Reliability and consistency
: How you ensure that once a short URL is returned to a user, it will reliably work (eventual vs strong consistency, handling partial failures, etc.).
-
Additional features (if time)
: How you would extend the design to support metrics/analytics, rate limiting, spam detection, link expiration, and custom domains.
Make sure to call out your assumptions, trade-offs (e.g., SQL vs NoSQL, hash-based vs counter-based ID generation), and how the design could evolve as traffic grows.