Design a disposable email service that issues auto-expiring addresses (e.g., 10-minute inboxes) and receives messages. Specify: requirements and APIs, address generation and TTL semantics, SMTP ingress (MX records, inbound MTA), spam/abuse controls, storage schema with per-message TTL and indexing, retrieval via Web UI and REST, rate limiting and quota, privacy/security (isolation, link-safety, attachment handling), compliance and deletion, background cleanup, observability, and capacity planning. Provide a high-level architecture (MTA → queue/stream → message processor → durable store → cache), data model, consistency/availability trade-offs, scaling to millions of inboxes/day, and cost controls.
Quick Answer: This question evaluates a candidate's ability to design scalable, highly available, and privacy-focused email ingestion systems, probing competencies in distributed systems, networking (SMTP/MX), storage lifecycle management, rate limiting, security/privacy, and observability.
Design a Disposable Email Service with Auto-Expiring Addresses
Design a receive-only disposable email service (e.g., 10‑minute inboxes) that issues auto-expiring addresses and lets users read incoming messages through a web UI and a REST API.
A user requests a throwaway address, hands it out to receive one or more messages (such as a sign-up confirmation or one-time code), reads what arrives, and the address — along with its messages — expires shortly after. The service receives mail only; it does not send.
Provide a clear, high-level design. State your assumptions, then work through the requirements, APIs, architecture, data model, and the key trade-offs.
Constraints & Assumptions
Anchor the design to these targets (call out any you'd renegotiate with the interviewer):
Scale: millions of inboxes created per day; millions of inbound messages per day (assume on the order of 10M messages/day for capacity math).
Default inbox TTL: ~10 minutes (600 s), with a configurable cap (e.g., 1 hour).
Latency SLA: p95 < 2–5 s from SMTP accept to message visible in the UI/API.
Availability: SMTP ingress and reads are high-availability; eventual consistency is acceptable for message indexing.
Message size: assume ~100 KB on average with a hard per-message cap (e.g., 10 MB), and a small attachment count.
Out of scope (state this explicitly): sending mail / outbound MTA, long-term archival, cross-inbox search/threading.
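As a rough illustration of the capacity math these targets imply, the sketch below turns the stated figures (10M messages/day, ~100 KB average, 600 s TTL) into ingress rate, daily ingest, and live working set. The peak factor of 5 is an assumption for illustration and does not come from the prompt; the live-set figure assumes every message survives a full TTL, so it is an upper bound under the default TTL.

```python
# Back-of-envelope capacity math from the stated targets.
SECONDS_PER_DAY = 86_400

messages_per_day = 10_000_000   # "on the order of 10M messages/day"
avg_message_bytes = 100_000     # ~100 KB average (decimal KB)
inbox_ttl_seconds = 600         # default 10-minute TTL
peak_factor = 5                 # ASSUMPTION: peak-to-average ratio, not from the prompt

avg_msgs_per_sec = messages_per_day / SECONDS_PER_DAY
peak_msgs_per_sec = avg_msgs_per_sec * peak_factor
daily_ingest_bytes = messages_per_day * avg_message_bytes

# Little's law (L = rate * time): messages alive at any instant,
# assuming each one lives for a full inbox TTL (upper bound).
live_messages = avg_msgs_per_sec * inbox_ttl_seconds
live_bytes = live_messages * avg_message_bytes

print(f"average ingress : {avg_msgs_per_sec:,.0f} msg/s")
print(f"assumed peak    : {peak_msgs_per_sec:,.0f} msg/s")
print(f"daily ingest    : {daily_ingest_bytes / 1e12:.1f} TB")
print(f"live messages   : {live_messages:,.0f}")
print(f"live working set: {live_bytes / 1e9:.1f} GB")
```

Under these assumptions the average rate is roughly 116 msg/s and daily ingest is about 1 TB, while the live working set is only around 7 GB. That gap suggests the design is dominated by write and delete throughput rather than by stored volume.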
Core Requirements
Functional
Create ephemeral email addresses that auto-expire (e.g., after 10 minutes).
Receive inbound email over SMTP and surface new messages in near real time.
List, read, and delete messages via the Web UI and REST.
Enforce per-inbox and per-user/IP quotas and rate limits.
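One common way to enforce the per-inbox and per-user/IP limits above is a token bucket per key. The sketch below is a minimal in-memory version; the class names, the key format, and the limits (burst of 3, one request per 10 s) are all hypothetical, and a real deployment would likely keep bucket state in a shared store rather than a process-local dict.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float        # maximum burst size
    refill_per_sec: float  # sustained rate
    tokens: float          # current balance
    updated_at: float      # time of the last refill, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class RateLimiter:
    """One bucket per key, e.g. 'ip:<addr>' or 'inbox:<id>' (hypothetical key scheme)."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._buckets: dict[str, TokenBucket] = {}

    def allow(self, key: str, now: float) -> bool:
        bucket = self._buckets.get(key)
        if bucket is None:
            # New keys start with a full bucket.
            bucket = TokenBucket(self.capacity, self.refill_per_sec, self.capacity, now)
            self._buckets[key] = bucket
        return bucket.allow(now)


limiter = RateLimiter(capacity=3, refill_per_sec=0.1)
burst = [limiter.allow("ip:203.0.113.7", now=0.0) for _ in range(4)]
print(burst)                                      # fourth request in the burst is denied
print(limiter.allow("ip:203.0.113.7", now=10.0))  # one token has refilled after 10 s
```

The same limiter can guard inbox creation (keyed by IP) and inbound delivery (keyed by inbox), with different capacities for each.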
Non-Functional
High availability for SMTP ingress and reads; eventual consistency is acceptable for message indexing.
Low-latency display of new messages: p95 < 2–5 seconds from SMTP accept to message visible.
Scale to millions of inboxes per day and millions of messages per day.
Strict privacy/security and compliant deletion when an address's TTL expires.
Cost efficiency via storage lifecycle management.
Clarifying Questions to Ask
Before designing, scope the problem with the interviewer:
Is the service truly receive-only, or do we ever need to send (replies, bounces, outbound MTA)?
What's the read:write ratio, and how many times is a typical inbox polled before it expires? (Drives caching and the real-time push design.)
How strict is the deletion/privacy guarantee: does "expired" mean invisible within seconds, physically purged within seconds, or eventually purged? (These have very different costs.)
Do we need attachments at all, and if so, what handling is required (AV scanning, type allow-lists, previews)?
Are there compliance regimes (GDPR right-to-erasure, data residency) that bound retention and where data may live?
What's the expected abuse profile: bots mass-creating inboxes and scraping, or mostly legitimate one-off OTP/sign-up traffic?
What to Cover
Walk through each of the following:
Requirements and public API: endpoints to create an inbox, list messages, get a message, and delete.
Address generation and TTL semantics: how addresses are minted and how expiry is determined and enforced.
SMTP ingress: MX records, inbound MTA behavior, and the acceptance policy (when to accept vs. reject mail).
Spam and abuse controls.
Storage schema: per-message TTL and the indexing strategy.
Retrieval (Web UI and REST): including the auth model (e.g., capability tokens).
Rate limiting and quotas.
Privacy and security: inbox isolation, link safety, and attachment handling.
Compliance and deletion guarantees.
Background cleanup processes.
Observability: metrics, logs, tracing, and alerts.
Capacity planning and cost controls.
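To make the address-generation, capability-token, and acceptance-policy items concrete, here is one possible sketch. It assumes a random local part, a bearer token returned once at creation (only its hash is stored), expiry checked at read and RCPT time, and rejection of mail for unknown or expired inboxes. The domain `mail.example.test`, the function names, and the in-memory dict are hypothetical; the SMTP reply strings use standard enhanced status codes.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass

DOMAIN = "mail.example.test"  # hypothetical receiving domain
DEFAULT_TTL_S = 600           # default 10-minute inbox
MAX_TTL_S = 3600              # configurable cap (1 hour)


@dataclass
class Inbox:
    address: str
    token_sha256: str  # only a hash of the capability token is kept
    created_at: float
    expires_at: float


def create_inbox(now: float, ttl_s: int = DEFAULT_TTL_S) -> tuple[Inbox, str]:
    """Mint an unguessable address and a capability token for reading it."""
    ttl_s = max(1, min(ttl_s, MAX_TTL_S))  # clamp to the cap
    local_part = secrets.token_hex(8)      # 64 random bits as 16 hex chars
    token = secrets.token_urlsafe(32)      # shown to the client once
    inbox = Inbox(
        address=f"{local_part}@{DOMAIN}",
        token_sha256=hashlib.sha256(token.encode()).hexdigest(),
        created_at=now,
        expires_at=now + ttl_s,
    )
    return inbox, token


def is_live(inbox: Inbox, now: float) -> bool:
    return now < inbox.expires_at


def authorize(inbox: Inbox, presented_token: str, now: float) -> bool:
    """Reads require a live inbox and a matching token (constant-time compare)."""
    presented = hashlib.sha256(presented_token.encode()).hexdigest()
    return is_live(inbox, now) and hmac.compare_digest(presented, inbox.token_sha256)


def rcpt_decision(inboxes: dict[str, Inbox], rcpt: str, now: float) -> str:
    """Acceptance policy at RCPT TO: refuse mail for unknown or expired inboxes."""
    inbox = inboxes.get(rcpt.lower())
    if inbox is None or not is_live(inbox, now):
        return "550 5.1.1 mailbox unavailable"
    return "250 2.1.5 OK"
```

Rejecting at RCPT time means no bytes are stored for dead inboxes, which helps both cost and the deletion guarantee. The trade-off is that the reply reveals whether an address is live, which the random local part is meant to make impractical to enumerate.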
Architecture, Data, and Trade-offs
Be ready to discuss:
High-level architecture, roughly: MTA → queue/stream → message processor → durable store → cache → API/UI.
Data model: inbox, message, attachment, and the indexing entities.
Consistency vs. availability trade-offs for SMTP accept, message availability, and deletion.
Scaling strategies to handle millions of inboxes and messages per day.
Cost control mechanisms: storage lifecycle, attachment limits, and rate limiting.
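A minimal sketch of the message and attachment entities, with an in-memory stand-in for the durable store, might look like the following. It assumes each message row copies its inbox's `expires_at` (per-message TTL), that listings are indexed by inbox and ordered newest first, that reads filter on expiry, and that a separate sweep physically purges rows. All field and class names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Attachment:
    attachment_id: str
    filename: str
    content_type: str
    size_bytes: int
    blob_key: str  # pointer into object storage (hypothetical layout)


@dataclass
class Message:
    message_id: str
    inbox_id: str
    received_at: float
    expires_at: float  # copied from the inbox so each row carries its own TTL
    sender: str
    subject: str
    body_blob_key: str
    attachments: list[Attachment] = field(default_factory=list)


class MessageStore:
    """In-memory stand-in for a durable store indexed by (inbox_id, received_at)."""

    def __init__(self) -> None:
        self._by_inbox: dict[str, list[Message]] = {}

    def put(self, msg: Message) -> None:
        rows = self._by_inbox.setdefault(msg.inbox_id, [])
        if any(m.message_id == msg.message_id for m in rows):
            return  # duplicate delivery: keep the first copy
        rows.append(msg)
        rows.sort(key=lambda m: m.received_at, reverse=True)  # newest first

    def list_visible(self, inbox_id: str, now: float) -> list[Message]:
        # Read-time filter: expired rows are hidden even if not yet purged.
        return [m for m in self._by_inbox.get(inbox_id, []) if now < m.expires_at]

    def sweep(self, now: float) -> int:
        """Background cleanup: physically drop expired rows; returns rows purged."""
        purged = 0
        for inbox_id in list(self._by_inbox):
            rows = self._by_inbox[inbox_id]
            keep = [m for m in rows if now < m.expires_at]
            purged += len(rows) - len(keep)
            if keep:
                self._by_inbox[inbox_id] = keep
            else:
                del self._by_inbox[inbox_id]
        return purged
```

Separating `list_visible` from `sweep` reflects one way to frame the deletion trade-off: visibility ends exactly at expiry on the read path, while physical purge can lag behind and run in batches.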
Follow-up Questions
Be ready for deeper probes after the main design:
What breaks first if inbound volume jumps 100×? Walk through the bottleneck order (ingress → stream → processors → storage → deletion).
A user complains they could still read mail seconds after the inbox expired: which layer failed, and why might native store TTL not be the one delivering your privacy guarantee?
How do you make message processing idempotent under stream redelivery, and what exactly must the message ID not depend on?
A regulator demands proof of right-to-erasure: what's your precise deletion SLA, and how do you evidence it without logging message content?
How does the design change if you must now send mail (replies) as well as receive?
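For the idempotency probe, one possible direction is to derive the message ID only from data fixed at SMTP accept time, so that a redelivered stream record maps to the same ID. The sketch below hashes the inbox ID together with the raw bytes the MTA stored; it assumes those bytes are written once at accept and never rewritten. The function names and the in-memory `seen` set are hypothetical.

```python
import hashlib


def message_id(inbox_id: str, raw_message: bytes) -> str:
    """Deterministic ID: same inbox and same bytes give the same ID on every redelivery."""
    h = hashlib.sha256()
    h.update(inbox_id.encode("utf-8"))
    h.update(b"\x00")  # separator so ("ab", b"c") and ("a", b"bc") hash differently
    h.update(raw_message)
    return h.hexdigest()


def process(seen: set[str], inbox_id: str, raw_message: bytes) -> bool:
    """Return True if the message was stored, False if it was a duplicate."""
    mid = message_id(inbox_id, raw_message)
    if mid in seen:
        return False  # redelivered record: skip without side effects
    seen.add(mid)
    return True


seen: set[str] = set()
print(process(seen, "inbox-1", b"Subject: code\r\n\r\n123456"))  # first delivery
print(process(seen, "inbox-1", b"Subject: code\r\n\r\n123456"))  # stream redelivery
```

Under this approach the ID must not depend on anything that differs between delivery attempts, such as the processor's wall-clock time, a random UUID generated during processing, or the consumer's retry count. One caveat of a pure content hash is that two genuinely identical messages sent to the same inbox collapse into one; an ingress-assigned identifier fixed at accept time would avoid that.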