Design a backend service that determines whether a given URL is malicious.
You have access to an external dependency:
- isMalicious(url) -> bool (black-box API)
This external API is:
- Rate-limited (e.g., 1k QPS max) and sometimes slow (p95 500ms) or unavailable.
- Potentially costly per call.
Your service should expose an internal API such as:
- GET /reputation?url=... → returns { verdict: "malicious"|"benign"|"unknown", checkedAt, confidence }
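As one possible reading of the response shape above, the payload could be sketched as follows. The field names come from the prompt. Everything else is an assumption made for illustration: the type name `ReputationResponse`, the helper `make_response`, the ISO-8601 UTC encoding of `checkedAt`, and the 0.0 to 1.0 range for `confidence`.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Literal, Optional

Verdict = Literal["malicious", "benign", "unknown"]


@dataclass(frozen=True)
class ReputationResponse:
    verdict: Verdict
    checkedAt: Optional[str]  # assumed: ISO-8601 UTC timestamp, None if never checked
    confidence: float         # assumed: 0.0 (no information) to 1.0 (fresh result)


def make_response(is_malicious: Optional[bool],
                  checked_at: Optional[datetime],
                  confidence: float = 1.0) -> dict:
    """Map a (possibly missing) isMalicious result onto the response body."""
    if is_malicious is None:
        # No result available (never checked, timeout, outage): report "unknown".
        return asdict(ReputationResponse("unknown", None, 0.0))
    verdict: Verdict = "malicious" if is_malicious else "benign"
    ts = checked_at.astimezone(timezone.utc).isoformat() if checked_at else None
    return asdict(ReputationResponse(verdict, ts, confidence))
```

Returning "unknown" with zero confidence gives callers an explicit signal when the external API could not be consulted, so a timeout does not have to be reported as "benign".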
Requirements
- Low latency for repeated queries.
- Must handle high traffic (e.g., 50k QPS reads) while respecting the external API limit.
- Avoid repeatedly calling isMalicious for the same URL.
- Provide reasonable behavior for timeouts/outages (no cascading failures).
- Support background re-checking because reputations can change.
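The example figures imply a tight budget: with 50k QPS of reads against a 1k QPS external limit, at most 1,000 / 50,000 = 2% of reads may reach isMalicious, so the cache hit ratio has to be at least 98%. The sketch below illustrates one way to meet the caching, deduplication, and timeout requirements in a single process. It is a sketch under stated assumptions, not a definitive design: the class `ReputationCache`, the in-memory dict, the TTL and timeout defaults, and the injected `check` coroutine (standing in for isMalicious) are all hypothetical.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Tuple


class ReputationCache:
    """In-memory TTL cache with single-flight deduplication (illustrative only)."""

    def __init__(self, check: Callable[[str], Awaitable[bool]],
                 ttl_s: float = 3600.0, timeout_s: float = 1.0):
        self._check = check            # hypothetical async wrapper around isMalicious
        self._ttl_s = ttl_s
        self._timeout_s = timeout_s
        self._cache: Dict[str, Tuple[bool, float]] = {}   # url -> (result, stored_at)
        self._inflight: Dict[str, "asyncio.Task[bool]"] = {}

    @staticmethod
    def _label(is_malicious: bool) -> str:
        return "malicious" if is_malicious else "benign"

    async def verdict(self, url: str) -> str:
        hit = self._cache.get(url)
        if hit is not None and time.monotonic() - hit[1] < self._ttl_s:
            return self._label(hit[0])             # fresh cache hit, no external call

        task = self._inflight.get(url)
        if task is None:                           # first caller starts the lookup
            task = asyncio.ensure_future(self._fetch(url))
            self._inflight[url] = task
        try:
            # shield() keeps one caller's timeout from cancelling the shared lookup.
            result = await asyncio.wait_for(asyncio.shield(task), self._timeout_s)
        except Exception:
            # Timeout or upstream error: serve a stale entry if one exists.
            return self._label(hit[0]) if hit is not None else "unknown"
        return self._label(result)

    async def _fetch(self, url: str) -> bool:
        try:
            result = await self._check(url)
            self._cache[url] = (result, time.monotonic())
            return result
        finally:
            self._inflight.pop(url, None)          # allow later refreshes
```

Concurrent callers for the same URL await the same task, so a burst of identical misses produces one external call. A multi-node deployment would need a shared cache and some form of cross-node deduplication, which this sketch does not attempt.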
Explain:
- APIs and key workflows (read path, cache miss path, async refresh)
- Data model and storage choices
- Caching strategy, TTLs, and deduplication of concurrent requests
- Rate limiting / backpressure / circuit breaker strategy
- Scaling, partitioning, and reliability
- Metrics and operational concerns
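For the rate limiting and circuit breaker item above, a minimal sketch of the two building blocks might look like the following. It assumes a token bucket guarding outbound calls to isMalicious and a consecutive-failure breaker. The class names, thresholds, and injectable clock are hypothetical choices for illustration, and the code is single-process and not thread-safe.

```python
import time
from typing import Callable, Optional


class TokenBucket:
    """Allows up to `burst` calls at once, refilled at `rate_per_s` tokens per second."""

    def __init__(self, rate_per_s: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate_per_s
        self._burst = burst
        self._tokens = burst
        self._clock = clock
        self._last = clock()

    def try_acquire(self) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self._tokens = min(self._burst, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False  # caller should queue, shed, or answer "unknown"


class CircuitBreaker:
    """Opens after N consecutive failures and allows calls again after a cooldown."""

    def __init__(self, failure_threshold: int = 5, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._threshold = failure_threshold
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True  # closed: calls flow normally
        # Open: reject until the cooldown elapses, then let probe calls through.
        return self._clock() - self._opened_at >= self._cooldown_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None  # close the breaker

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed probe
```

On a cache miss, a caller would check `breaker.allow()` and `bucket.try_acquire()` before invoking the external API and fall back to a stale or "unknown" verdict when either refuses. Enforcing the 1k QPS limit across many service instances would need a shared or partitioned budget, which this sketch leaves out.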