Design a system that lets users upload files, scans them, and produces a final report.
Core workflow
- User uploads a file.
- System runs one or more scans (e.g., malware scan, policy/DLP scan, file-type validation).
- System generates a report (structured findings + overall pass/fail + metadata).
- User can query scan status and download the report.
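One possible shape for the report, as a minimal sketch only: the class and field names (`Finding`, `Report`, `file_id`, `size_bytes`) are assumptions for illustration, and so is the rule that a file passes only when at least one engine ran and every engine passed.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Finding:
    engine: str       # hypothetical engine name, e.g. "malware" or "dlp"
    passed: bool      # this engine's verdict for the file
    detail: str = ""  # free-form explanation when something was flagged


@dataclass
class Report:
    file_id: str      # assumed metadata fields; a real report would carry more
    size_bytes: int
    findings: List[Finding] = field(default_factory=list)

    @property
    def overall_pass(self) -> bool:
        # Assumption: pass only if at least one engine reported and none failed.
        return bool(self.findings) and all(f.passed for f in self.findings)

    def to_dict(self) -> dict:
        # Structured findings + overall pass/fail + metadata in one payload.
        data = asdict(self)
        data["overall_pass"] = self.overall_pass
        return data


report = Report("file-1", 2048, [Finding("malware", True),
                                 Finding("dlp", False, "policy match")])
print(report.to_dict())
```

Treating an empty findings list as a failure is a conservative choice; a design could instead keep the report unpublished until every engine has reported.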
Requirements
- Handle large files (up to multiple GB).
- Scanning is asynchronous; user should not have to keep the connection open.
- Support multiple scan engines per file (some may be slow/fail).
- Provide status states: UPLOADING, QUEUED, SCANNING, COMPLETED, FAILED.
- Secure by default (authn/authz, encryption, least privilege).
- Scalable to high throughput (assume tens of millions of files/day).
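The five status states can be modeled as a small state machine. The sketch below is one reading of the requirements, not a prescribed design: the transition table `ALLOWED` and the helper `transition` are assumptions, including the choice that any non-terminal state may move to FAILED and that COMPLETED and FAILED are terminal.

```python
from enum import Enum


class Status(Enum):
    UPLOADING = "UPLOADING"
    QUEUED = "QUEUED"
    SCANNING = "SCANNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Assumed transition table: forward progress only, with FAILED reachable
# from every non-terminal state.
ALLOWED = {
    Status.UPLOADING: {Status.QUEUED, Status.FAILED},
    Status.QUEUED: {Status.SCANNING, Status.FAILED},
    Status.SCANNING: {Status.COMPLETED, Status.FAILED},
    Status.COMPLETED: set(),
    Status.FAILED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


status = Status.UPLOADING
for nxt in (Status.QUEUED, Status.SCANNING, Status.COMPLETED):
    status = transition(status, nxt)
print(status.name)
```

Rejecting illegal moves in one place keeps duplicate or out-of-order worker messages from, for example, moving a COMPLETED file back to SCANNING.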
Deliverables
- APIs (or UI flows) you would expose
- High-level architecture and key components
- Data model for file metadata, scan jobs, and reports
- Reliability strategy (retries, idempotency, partial failures)
- Observability and operational considerations
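For the reliability deliverable, a minimal sketch of bounded retries with an idempotency key per (file, engine) pair follows. Everything here is an illustrative assumption: the names `job_key` and `run_engines`, the in-memory `results` dict standing in for a durable store, the outcome labels, and the choice to record an exhausted engine as `ERROR` so that the other engines' results still reach the report.

```python
import hashlib
from typing import Callable, Dict


def job_key(file_sha256: str, engine: str) -> str:
    # Assumption: a scan job is uniquely identified by file content + engine.
    return hashlib.sha256(f"{file_sha256}:{engine}".encode()).hexdigest()


def run_engines(file_sha256: str,
                engines: Dict[str, Callable[[], bool]],
                results: Dict[str, str],
                max_attempts: int = 3) -> Dict[str, str]:
    """Run each engine at most max_attempts times; skip keys already recorded."""
    for name, scan in engines.items():
        key = job_key(file_sha256, name)
        if key in results:
            continue  # redelivered message: outcome already stored, do nothing
        outcome = "ERROR"  # partial failure: recorded if every attempt raises
        for _ in range(max_attempts):
            try:
                outcome = "PASS" if scan() else "FAIL"
                break
            except Exception:
                continue  # a real worker would back off before retrying
        results[key] = outcome
    return results


store: Dict[str, str] = {}
run_engines("abc123", {"malware": lambda: True, "dlp": lambda: False}, store)
print(sorted(store.values()))
```

Calling `run_engines` again with the same store leaves it unchanged, which is the property that makes at-least-once queue delivery safe in this sketch.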