Design a file upload and scanning report system
Company: CrowdStrike
Role: Software Engineer
Category: System Design
Difficulty: Medium
Interview Round: Onsite
Design a system that lets users upload files, scans them, and produces a final report.
### Core workflow
1. User uploads a file.
2. System runs one or more scans (e.g., malware scan, policy/DLP scan, file-type validation).
3. System generates a **report** (structured findings + overall pass/fail + metadata).
4. User can query scan status and download the report.
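
Step 3 can be sketched as a fold over per-engine results. This is a minimal illustration, not a prescribed design: `Verdict`, `ScanResult`, and `build_report` are hypothetical names, and the policy (any flagged result fails the file, any engine error makes the report inconclusive) is an assumption that the question leaves open.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(str, Enum):
    CLEAN = "clean"
    FLAGGED = "flagged"
    ERROR = "error"  # engine crashed or timed out


@dataclass(frozen=True)
class ScanResult:
    engine: str                      # e.g. "malware", "dlp", "file_type"
    verdict: Verdict
    findings: tuple[str, ...] = ()   # structured findings, simplified to strings


def build_report(file_id: str, results: list[ScanResult]) -> dict:
    """Fold per-engine results into one report.

    Assumed policy: any FLAGGED result fails the file; otherwise any
    ERROR (or no results at all) is inconclusive; otherwise it passes.
    """
    verdicts = {r.verdict for r in results}
    if Verdict.FLAGGED in verdicts:
        overall = "fail"
    elif Verdict.ERROR in verdicts or not results:
        overall = "inconclusive"
    else:
        overall = "pass"
    return {
        "file_id": file_id,
        "overall": overall,
        "engines": {r.engine: r.verdict.value for r in results},
        "findings": [f for r in results for f in r.findings],
    }
```

Treating an engine error as inconclusive rather than as a pass keeps a crashed scanner from silently clearing a file; a candidate could equally argue for failing closed.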
### Requirements
- Handle large files (up to multiple GB).
- Scanning is asynchronous; the user should not have to keep the connection open.
- Support multiple scan engines per file (some may be slow or may fail).
- Provide status states: `UPLOADING`, `QUEUED`, `SCANNING`, `COMPLETED`, `FAILED`.
- Secure by default (authn/authz, encryption, least privilege).
- Scalable to high throughput (assume tens of millions of files/day).
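
The status states listed above suggest a small state machine. The sketch below assumes a transition table that the question does not specify (for example, that `SCANNING` may return to `QUEUED` on retry and that `COMPLETED` and `FAILED` are terminal); `ALLOWED` and `transition` are hypothetical names.

```python
from enum import Enum


class Status(str, Enum):
    UPLOADING = "UPLOADING"
    QUEUED = "QUEUED"
    SCANNING = "SCANNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Assumed transition table; the question only names the states.
ALLOWED = {
    Status.UPLOADING: {Status.QUEUED, Status.FAILED},
    Status.QUEUED: {Status.SCANNING, Status.FAILED},
    # SCANNING -> QUEUED models a retry after a transient engine failure.
    Status.SCANNING: {Status.COMPLETED, Status.FAILED, Status.QUEUED},
    Status.COMPLETED: set(),   # terminal
    Status.FAILED: set(),      # terminal
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Validating transitions in one place makes duplicate or out-of-order worker messages (such as a late `SCANNING` update arriving after `COMPLETED`) easy to reject.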
### Deliverables
- APIs (or UI flows) you would expose
- High-level architecture and key components
- Data model for file metadata, scan jobs, and reports
- Reliability strategy (retries, idempotency, partial failures)
- Observability and operational considerations
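
As one possible starting point for the data model and reliability deliverables, the sketch below uses an in-memory store to show idempotent job creation and capped exponential backoff. All names and fields (`FileRecord`, `ScanJob`, `JobStore`, `job_id_for`, `backoff_seconds`, the per-job `state` values) are illustrative assumptions; a real design would back this with a database and a queue.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    file_id: str
    owner_id: str
    sha256: str
    size_bytes: int
    storage_key: str            # pointer into object storage, not the bytes
    status: str = "UPLOADING"


@dataclass
class ScanJob:
    job_id: str
    file_id: str
    engine: str
    attempts: int = 0
    state: str = "PENDING"      # hypothetical per-job state


def job_id_for(file_id: str, engine: str) -> str:
    """Deterministic id so a redelivered message maps to the same job."""
    return hashlib.sha256(f"{file_id}:{engine}".encode()).hexdigest()[:16]


@dataclass
class JobStore:
    jobs: dict[str, ScanJob] = field(default_factory=dict)

    def enqueue(self, file_id: str, engine: str) -> ScanJob:
        jid = job_id_for(file_id, engine)
        # setdefault makes a repeated enqueue return the existing job
        # instead of creating a duplicate.
        return self.jobs.setdefault(jid, ScanJob(jid, file_id, engine))


def backoff_seconds(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Exponential backoff (no jitter), capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))
```

For example, calling `store.enqueue("f1", "malware")` twice yields the same `ScanJob` object, so an at-least-once queue cannot cause a file to be scanned twice by the same engine. In production, jitter would normally be added to the backoff to avoid synchronized retries.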
Quick Answer: This question evaluates a candidate's ability to design scalable, secure distributed systems for asynchronous file handling, including API design, data modeling, multi-engine scanning orchestration, reliability, and observability.