
Design an S3-like object storage service

Last updated: May 13, 2026

Quick Overview

This question evaluates understanding of distributed systems, object storage architecture, REST API design, multipart upload semantics, replication and durability strategies, failure recovery, and scalability concerns.



Company: Amazon

Role: Machine Learning Engineer

Category: System Design

Difficulty: medium

Interview Round: Onsite



Dec 8, 2025, 6:37 PM

Design a cloud object storage service similar to Amazon S3. The service should allow clients to upload, store, and download large files reliably and efficiently.

Focus your design on the following aspects:

  1. API Design
    • Define high-level REST APIs for:
      • Uploading an object (e.g., PUT /buckets/{bucketId}/objects/{objectKey})
      • Downloading an object (e.g., GET /buckets/{bucketId}/objects/{objectKey})
      • Optionally listing objects in a bucket.
    • Consider authentication, basic metadata handling (e.g., size, content-type), and how clients reference objects (buckets and keys).
  2. File Splitting / Multipart Upload
    • Large files (e.g., several GBs) should be uploadable in parts.
    • Explain how you would:
      • Split files into chunks/parts on the client or server.
      • Track upload progress and handle retries for failed parts.
      • Reassemble parts into a final object.
    • Discuss trade-offs in chunk size and how to ensure consistency and integrity (e.g., checksums).
  3. Backend Storage and Replication
    • Design how the service stores object data and metadata:
      • Object data storage layer (e.g., distributed file system or key-value storage).
      • Metadata storage (e.g., mapping from bucket/key to physical locations, size, checksums, replication info).
    • Explain how you will replicate data across multiple machines and data centers to handle:
      • Machine failures.
      • Data center outages.
    • Describe strategies for:
      • Data durability (e.g., replication factor, erasure coding).
      • Consistency model (eventual vs strong) for reads after writes.
  4. Failure Handling and Disaster Recovery
    • Describe what happens if a data center goes down:
      • How does the system continue serving reads and writes?
      • How do you detect failures and route traffic to healthy regions?
    • Discuss backup, restore, and how you ensure no data loss (or minimal data loss) in catastrophic failures.
  5. Scalability and Performance
    • How would you design the system to handle:
      • Many concurrent uploads/downloads (e.g., millions of QPS)?
      • Large total storage size (e.g., petabytes or more)?
    • Explain choices like partitioning/sharding keys, load balancing, and caching.
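As one way to make the multipart requirements in item 2 concrete, the following is a minimal Python sketch, not a reference implementation. It assumes client-side splitting, a SHA-256 checksum per part and for the whole object, and idempotent part uploads keyed by part number. The names `MultipartUpload`, `upload_with_retries`, and `PART_SIZE` are hypothetical, and the part size is deliberately tiny so the example runs instantly; a real service would use parts measured in megabytes.

```python
import hashlib

PART_SIZE = 8  # bytes; an assumed, deliberately tiny value for the demo


def checksum(chunk: bytes) -> str:
    """Hex SHA-256 digest used for both per-part and whole-object checks."""
    return hashlib.sha256(chunk).hexdigest()


def split_into_parts(data: bytes, part_size: int = PART_SIZE) -> list[bytes]:
    """Client-side split into fixed-size parts (the last part may be shorter)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


class MultipartUpload:
    """Hypothetical server-side state for one in-progress multipart upload."""

    def __init__(self) -> None:
        self.parts: dict[int, bytes] = {}  # part number -> verified bytes

    def upload_part(self, number: int, chunk: bytes, expected_sha256: str) -> bool:
        # Reject a part whose bytes do not match the checksum the client sent.
        if checksum(chunk) != expected_sha256:
            return False
        # Storing by part number makes a retried upload idempotent.
        self.parts[number] = chunk
        return True

    def complete(self, part_count: int, expected_sha256: str) -> bytes:
        missing = [n for n in range(1, part_count + 1) if n not in self.parts]
        if missing:
            raise ValueError(f"missing parts: {missing}")
        # Reassemble in part-number order, then verify the whole object.
        obj = b"".join(self.parts[n] for n in range(1, part_count + 1))
        if checksum(obj) != expected_sha256:
            raise ValueError("whole-object checksum mismatch")
        return obj


def upload_with_retries(upload: MultipartUpload, data: bytes,
                        corrupt_first_attempt_of: frozenset = frozenset(),
                        max_attempts: int = 3) -> int:
    """Upload every part, retrying rejected parts; returns total attempts made.

    `corrupt_first_attempt_of` simulates in-flight corruption for the demo.
    """
    attempts = 0
    for number, chunk in enumerate(split_into_parts(data), start=1):
        for attempt in range(1, max_attempts + 1):
            attempts += 1
            sent = chunk
            if attempt == 1 and number in corrupt_first_attempt_of:
                sent = bytes([chunk[0] ^ 0xFF]) + chunk[1:]  # flip one byte
            if upload.upload_part(number, sent, checksum(chunk)):
                break
        else:
            raise RuntimeError(f"part {number} failed after {max_attempts} attempts")
    return attempts


if __name__ == "__main__":
    payload = b"hello object storage, in parts!"
    up = MultipartUpload()
    total = upload_with_retries(up, payload, corrupt_first_attempt_of=frozenset({2}))
    restored = up.complete(len(split_into_parts(payload)), checksum(payload))
    print(total, restored == payload)
```

Only the corrupted part is resent, which is the main argument for multipart uploads. The chunk-size trade-off shows up directly: smaller parts make retries cheaper but increase the number of requests and the amount of per-part state to track.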

Clearly state assumptions (e.g., target QPS, typical object sizes, durability requirements) and walk through the end-to-end flow of a typical upload and download request.
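A walk-through of the upload and download flow can be sketched as a toy in-memory model in Python. Everything here is an assumption made for illustration: a replication factor of 3, one replica per data center, hash-ranked placement in the spirit of rendezvous hashing, a majority write requirement, and the hypothetical names `StorageNode` and `ObjectStore`. A real design would also need a durable, replicated metadata store, which this sketch replaces with a plain dictionary.

```python
import hashlib

REPLICATION_FACTOR = 3  # assumed durability target for this sketch
WRITE_QUORUM = REPLICATION_FACTOR // 2 + 1  # majority of replicas


class StorageNode:
    """Hypothetical data node; `zone` stands in for a data center."""

    def __init__(self, name: str, zone: str) -> None:
        self.name = name
        self.zone = zone
        self.alive = True
        self.blobs: dict[str, bytes] = {}  # blob id -> object bytes


class ObjectStore:
    def __init__(self, nodes: list[StorageNode]) -> None:
        self.nodes = nodes
        # (bucket, key) -> blob id, size, and replica node names
        self.metadata: dict[tuple[str, str], dict] = {}

    def _place(self, bucket: str, key: str) -> list[StorageNode]:
        """Rank nodes by a per-object hash; take healthy nodes, one per zone."""
        def score(node: StorageNode) -> str:
            return hashlib.sha256(f"{bucket}/{key}@{node.name}".encode()).hexdigest()

        chosen: list[StorageNode] = []
        zones: set[str] = set()
        for node in sorted(self.nodes, key=score, reverse=True):
            if node.alive and node.zone not in zones:
                chosen.append(node)
                zones.add(node.zone)
            if len(chosen) == REPLICATION_FACTOR:
                break
        return chosen

    def put(self, bucket: str, key: str, data: bytes) -> dict:
        replicas = self._place(bucket, key)
        if len(replicas) < WRITE_QUORUM:
            raise RuntimeError("not enough healthy replicas to accept the write")
        blob_id = hashlib.sha256(data).hexdigest()
        for node in replicas:
            node.blobs[blob_id] = data
        # Metadata is written last, so a key never points at missing data.
        meta = {"blob_id": blob_id, "size": len(data),
                "replicas": [n.name for n in replicas]}
        self.metadata[(bucket, key)] = meta
        return meta

    def get(self, bucket: str, key: str) -> bytes:
        meta = self.metadata[(bucket, key)]
        for node in self.nodes:
            if node.alive and node.name in meta["replicas"]:
                data = node.blobs[meta["blob_id"]]
                # Verify integrity before returning bytes to the client.
                if hashlib.sha256(data).hexdigest() == meta["blob_id"]:
                    return data
        raise RuntimeError("no healthy replica holds this object")


if __name__ == "__main__":
    cluster = [StorageNode(f"{z}-{i}", z)
               for z in ("dc-a", "dc-b", "dc-c") for i in range(2)]
    store = ObjectStore(cluster)
    store.put("photos", "cat.jpg", b"\x89fake-image-bytes")
    for node in cluster:  # simulate a full outage of one data center
        if node.zone == "dc-a":
            node.alive = False
    print(store.get("photos", "cat.jpg") == b"\x89fake-image-bytes")
```

Because each object has at most one replica per zone, losing a whole data center leaves two copies readable, and new writes still reach a majority. The sketch omits re-replication after a failure, which a full answer would cover under failure handling.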
