PracHub

Design CSV upload endpoint with GPT classification

Last updated: Apr 22, 2026

Quick Overview

This question evaluates backend engineering skills including HTTP API design, multipart file handling, CSV parsing and serialization, local JSON persistence, and integration with an external GPT-style classification API.



Company: Scale AI

Role: Software Engineer

Category: Software Engineering Fundamentals

Difficulty: medium

Interview Round: Onsite



Dec 8, 2025, 7:32 PM

You are building a backend service that needs to process two CSV files and then call an external GPT-like API for classification.

Requirements

  1. HTTP Endpoint
    • Expose an HTTP endpoint, e.g. POST /ingest-data.
    • The client uploads two CSV files in a single request:
      • users.csv
      • tasks.csv
    • A typical row in users.csv might be: user_id,name,email
    • A typical row in tasks.csv might be: task_id,user_id,description
  2. CSV Parsing and Local JSON Storage
    • The endpoint should:
      • Receive the two CSV files.
      • Parse them into in-memory data structures (e.g., lists of objects).
      • Serialize each dataset into JSON.
      • Persist the resulting JSON to the local filesystem (e.g., users.json, tasks.json).
  3. GPT Classification Step
    • After parsing, the service should call an external GPT-like API to classify one field in the JSON data. For example:
      • For each task in tasks.json, classify the description field into one of a small set of categories (e.g., "bug", "feature", "documentation").
    • The GPT API:
      • Is accessed via HTTPS.
      • Takes a text prompt and returns a classification label in JSON.
    • You are free to design the prompt and to decide whether to call the GPT API per-record or in batches, as long as all tasks end up with a classification label.
  4. Response
    • After classification, return an HTTP response that includes at least:
      • A success indicator.
      • Basic stats (e.g., number of users, number of tasks processed).
      • Optionally, the enriched tasks data with the new classification field.
  5. Non-functional Requirements
    • Handle basic validation and error cases (missing file, malformed CSV, GPT API failure).
    • Assume multiple clients may call this endpoint concurrently.
    • The solution should be reasonably testable.
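
As one possible illustration of requirement 2 and the validation and concurrency points in requirement 5, the sketch below parses an uploaded CSV from raw bytes and writes JSON atomically into a per-request directory. It uses only the Python standard library and leaves the HTTP framework out. The names `parse_csv`, `write_json_atomic`, `persist_datasets`, and `CsvValidationError` are hypothetical, and the strict header check assumes the example column layouts above are the full schema, which the question does not guarantee.

```python
import csv
import io
import json
import os
import tempfile
import uuid

# Assumption: the example rows in the requirements are the complete headers.
USERS_COLUMNS = ["user_id", "name", "email"]
TASKS_COLUMNS = ["task_id", "user_id", "description"]


class CsvValidationError(ValueError):
    """Hypothetical error type; an HTTP handler could map it to a 400 response."""


def parse_csv(raw, expected_columns, label):
    """Decode one uploaded CSV (bytes) and return a list of row dicts."""
    if not raw:
        raise CsvValidationError(f"{label}: file is missing or empty")
    try:
        text = raw.decode("utf-8-sig")  # tolerates a UTF-8 byte-order mark
    except UnicodeDecodeError as exc:
        raise CsvValidationError(f"{label}: not valid UTF-8") from exc

    reader = csv.DictReader(io.StringIO(text, newline=""), strict=True)
    if reader.fieldnames != expected_columns:
        raise CsvValidationError(
            f"{label}: expected header {expected_columns}, got {reader.fieldnames}"
        )
    rows = []
    try:
        for row in reader:
            # DictReader puts surplus cells under the None key and fills
            # missing cells with None, so both signal a bad column count.
            if None in row or any(value is None for value in row.values()):
                raise CsvValidationError(
                    f"{label}: wrong column count near line {reader.line_num}"
                )
            rows.append(row)
    except csv.Error as exc:
        raise CsvValidationError(f"{label}: malformed CSV ({exc})") from exc
    return rows


def write_json_atomic(data, path):
    """Write JSON to a temp file, then rename, so readers never see a partial file."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, ensure_ascii=False, indent=2)
        os.replace(tmp_path, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def persist_datasets(users, tasks, base_dir):
    """Store both datasets under a unique per-request directory and return its path."""
    request_dir = os.path.join(base_dir, uuid.uuid4().hex)
    write_json_atomic(users, os.path.join(request_dir, "users.json"))
    write_json_atomic(tasks, os.path.join(request_dir, "tasks.json"))
    return request_dir
```

Writing into a directory named by a random UUID is one way to keep concurrent requests from overwriting each other's `users.json` and `tasks.json`. A shared file guarded by a lock would also work, at the cost of serializing uploads.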

Task

Describe how you would design and implement this endpoint, including:

  • The HTTP API contract (request format, response format).
  • How you handle file uploads and CSV parsing.
  • How you structure the code to write JSON to local storage.
  • How you integrate with the GPT classification API (including error handling and possible batching).
  • Considerations for concurrency, timeouts, and testing.
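
To make the classification and response points concrete, here is a minimal sketch of batched classification with retries and a fallback label, followed by a response builder. Everything about the GPT API here is an assumption, since the question only says it takes a text prompt and returns a label in JSON: the client is modeled as a plain callable (prompt in, raw JSON text out) so a test can inject a fake, and the `{"labels": [...]}` reply shape, the `category` field name, and the `unclassified` fallback are hypothetical choices. A real client would also set its own HTTPS timeout.

```python
import json
import time

ALLOWED_LABELS = ("bug", "feature", "documentation")
FALLBACK_LABEL = "unclassified"  # hypothetical marker for tasks the API could not label


def build_prompt(descriptions):
    """Build one prompt that asks for a label per description, in order."""
    numbered = "\n".join(f"{i}: {text}" for i, text in enumerate(descriptions))
    return (
        f"Classify each task description as one of {', '.join(ALLOWED_LABELS)}.\n"
        'Reply with JSON only, shaped like {"labels": ["bug", "feature"]}, '
        "one label per description, in the same order.\n" + numbered
    )


def classify_batch(client, descriptions, max_attempts=3, backoff_seconds=0.5):
    """Call the injected client with retries; fall back if every attempt fails."""
    prompt = build_prompt(descriptions)
    for attempt in range(max_attempts):
        try:
            labels = json.loads(client(prompt))["labels"]
            if isinstance(labels, list) and len(labels) == len(descriptions):
                # Replace anything outside the allowed set instead of trusting it.
                return [
                    label if label in ALLOWED_LABELS else FALLBACK_LABEL
                    for label in labels
                ]
        except Exception:
            # Broad on purpose: the client is arbitrary, so timeouts, network
            # errors, and unparseable replies are all treated as a failed attempt.
            pass
        if attempt < max_attempts - 1:
            time.sleep(backoff_seconds * 2 ** attempt)  # exponential backoff
    return [FALLBACK_LABEL] * len(descriptions)


def classify_tasks(tasks, client, batch_size=20, **retry_options):
    """Return copies of the task dicts with a 'category' field added."""
    enriched = []
    for start in range(0, len(tasks), batch_size):
        batch = tasks[start:start + batch_size]
        labels = classify_batch(
            client, [task["description"] for task in batch], **retry_options
        )
        enriched.extend(
            {**task, "category": label} for task, label in zip(batch, labels)
        )
    return enriched


def build_response(users, enriched_tasks):
    """Assemble the response body: success flag, basic stats, enriched tasks."""
    unclassified = sum(
        1 for task in enriched_tasks if task["category"] == FALLBACK_LABEL
    )
    return {
        "success": True,
        "stats": {
            "users": len(users),
            "tasks_processed": len(enriched_tasks),
            "tasks_unclassified": unclassified,
        },
        "tasks": enriched_tasks,
    }
```

Falling back to a marker label keeps the requirement that every task ends up with a label even when the API is down. Failing the whole request with a 502 instead is an equally defensible choice. Batching cuts the number of round trips, but one bad reply then costs a whole batch, so the batch size is a trade-off to discuss with the interviewer.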

