PracHub

Quick Overview

This question evaluates proficiency in HTTP client communication with token-based authentication, parsing structured responses into data models, relational database interaction including handling duplicate records, configuration management, and robust error handling and testing.

Implement web data fetch and storage tool

Company: Coreweave

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

### Problem

You are asked to implement a small program that retrieves data from a remote web service protected by token-based authentication, parses the response, and stores the parsed data into a database.

### Requirements

1. **Remote request with token-based authentication**
   - The program should send an HTTP(S) request to a given URL (e.g., provided via a config file or command-line argument).
   - The remote service uses **token-based authentication** (e.g., a bearer token).
   - The program should attach the given token to the request in the appropriate HTTP header (for example, `Authorization: Bearer <token>`).
2. **Parse response content**
   - Assume the remote service returns a structured response (e.g., JSON) containing a list of items.
   - Each item has a few fields such as an `id`, a `timestamp`, and a `message` (you can assume reasonable names and types if not specified exactly).
   - The program should parse the response body and extract these fields into in-memory objects/records.
3. **Store into a database**
   - The program should connect to a relational database (e.g., PostgreSQL or MySQL) using a connection string or config.
   - Create (or assume the existence of) a table with appropriate columns to store the parsed fields (e.g., `id`, `timestamp`, `message`).
   - Insert the parsed records into the table.
   - Handle duplicates in a sensible way (for example, avoid inserting the same `id` more than once, or perform an upsert).
4. **Error handling and robustness**
   - Handle common errors such as:
     - Network failures or timeouts when calling the remote service.
     - Non-2xx HTTP status codes.
     - Invalid or unexpected response formats.
     - Database connection or insertion errors.
   - Log errors or print meaningful messages so that a user or operator can understand what went wrong.
5. **Execution interface**
   - The program can be a command-line tool.
   - It should accept at least:
     - The remote service URL.
     - The authentication token.
     - Database connection information.

Describe how you would design and implement this program, including:

- How you would structure the code (e.g., separation between HTTP client, parser, and database layer).
- How you would manage configuration (URL, token, DB credentials).
- How you would test it (unit tests, integration tests, mocking the remote service and DB).
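
The HTTP and parsing requirements above can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not a reference implementation: the names `Item`, `build_request`, `parse_items`, and `fetch_items` are hypothetical, and the response is assumed to be a JSON object of the form `{"items": [...]}`. Keeping request construction and parsing in separate functions lets both be unit-tested without a network.

```python
import json
import urllib.error
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # Field names follow the problem statement; the types are an assumption.
    id: int
    timestamp: int
    message: str


def build_request(url: str, token: str) -> urllib.request.Request:
    # Attach the token as a bearer credential in the Authorization header.
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


def parse_items(raw: bytes) -> list:
    # json.loads raises a ValueError subclass on malformed JSON.
    payload = json.loads(raw)
    if not isinstance(payload, dict) or not isinstance(payload.get("items"), list):
        raise ValueError("expected a JSON object with an 'items' list")
    items = []
    for entry in payload["items"]:
        # Skip entries that do not match the assumed schema.
        if (
            isinstance(entry, dict)
            and type(entry.get("id")) is int
            and type(entry.get("timestamp")) is int
            and isinstance(entry.get("message"), str)
        ):
            items.append(Item(entry["id"], entry["timestamp"], entry["message"]))
    return items


def fetch_items(url: str, token: str, timeout: float = 10.0) -> list:
    try:
        with urllib.request.urlopen(build_request(url, token), timeout=timeout) as resp:
            return parse_items(resp.read())
    except urllib.error.HTTPError as exc:  # raised by urlopen for 4xx/5xx responses
        raise RuntimeError(f"remote service returned HTTP {exc.code}") from exc
    except OSError as exc:  # URLError, refused connections, socket timeouts
        raise RuntimeError(f"network error: {exc}") from exc
```

In a test, `build_request` can be checked for the header it sets and `parse_items` can be fed canned byte strings, so only `fetch_items` needs a mocked or local server.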

In this simplified coding version of the web fetch-and-store tool, HTTP calls and database operations are simulated with Python data structures. You are given the token supplied by the caller, the token the remote service expects, a list of fetch attempts, and a list of records already stored in the database, each a tuple (id, timestamp, message). If the two tokens differ, return an 'auth_error' result with errors set to 1 and leave the database unchanged.

Each fetch attempt is a dictionary with keys 'status' and 'body'. A successful response must have a 2xx status code, a dictionary body, and an 'items' field containing a list. Process attempts in order, counting every failed attempt as an error, and stop at the first successful, well-formed response; if there is none, return a 'fetch_error' result. Otherwise, parse the items in order and upsert them into the database. Each valid item must be a dictionary containing an integer 'id', an integer 'timestamp', and a string 'message'. Use 'id' as the primary key. An item whose id is not yet stored is inserted. If an incoming item's timestamp is newer than or equal to that of the stored record, overwrite the stored record (counted as updated); otherwise ignore it. Invalid items are also counted as ignored. Return a summary dictionary with the keys 'result', 'records', 'inserted', 'updated', 'ignored', and 'errors', where 'records' holds the final stored records sorted by id.
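
The "first successful, well-formed response" rule can be isolated in a small helper. The sketch below is illustrative only; the name `first_usable_items` is hypothetical and not part of the required interface. It returns the chosen item list together with the number of failed attempts seen before it.

```python
def first_usable_items(attempts):
    """Return (items, errors): the first well-formed item list, or None,
    plus the number of attempts rejected before it."""
    errors = 0
    for attempt in attempts:
        is_dict = isinstance(attempt, dict)
        status = attempt.get("status") if is_dict else None
        body = attempt.get("body") if is_dict else None
        ok = (
            type(status) is int  # rejects None and bool (bool subclasses int)
            and 200 <= status <= 299
            and isinstance(body, dict)
            and isinstance(body.get("items"), list)
        )
        if ok:
            return body["items"], errors
        errors += 1
    return None, errors


# A network failure, a 403, and a list body are all rejected.
print(first_usable_items([
    {"status": None, "body": None},
    {"status": 403, "body": {}},
    {"status": 200, "body": ["bad"]},
]))  # (None, 3)
```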

Constraints

  • 0 <= len(attempts) <= 10^5
  • 0 <= len(existing_records) <= 10^5
  • The total number of items inside the chosen response is at most 10^5
  • IDs and timestamps are integers in the range [-10^9, 10^9]
  • Process fetch attempts in the given order and stop at the first valid 2xx response with a body of the form {'items': [...]}

Examples

Input: ('secret', 'secret', [{'status': 500, 'body': {'items': []}}, {'status': 200, 'body': {'items': [{'id': 1, 'timestamp': 120, 'message': 'new'}, {'id': 2, 'timestamp': 90, 'message': 'hello'}, {'id': 2, 'timestamp': 80, 'message': 'stale'}, {'id': 'x', 'timestamp': 5, 'message': 'bad'}, {'id': 3, 'timestamp': 50, 'message': 'same'}]}}], [(1, 100, 'old'), (3, 50, 'keep')])

Expected Output: {'result': 'ok', 'records': [(1, 120, 'new'), (2, 90, 'hello'), (3, 50, 'same')], 'inserted': 1, 'updated': 2, 'ignored': 2, 'errors': 1}

Explanation: The first attempt fails with HTTP 500, so errors becomes 1. The second attempt is valid. Record 1 is updated to timestamp 120, record 2 is inserted, the stale duplicate for record 2 is ignored, the malformed item with id='x' is ignored, and record 3 is overwritten because equal timestamps are allowed to replace the stored message.

Input: ('wrong', 'secret', [{'status': 200, 'body': {'items': [{'id': 2, 'timestamp': 1, 'message': 'ignored'}]}}], [(1, 10, 'a')])

Expected Output: {'result': 'auth_error', 'records': [(1, 10, 'a')], 'inserted': 0, 'updated': 0, 'ignored': 0, 'errors': 1}

Explanation: Authentication fails before any fetch attempt is processed, so the database remains unchanged.

Input: ('secret', 'secret', [{'status': None, 'body': None}, {'status': 403, 'body': {}}, {'status': 200, 'body': ['bad']}], [])

Expected Output: {'result': 'fetch_error', 'records': [], 'inserted': 0, 'updated': 0, 'ignored': 0, 'errors': 3}

Explanation: The first attempt simulates a network failure, the second is a non-2xx response, and the third has an invalid body format. No usable response is found.

Input: ('token', 'token', [{'status': 200, 'body': {'items': []}}], [(2, 5, 'x')])

Expected Output: {'result': 'ok', 'records': [(2, 5, 'x')], 'inserted': 0, 'updated': 0, 'ignored': 0, 'errors': 0}

Explanation: A valid response is found immediately, but it contains no items, so the stored data stays the same.

Input: ('abc', 'abc', [], [(5, 7, 'saved')])

Expected Output: {'result': 'fetch_error', 'records': [(5, 7, 'saved')], 'inserted': 0, 'updated': 0, 'ignored': 0, 'errors': 0}

Explanation: There are no fetch attempts at all, so no valid response can be chosen and the database remains unchanged.

Solution

def solution(provided_token, expected_token, attempts, existing_records):
    def snapshot(db):
        return sorted((record_id, ts, message) for record_id, (ts, message) in db.items())

    db = {}
    for record_id, ts, message in existing_records:
        db[record_id] = (ts, message)

    if provided_token != expected_token:
        return {
            'result': 'auth_error',
            'records': snapshot(db),
            'inserted': 0,
            'updated': 0,
            'ignored': 0,
            'errors': 1
        }

    errors = 0
    items = None

    for attempt in attempts:
        if not isinstance(attempt, dict):
            errors += 1
            continue

        status = attempt.get('status')
        body = attempt.get('body')

        # type(...) is int, not isinstance, so None and booleans are rejected as status codes.
        if type(status) is not int or not (200 <= status <= 299):
            errors += 1
            continue

        if not isinstance(body, dict):
            errors += 1
            continue

        body_items = body.get('items')
        if not isinstance(body_items, list):
            errors += 1
            continue

        items = body_items
        break

    if items is None:
        return {
            'result': 'fetch_error',
            'records': snapshot(db),
            'inserted': 0,
            'updated': 0,
            'ignored': 0,
            'errors': errors
        }

    inserted = 0
    updated = 0
    ignored = 0

    for item in items:
        if not isinstance(item, dict):
            ignored += 1
            continue

        record_id = item.get('id')
        ts = item.get('timestamp')
        message = item.get('message')

        # Exact int check: bool is a subclass of int and must not pass as an id or timestamp.
        if type(record_id) is not int or type(ts) is not int or not isinstance(message, str):
            ignored += 1
            continue

        if record_id not in db:
            db[record_id] = (ts, message)
            inserted += 1
        else:
            current_ts, _ = db[record_id]
            # Equal timestamps also overwrite, as the problem statement requires.
            if ts >= current_ts:
                db[record_id] = (ts, message)
                updated += 1
            else:
                ignored += 1

    return {
        'result': 'ok',
        'records': snapshot(db),
        'inserted': inserted,
        'updated': updated,
        'ignored': ignored,
        'errors': errors
    }

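Time complexity: O(A + (E + K) log(E + K)), where E is the number of existing records, A is the number of fetch attempts scanned until a valid one is found (or all attempts if none is valid), and K is the number of items in the chosen response. Loading, scanning, and upserting take O(E + A + K) expected time with a hash map; sorting the final records by id adds the logarithmic factor. Space complexity: O(E + K), since the in-memory database map can hold every existing record plus every newly inserted item.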

Hints

  1. A hash map keyed by record id makes database upserts efficient.
  2. Separate the problem into three phases: authenticate, find the first usable response, then parse and apply item updates.
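
The hash-map upsert from hint 1 has a relational analogue that may be worth mentioning when discussing the original design question. The sketch below uses SQLite from the Python standard library as a stand-in for PostgreSQL or MySQL; the table name `records` and the helper `upsert_items` are assumptions made for illustration. It relies on SQLite's `ON CONFLICT ... DO UPDATE ... WHERE` clause (available since SQLite 3.24); PostgreSQL accepts a very similar clause, while MySQL would need its own `ON DUPLICATE KEY UPDATE` form.

```python
import sqlite3


def upsert_items(conn, items):
    """Insert new ids; overwrite an existing row only when the incoming
    timestamp is greater than or equal to the stored one."""
    conn.executemany(
        """
        INSERT INTO records (id, timestamp, message) VALUES (?, ?, ?)
        ON CONFLICT(id) DO UPDATE SET
            timestamp = excluded.timestamp,
            message = excluded.message
        WHERE excluded.timestamp >= records.timestamp
        """,
        items,
    )
    conn.commit()


# Replay the valid items of the first example against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    "id INTEGER PRIMARY KEY, timestamp INTEGER NOT NULL, message TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)", [(1, 100, "old"), (3, 50, "keep")]
)
upsert_items(conn, [(1, 120, "new"), (2, 90, "hello"), (2, 80, "stale"), (3, 50, "same")])
rows = conn.execute("SELECT id, timestamp, message FROM records ORDER BY id").fetchall()
print(rows)  # [(1, 120, 'new'), (2, 90, 'hello'), (3, 50, 'same')]
```

Pushing the timestamp comparison into the statement keeps the rule atomic per row, so a separate read before each write is not needed.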
Last updated: May 6, 2026

Related Coding Questions

  • Query Machines and Mark Them Offline - Coreweave (medium)
