What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Onsite rounds at Datadog.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Datadog during technical interviews.

Implement a Snowflake Query Client | Datadog Coding Question

Company: Datadog

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

Build a lightweight query client for a Snowflake-like data warehouse. Implement a `QueryClient` that can:

1. Connect to the warehouse using configuration provided at runtime.
2. Start an asynchronous SQL query with a method such as `startQuery(sql, params)`, returning a unique `queryId`.
3. Check query progress with `getQueryStatus(queryId)`, returning states such as `QUEUED`, `RUNNING`, `SUCCEEDED`, `FAILED`, or `CANCELED`.
4. Retrieve query results after successful completion.
5. Handle common errors such as invalid SQL, missing credentials, network failures, timeouts, and unknown query IDs.

You may assume a mock warehouse SDK or HTTP API is provided; focus on designing and implementing the client wrapper cleanly.

Follow-up questions:

- How would you manage secrets such as database credentials securely?
- How would you extend the client to support batch query submission and batch status polling?
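One possible shape for the wrapper is sketched below in Python. It is a minimal illustration, not a reference solution: the `FakeWarehouse` class and its `submit`, `poll`, and `rows` methods are hypothetical stand-ins for the mock SDK, the `token` config key and the exception names are assumptions, and the method names are the snake_case equivalents of `startQuery` and `getQueryStatus`.

```python
import itertools
from enum import Enum


class QueryStatus(Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELED = "CANCELED"


class QueryClientError(Exception):
    """Base class for every error raised by the client (assumed hierarchy)."""


class MissingCredentialsError(QueryClientError):
    """Raised when the runtime config lacks a required credential."""


class UnknownQueryError(QueryClientError):
    """Raised when a query ID was never issued by this client."""


class ResultNotReadyError(QueryClientError):
    """Raised when results are requested before SUCCEEDED was observed."""


class FakeWarehouse:
    """Hypothetical mock SDK: every query walks the same fixed list of states."""

    def __init__(self, states, rows):
        self._states = states  # assumed non-empty
        self._rows = rows
        self._cursor = {}  # handle -> index of the next state to report

    def submit(self, sql, params):
        handle = len(self._cursor) + 1
        self._cursor[handle] = 0
        return handle

    def poll(self, handle):
        i = self._cursor[handle]
        # Stay on the last state once the list is exhausted.
        self._cursor[handle] = min(i + 1, len(self._states) - 1)
        return self._states[i]

    def rows(self, handle):
        return self._rows


class QueryClient:
    def __init__(self, config, warehouse):
        # Assumption: a non-empty "token" is the only required credential.
        if not config.get("token"):
            raise MissingCredentialsError("token is required")
        self._warehouse = warehouse
        self._handles = {}       # query ID -> warehouse handle
        self._last_status = {}   # query ID -> last status seen by the caller
        self._ids = itertools.count(1)

    def start_query(self, sql, params=None):
        query_id = f"q{next(self._ids)}"
        self._handles[query_id] = self._warehouse.submit(sql, params or {})
        return query_id

    def get_query_status(self, query_id):
        status = QueryStatus(self._warehouse.poll(self._handle(query_id)))
        self._last_status[query_id] = status
        return status

    def get_results(self, query_id):
        handle = self._handle(query_id)
        if self._last_status.get(query_id) is not QueryStatus.SUCCEEDED:
            raise ResultNotReadyError(query_id)
        return self._warehouse.rows(handle)

    def _handle(self, query_id):
        try:
            return self._handles[query_id]
        except KeyError:
            raise UnknownQueryError(query_id) from None
```

Keeping the warehouse behind a small injected object lets the same client run against a fake in tests and a real SDK or HTTP adapter elsewhere. A single exception base class lets callers catch every client failure in one place. Network failures and timeouts are left out of this sketch.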

Quick Answer: This question evaluates API client design, asynchronous query lifecycle management, result retrieval, robust error handling, and secure credential handling for a cloud data warehouse.

Part 1: Simulate an Asynchronous QueryClient

Simulate a lightweight asynchronous query client over a mock warehouse. The client should validate runtime credentials, start queries, poll query status, fetch results, and surface common errors such as invalid SQL, missing credentials, network failures, timeouts, and unknown query IDs. Query IDs must be assigned as q1, q2, q3, ... in start order. For a valid SQL statement, its mock warehouse spec contains a timeline of states returned on successive successful status polls.

Constraints

0 <= len(operations) <= 10^4
1 <= timeout_polls <= 10^6
Each valid SQL in mock_warehouse has a non-empty timeline
network_fail_on uses 1-based indexing over status/result attempts for that query
Credential values are considered missing if the required key is absent or its value is empty

Examples

Input: {'credentials': {'token': 'abc'}, 'warehouse': {'SELECT 1': {'timeline': ['QUEUED', 'RUNNING', 'SUCCEEDED'], 'result': [1]}}, 'actions': [('start', 'SELECT 1'), ('status', 'q1'), ('status', 'q1'), ('status', 'q1'), ('fetch', 'q1')]}

Expected Output: ['q1', 'QUEUED', 'RUNNING', 'SUCCEEDED', [1]]

Explanation: A valid query is started as q1. Three successful status polls walk through the timeline, then fetch returns the stored result.

Input: {'credentials': {}, 'warehouse': {'SELECT 1': {'timeline': ['SUCCEEDED'], 'result': [1]}}, 'actions': [('start', 'SELECT 1'), ('status', 'q1')]}

Expected Output: ['ERROR:MISSING_CREDENTIALS', 'ERROR:MISSING_CREDENTIALS']

Explanation: Credentials are checked before every action, so both actions fail immediately.

Hints

Store per-query state such as current timeline index, number of successful status polls, terminal state, and error reason.
A network failure should not advance the query timeline or poll count.
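The hints above can be sketched as a single pass over the actions. This is a minimal sketch under several assumptions that the statement does not pin down: `token` is the only required credential, an invalid SQL string does not consume a query ID, `network_fail_on` is an optional list inside each warehouse entry, and every error string other than `ERROR:MISSING_CREDENTIALS` is invented for illustration. Timeout handling is omitted because its exact semantics are not given here.

```python
def simulate(spec):
    """Run the actions in spec and return one output per action."""
    creds = spec.get("credentials", {})
    warehouse = spec.get("warehouse", {})
    queries = {}  # query ID -> mutable per-query state
    out = []

    for action, arg in spec.get("actions", []):
        # Credentials are validated before every action.
        if not creds.get("token"):
            out.append("ERROR:MISSING_CREDENTIALS")
            continue

        if action == "start":
            if arg not in warehouse:
                out.append("ERROR:INVALID_SQL")  # assumed error string
                continue
            qid = f"q{len(queries) + 1}"
            queries[qid] = {"sql": arg, "polls": 0, "attempts": 0, "state": None}
            out.append(qid)
            continue

        # Remaining actions ("status" and "fetch") take a query ID.
        q = queries.get(arg)
        if q is None:
            out.append("ERROR:UNKNOWN_QUERY_ID")  # assumed error string
            continue

        mock = warehouse[q["sql"]]
        q["attempts"] += 1  # 1-based count of status/fetch attempts
        if q["attempts"] in mock.get("network_fail_on", ()):
            # A network failure must not advance the timeline or poll count.
            out.append("ERROR:NETWORK_FAILURE")  # assumed error string
            continue

        if action == "status":
            timeline = mock["timeline"]
            # Once the timeline is exhausted, keep reporting its last state.
            q["state"] = timeline[min(q["polls"], len(timeline) - 1)]
            q["polls"] += 1
            out.append(q["state"])
        elif action == "fetch":
            ready = q["state"] == "SUCCEEDED"
            out.append(mock["result"] if ready else "ERROR:NOT_READY")

    return out
```

Under these assumptions, the first example above yields `['q1', 'QUEUED', 'RUNNING', 'SUCCEEDED', [1]]` and the second yields two `ERROR:MISSING_CREDENTIALS` entries.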

Part 2: Resolve and Redact Credentials Securely

Build a secure credential resolver for a warehouse client. Each config value can be a literal, an environment-variable reference, or a secret-store reference. Resolve the final config, reject insecure plaintext sensitive values when policy forbids them, detect missing or expired secrets, and produce a redacted copy safe for logs.

Constraints

0 <= len(config) <= 10^4
0 <= len(required_keys) <= 10^4
References only use the prefixes 'env:' and 'secret:' (the examples below write them in uppercase, as 'ENV:' and 'SECRET:')
If any error exists, resolved must be None
safe_log should include all keys from config, and also any missing required keys

Examples

Input: ({'config': {'account': 'acme', 'user': 'alice', 'password': 'SECRET:warehouse_pw'}, 'env': {}, 'secrets': {'warehouse_pw': {'value': 'snowpw', 'expired': False}}, 'sensitive_keys': ['password', 'token'], 'allow_plaintext_sensitive': False},)

Expected Output: {'resolved': {'account': 'acme', 'user': 'alice', 'password': 'snowpw'}, 'safe_log': {'account': 'acme', 'user': 'alice', 'password': '***'}, 'errors': []}

Explanation: account and user are non-sensitive literals, so they are accepted as-is. password resolves from the secret store and is redacted in safe_log.

Input: ({'config': {'account': 'ENV:WH_ACCOUNT', 'user': 'ENV:WH_USER', 'password': 'plainpw', 'token': 'ENV:API_TOKEN'}, 'env': {'WH_ACCOUNT': 'acme', 'WH_USER': 'bob', 'API_TOKEN': 'tkn'}, 'secrets': {}, 'sensitive_keys': ['password', 'token'], 'allow_plaintext_sensitive': False},)

Expected Output: {'resolved': None, 'safe_log': {'account': 'acme', 'user': 'bob', 'password': '***', 'token': '***'}, 'errors': ['insecure plaintext for sensitive key: password']}

Explanation: password is a sensitive literal and plaintext is forbidden, so it is rejected. Because an error exists, resolved is None. safe_log still lists every config key, with both sensitive keys redacted. token comes from env, so it is allowed and only redacted for logs.

Hints

Write a small helper to detect whether a key is sensitive based on its name.
You can resolve values, build safe_log, and collect errors in a single pass over config, then separately check for missing required keys.
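The single-pass approach from the hints might look like the sketch below. It follows the constraints as written: `resolved` is `None` whenever any error exists, and `safe_log` covers every config key plus any missing required key. Several details are assumptions made for illustration: prefixes are matched case-insensitively, config values are strings, `required_keys` is an optional input field, the placeholders `<unresolved>` and `<missing>` are invented, and so is every error message except the plaintext one shown in the examples.

```python
REDACTED = "***"
UNRESOLVED = "<unresolved>"  # assumed placeholder for a failed non-sensitive lookup
MISSING = "<missing>"        # assumed placeholder for an absent required key


def _lookup(key, raw, env, secrets, is_sensitive, allow_plain):
    """Return (value, error); error is None when resolution succeeds."""
    prefix, _, name = raw.partition(":")
    kind = prefix.lower()  # assumption: 'ENV:' and 'env:' are equivalent
    if name and kind == "env":
        if name not in env:
            return None, f"missing env var: {name}"
        return env[name], None
    if name and kind == "secret":
        entry = secrets.get(name)
        if entry is None:
            return None, f"missing secret: {name}"
        if entry.get("expired"):
            return None, f"expired secret: {name}"
        return entry["value"], None
    # Anything else is a literal value.
    if is_sensitive and not allow_plain:
        return None, f"insecure plaintext for sensitive key: {key}"
    return raw, None


def resolve_config(spec):
    config = spec.get("config", {})
    env = spec.get("env", {})
    secrets = spec.get("secrets", {})
    sensitive = set(spec.get("sensitive_keys", []))
    allow_plain = spec.get("allow_plaintext_sensitive", False)

    resolved, safe_log, errors = {}, {}, []

    # Single pass: resolve, redact, and collect errors together.
    for key, raw in config.items():
        is_sensitive = key in sensitive
        value, error = _lookup(key, raw, env, secrets, is_sensitive, allow_plain)
        if error:
            errors.append(error)
            safe_log[key] = REDACTED if is_sensitive else UNRESOLVED
        else:
            resolved[key] = value
            safe_log[key] = REDACTED if is_sensitive else value

    # Separate pass for required keys that never appeared in config.
    for key in spec.get("required_keys", []):
        if key not in config:
            errors.append(f"missing required key: {key}")
            safe_log[key] = MISSING

    return {
        "resolved": None if errors else resolved,
        "safe_log": safe_log,
        "errors": errors,
    }
```

Sensitive keys are redacted in `safe_log` whether or not they resolved, so a rejected plaintext password never reaches the logs.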

Part 3: Batch Query Submission and Batch Status Polling

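Simulate an extension of the query client that supports batch query submission and batch status polling. Each submit operation creates a new batch and assigns query IDs. Each poll operation advances only the referenced batch by one cycle, respecting a max_concurrent limit on running queries within that batch. Queued queries start in their original order. As the examples show, a poll first advances the queries that are already running and then starts queued ones, so a query with steps == k finishes on the k-th poll after the poll that started it. A query with steps == 0 finishes immediately when it gets a chance to start and does not hold a concurrency slot.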

Constraints

1 <= max_concurrent <= 10^5
Total number of submitted queries across all batches <= 10^5
0 <= steps <= 10^6
Polling advances only the batch named in that operation
Batch IDs are b1, b2, ... and query IDs are q1, q2, ...

Examples

Input: (1, [('submit', [{'steps': 0, 'finalStatus': 'FAILED'}, {'steps': 1, 'finalStatus': 'SUCCEEDED'}, {'steps': 0, 'finalStatus': 'SUCCEEDED'}]), ('poll', 'b1'), ('poll', 'b1'), ('poll', 'b1')])

Expected Output: [{'batchId': 'b1', 'queryIds': ['q1', 'q2', 'q3']}, {'QUEUED': 1, 'RUNNING': 1, 'SUCCEEDED': 0, 'FAILED': 1, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 2, 'FAILED': 1, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 2, 'FAILED': 1, 'CANCELED': 0}]

Explanation: First poll: q1 fails immediately, q2 starts running, q3 remains queued because max_concurrent is 1. Second poll: q2 finishes successfully, then q3 starts and succeeds immediately. Third poll changes nothing.

Input: (2, [('submit', [{'steps': 2, 'finalStatus': 'SUCCEEDED'}, {'steps': 0, 'finalStatus': 'FAILED'}, {'steps': 1, 'finalStatus': 'CANCELED'}]), ('submit', [{'steps': 1, 'finalStatus': 'SUCCEEDED'}]), ('poll', 'b1'), ('poll', 'b2'), ('poll', 'b1'), ('poll', 'b2'), ('poll', 'b1')])

Expected Output: [{'batchId': 'b1', 'queryIds': ['q1', 'q2', 'q3']}, {'batchId': 'b2', 'queryIds': ['q4']}, {'QUEUED': 0, 'RUNNING': 2, 'SUCCEEDED': 0, 'FAILED': 1, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 1, 'SUCCEEDED': 0, 'FAILED': 0, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 1, 'SUCCEEDED': 0, 'FAILED': 1, 'CANCELED': 1}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 1, 'FAILED': 0, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 1, 'FAILED': 1, 'CANCELED': 1}]

Explanation: Each batch advances only when that specific batch is polled. Batch b1 and batch b2 do not affect each other.

Hints

For each batch, track the next queued query index, the set of currently running queries, and a count of each state.
If you update counts incrementally, each poll can avoid recounting the entire batch.
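The bookkeeping described in the hints can be sketched as follows. This is one possible implementation, inferred from the two examples rather than from an official solution: each poll first advances the running queries and then fills free slots from the queue, and a `steps == 0` query finishes as soon as it is started. The function and field names are hypothetical, and polling an unknown batch ID is not handled.

```python
from collections import deque

STATES = ("QUEUED", "RUNNING", "SUCCEEDED", "FAILED", "CANCELED")


def simulate_batches(max_concurrent, operations):
    batches = {}    # batch ID -> per-batch state
    next_query = 1  # query IDs are global across batches
    out = []

    for op, arg in operations:
        if op == "submit":
            batch_id = f"b{len(batches) + 1}"
            ids = [f"q{next_query + i}" for i in range(len(arg))]
            next_query += len(arg)
            counts = dict.fromkeys(STATES, 0)
            counts["QUEUED"] = len(arg)
            batches[batch_id] = {
                "queued": deque(arg),  # query specs not yet started, in order
                "running": [],         # (remaining_steps, final_status) pairs
                "counts": counts,      # incrementally maintained state counts
            }
            out.append({"batchId": batch_id, "queryIds": ids})
        else:  # "poll"
            batch = batches[arg]
            _advance(batch, max_concurrent)
            out.append(dict(batch["counts"]))  # snapshot, not a live reference
    return out


def _advance(batch, max_concurrent):
    """Advance one batch by a single poll cycle."""
    counts = batch["counts"]

    # Step 1: every running query consumes one step; finished ones leave.
    still_running = []
    for remaining, final in batch["running"]:
        remaining -= 1
        if remaining == 0:
            counts["RUNNING"] -= 1
            counts[final] += 1
        else:
            still_running.append((remaining, final))
    batch["running"] = still_running

    # Step 2: start queued queries in order while a slot is free.
    queued = batch["queued"]
    while queued and len(still_running) < max_concurrent:
        spec = queued.popleft()
        counts["QUEUED"] -= 1
        if spec["steps"] == 0:
            # Finishes immediately and never occupies a slot.
            counts[spec["finalStatus"]] += 1
        else:
            counts["RUNNING"] += 1
            still_running.append((spec["steps"], spec["finalStatus"]))
```

Because the counts are updated as queries change state, each poll costs time proportional to the number of running queries plus the number of queries started in that cycle, not to the size of the whole batch.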
