PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates API client design, asynchronous query lifecycle management, result retrieval, robust error handling, and secure credential handling for a cloud data warehouse.

  • medium
  • Datadog
  • Coding & Algorithms
  • Software Engineer

Implement a Snowflake Query Client

Company: Datadog

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

Build a lightweight query client for a Snowflake-like data warehouse. Implement a `QueryClient` that can: 1. Connect to the warehouse using configuration provided at runtime. 2. Start an asynchronous SQL query with a method such as `startQuery(sql, params)`, returning a unique `queryId`. 3. Check query progress with `getQueryStatus(queryId)`, returning states such as `QUEUED`, `RUNNING`, `SUCCEEDED`, `FAILED`, or `CANCELED`. 4. Retrieve query results after successful completion. 5. Handle common errors such as invalid SQL, missing credentials, network failures, timeout, and unknown query IDs. You may assume a mock warehouse SDK or HTTP API is provided; focus on designing and implementing the client wrapper cleanly. Follow-up questions: - How would you manage secrets such as database credentials securely? - How would you extend the client to support batch query submission and batch status polling?

Quick Answer: This question evaluates API client design, asynchronous query lifecycle management, result retrieval, robust error handling, and secure credential handling for a cloud data warehouse.

Part 1: Simulate an Asynchronous QueryClient

Simulate a lightweight asynchronous query client over a mock warehouse. The client should validate runtime credentials, start queries, poll query status, fetch results, and surface common errors such as invalid SQL, missing credentials, network failures, timeouts, and unknown query IDs. Query IDs must be assigned as q1, q2, q3, ... in start order. For a valid SQL statement, its mock warehouse spec contains a timeline of states returned on successive successful status polls.

Constraints

  • 0 <= len(operations) <= 10^4
  • 1 <= timeout_polls <= 10^6
  • Each valid SQL in mock_warehouse has a non-empty timeline
  • network_fail_on uses 1-based indexing over status/result attempts for that query
  • Credential values are considered missing if the required key is absent or its value is empty

Examples

Input: {'credentials': {'token': 'abc'}, 'warehouse': {'SELECT 1': {'timeline': ['QUEUED', 'RUNNING', 'SUCCEEDED'], 'result': [1]}}, 'actions': [('start', 'SELECT 1'), ('status', 'q1'), ('status', 'q1'), ('status', 'q1'), ('fetch', 'q1')]}

Expected Output: ['q1', 'QUEUED', 'RUNNING', 'SUCCEEDED', [1]]

Explanation: A valid query is started as q1. Three successful status polls walk through the timeline, then fetch returns the stored result.

Input: {'credentials': {}, 'warehouse': {'SELECT 1': {'timeline': ['SUCCEEDED'], 'result': [1]}}, 'actions': [('start', 'SELECT 1'), ('status', 'q1')]}

Expected Output: ['ERROR:MISSING_CREDENTIALS', 'ERROR:MISSING_CREDENTIALS']

Explanation: Credentials are checked before every action, so both actions fail immediately.

Input: {'credentials': {'token': 'abc'}, 'warehouse': {'SELECT 1': {'timeline': ['SUCCEEDED'], 'result': [1]}}, 'actions': [('start', 'BAD SQL'), ('status', 'q1'), ('fetch', 'q9')]}

Expected Output: ['ERROR:INVALID_SQL', 'ERROR:UNKNOWN_QUERY_ID', 'ERROR:UNKNOWN_QUERY_ID']

Explanation: The SQL is not in the warehouse, so no query is created. Later references to q1 and q9 are unknown.

Input: {'credentials': {'token': 'abc'}, 'warehouse': {'SELECT 7': {'timeline': ['QUEUED', 'RUNNING', 'SUCCEEDED'], 'result': [7]}}, 'network_failures': [2, 4], 'actions': [('start', 'SELECT 7'), ('status', 'q1'), ('status', 'q1'), ('status', 'q1'), ('status', 'q1'), ('status', 'q1'), ('fetch', 'q1')]}

Expected Output: ['q1', 'ERROR:NETWORK', 'QUEUED', 'ERROR:NETWORK', 'RUNNING', 'SUCCEEDED', [7]]

Explanation: Network failures do not advance the timeline. Only the successful status polls return QUEUED, RUNNING, and SUCCEEDED.

Input: {'credentials': {'token': 'abc'}, 'warehouse': {'SELECT slow': {'timeline': ['QUEUED', 'RUNNING', 'RUNNING', 'SUCCEEDED'], 'result': [42], 'timeout_after': 2}}, 'actions': [('start', 'SELECT slow'), ('status', 'q1'), ('status', 'q1'), ('status', 'q1'), ('fetch', 'q1')]}

Expected Output: ['q1', 'QUEUED', 'RUNNING', 'ERROR:TIMEOUT', 'ERROR:TIMEOUT']

Explanation: After 2 successful polls without success, the next successful status call times out the query permanently.

Input: {'credentials': {'token': 'abc'}, 'warehouse': {'A': {'timeline': ['SUCCEEDED'], 'result': [1]}, 'B': {'timeline': ['QUEUED', 'SUCCEEDED'], 'result': [2]}}, 'actions': [('start', 'A'), ('start', 'B'), ('status', 'q2'), ('status', 'q1'), ('fetch', 'q1'), ('status', 'q2'), ('fetch', 'q2')]}

Expected Output: ['q1', 'q2', 'QUEUED', 'SUCCEEDED', [1], 'SUCCEEDED', [2]]

Explanation: Query IDs follow successful start order, and each query tracks its own independent timeline.

Input: {'credentials': {'token': 'abc'}, 'warehouse': {'SELECT 5': {'timeline': ['QUEUED', 'SUCCEEDED'], 'result': [5]}}, 'actions': [('start', 'SELECT 5'), ('fetch', 'q1'), ('status', 'q1'), ('fetch', 'q1'), ('status', 'q1'), ('fetch', 'q1')]}

Expected Output: ['q1', 'ERROR:NOT_READY', 'QUEUED', 'ERROR:NOT_READY', 'SUCCEEDED', [5]]

Explanation: Fetching before the query succeeds returns NOT_READY. After SUCCEEDED is observed, fetch returns the result.

Input: {'credentials': {'token': 'abc'}, 'warehouse': {'SELECT 1': {'timeline': ['SUCCEEDED'], 'result': [1]}}, 'actions': []}

Expected Output: []

Explanation: With no actions, the output is an empty list.

Hints

  1. Store per-query state such as current timeline index, number of successful status polls, terminal state, and error reason.
  2. A network failure should not advance the query timeline or poll count.

Part 2: Resolve and Redact Credentials Securely

Build a secure credential resolver for a warehouse client. Each config value can be a literal, an environment-variable reference, or a secret-store reference. Resolve the final config, reject insecure plaintext sensitive values when policy forbids them, detect missing or expired secrets, and produce a redacted copy safe for logs.

Constraints

  • 0 <= len(config) <= 10^4
  • 0 <= len(required_keys) <= 10^4
  • References only use the prefixes 'env:' and 'secret:'
  • If any error exists, resolved must be None
  • safe_log should include all keys from config, and also any missing required keys

Examples

Input: ({'config': {'account': 'acme', 'user': 'alice', 'password': 'SECRET:warehouse_pw'}, 'env': {}, 'secrets': {'warehouse_pw': {'value': 'snowpw', 'expired': False}}, 'sensitive_keys': ['password', 'token'], 'allow_plaintext_sensitive': False},)

Expected Output: {'resolved': {'account': 'acme', 'user': 'alice', 'password': 'snowpw'}, 'safe_log': {'account': 'acme', 'user': 'alice', 'password': '***'}, 'errors': []}

Explanation: account and user are non-sensitive literals, so they are accepted as-is. password resolves from the secret store and is redacted in safe_log.

Input: ({'config': {'account': 'ENV:WH_ACCOUNT', 'user': 'ENV:WH_USER', 'password': 'plainpw', 'token': 'ENV:API_TOKEN'}, 'env': {'WH_ACCOUNT': 'acme', 'WH_USER': 'bob', 'API_TOKEN': 'tkn'}, 'secrets': {}, 'sensitive_keys': ['password', 'token'], 'allow_plaintext_sensitive': False},)

Expected Output: {'resolved': {'account': 'acme', 'user': 'bob', 'token': 'tkn'}, 'safe_log': {'account': 'acme', 'user': 'bob', 'token': '***'}, 'errors': ['insecure plaintext for sensitive key: password']}

Explanation: password is a sensitive literal and plaintext is forbidden, so it is rejected. token comes from env, so it is allowed and then redacted for logs.

Input: ({'config': {'password': 'SECRET:pw1', 'api_key': 'SECRET:key1', 'region': 'us-east'}, 'env': {}, 'secrets': {'pw1': {'value': 'x', 'expired': True}}, 'sensitive_keys': ['password', 'api_key'], 'allow_plaintext_sensitive': False},)

Expected Output: {'resolved': {'region': 'us-east'}, 'safe_log': {'region': 'us-east'}, 'errors': ['expired secret: pw1 for key password', 'missing secret: key1 for key api_key']}

Explanation: pw1 exists but is expired, and key1 does not exist. Only region resolves successfully.

Input: ({'config': {'user': 'ENV:USER', 'password': 'ENV:PW'}, 'env': {'USER': 'carol'}, 'secrets': {}, 'sensitive_keys': ['password'], 'allow_plaintext_sensitive': False},)

Expected Output: {'resolved': {'user': 'carol'}, 'safe_log': {'user': 'carol'}, 'errors': ['missing env: PW for key password']}

Explanation: USER exists in env, but PW does not. The unresolved password key is omitted from both outputs.

Input: ({'config': {}, 'env': {'A': '1'}, 'secrets': {'s': {'value': 'v', 'expired': False}}, 'sensitive_keys': ['password'], 'allow_plaintext_sensitive': False},)

Expected Output: {'resolved': {}, 'safe_log': {}, 'errors': []}

Explanation: An empty config should produce empty resolved and safe_log dictionaries with no errors.

Input: ({'config': {'token': 'raw-token', 'warehouse': 'acme'}, 'env': {}, 'secrets': {}, 'sensitive_keys': ['token'], 'allow_plaintext_sensitive': True},)

Expected Output: {'resolved': {'token': 'raw-token', 'warehouse': 'acme'}, 'safe_log': {'token': '***', 'warehouse': 'acme'}, 'errors': []}

Explanation: Because plaintext sensitive values are allowed by policy, token is accepted, but it must still be redacted in safe_log.

Hints

  1. Write a small helper to detect whether a key is sensitive based on its name.
  2. You can resolve values, build safe_log, and collect errors in a single pass over config, then separately check for missing required keys.

Part 3: Batch Query Submission and Batch Status Polling

Simulate an extension of the query client that supports batch query submission and batch status polling. Each submit operation creates a new batch and assigns query IDs. Each poll operation advances only the referenced batch by one cycle, respecting a max_concurrent limit. Queued queries start in original order. A query with steps == 0 finishes immediately when it gets a chance to start.

Constraints

  • 1 <= max_concurrent <= 10^5
  • Total number of submitted queries across all batches <= 10^5
  • 0 <= steps <= 10^6
  • Polling advances only the batch named in that operation
  • Batch IDs are b1, b2, ... and query IDs are q1, q2, ...

Examples

Input: (1, [('submit', [{'steps': 0, 'finalStatus': 'FAILED'}, {'steps': 1, 'finalStatus': 'SUCCEEDED'}, {'steps': 0, 'finalStatus': 'SUCCEEDED'}]), ('poll', 'b1'), ('poll', 'b1'), ('poll', 'b1')])

Expected Output: [{'batchId': 'b1', 'queryIds': ['q1', 'q2', 'q3']}, {'QUEUED': 1, 'RUNNING': 1, 'SUCCEEDED': 0, 'FAILED': 1, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 2, 'FAILED': 1, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 2, 'FAILED': 1, 'CANCELED': 0}]

Explanation: First poll: q1 fails immediately, q2 starts running, q3 remains queued because max_concurrent is 1. Second poll: q2 finishes successfully, then q3 starts and succeeds immediately. Third poll changes nothing.

Input: (2, [('submit', [{'steps': 2, 'finalStatus': 'SUCCEEDED'}, {'steps': 0, 'finalStatus': 'FAILED'}, {'steps': 1, 'finalStatus': 'CANCELED'}]), ('submit', [{'steps': 1, 'finalStatus': 'SUCCEEDED'}]), ('poll', 'b1'), ('poll', 'b2'), ('poll', 'b1'), ('poll', 'b2'), ('poll', 'b1')])

Expected Output: [{'batchId': 'b1', 'queryIds': ['q1', 'q2', 'q3']}, {'batchId': 'b2', 'queryIds': ['q4']}, {'QUEUED': 0, 'RUNNING': 2, 'SUCCEEDED': 0, 'FAILED': 1, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 1, 'SUCCEEDED': 0, 'FAILED': 0, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 1, 'SUCCEEDED': 0, 'FAILED': 1, 'CANCELED': 1}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 1, 'FAILED': 0, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 1, 'FAILED': 1, 'CANCELED': 1}]

Explanation: Each batch advances only when that specific batch is polled. Batch b1 and batch b2 do not affect each other.

Input: (3, [('submit', []), ('poll', 'b1'), ('poll', 'b1')])

Expected Output: [{'batchId': 'b1', 'queryIds': []}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 0, 'FAILED': 0, 'CANCELED': 0}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 0, 'FAILED': 0, 'CANCELED': 0}]

Explanation: An empty batch is valid. Polling it always returns zero counts in every state.

Input: (2, [('submit', [{'steps': 0, 'finalStatus': 'SUCCEEDED'}, {'steps': 0, 'finalStatus': 'FAILED'}, {'steps': 1, 'finalStatus': 'SUCCEEDED'}, {'steps': 0, 'finalStatus': 'CANCELED'}]), ('poll', 'b1'), ('poll', 'b1')])

Expected Output: [{'batchId': 'b1', 'queryIds': ['q1', 'q2', 'q3', 'q4']}, {'QUEUED': 0, 'RUNNING': 1, 'SUCCEEDED': 1, 'FAILED': 1, 'CANCELED': 1}, {'QUEUED': 0, 'RUNNING': 0, 'SUCCEEDED': 2, 'FAILED': 1, 'CANCELED': 1}]

Explanation: A single poll can start multiple zero-step queries in order. They finish immediately and may allow later queued queries to start in that same poll.

Hints

  1. For each batch, track the next queued query index, the set of currently running queries, and a count of each state.
  2. If you update counts incrementally, each poll can avoid recounting the entire batch.
Last updated: May 15, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Implement Prefix Match Filter - Datadog (hard)
  • Build span trees from unordered trace spans - Datadog (medium)
  • Implement buffered file writer with concurrency support - Datadog (easy)
  • Implement write with internal buffer - Datadog (Medium)
  • Implement log storage and querying - Datadog (Medium)