PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates skills in data modeling, API design, algorithmic filtering, and implementation of an in-memory query execution engine, including membership predicates, general filter logic, and handling edge cases.

  • hard
  • Valon
  • Coding & Algorithms
  • Software Engineer

Implement an In-Memory Query Engine

Company: Valon

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Technical Screen

Implement a small in-memory query engine for tabular data. You are given one or more tables. Each table has: - A table name - A list of column names - Rows stored entirely in memory, where each row maps column names to values Implement APIs that support the following features incrementally: 1. **Basic select** - Return selected columns from a table. - Example: selecting `name` and `age` from a `users` table should return only those fields for every row. 2. **Select with membership filtering** - Add support for an `IN` predicate on a column. - Example: return rows where `name` is in `{"Alice", "Bob"}`. 3. **Select with general filtering** - Add support for additional filter predicates, such as equality comparisons or caller-provided predicate functions. - Example: return selected columns from rows where `age > 30` and `city == "Seattle"`. Design clean data structures and implement the query execution logic. The solution should be written as production-quality code, with attention to edge cases such as missing tables, missing columns, empty result sets, and invalid filters.

Quick Answer: This question evaluates skills in data modeling, API design, algorithmic filtering, and implementation of an in-memory query execution engine, including membership predicates, general filter logic, and handling edge cases.

Part 1: Basic Select from an In-Memory Table

Implement the first feature of a tiny in-memory query engine: selecting specific columns from a single table. A table is stored in memory as a schema plus a list of row dictionaries. Given all tables, a table name, and a list of columns to project, return only those fields for every row in the table. Preserve the original row order. If the table does not exist or any requested column is not in the table schema, return an error object instead of rows.

Constraints

  • 1 <= number of tables <= 100
  • 0 <= number of rows in a table <= 10^4
  • Each table's 'columns' list contains unique names
  • Row values are primitive Python values such as int, str, bool, or None
  • If any selected column is not present in the table schema, the query is invalid

Examples

Input: ({'users': {'columns': ['id', 'name', 'age'], 'rows': [{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 2, 'name': 'Bob', 'age': 25}]}}, 'users', ['name', 'age'])

Expected Output: {'rows': [{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]}

Explanation: Basic projection: keep only the name and age fields from every row.

Input: ({'users': {'columns': ['id', 'name'], 'rows': []}}, 'users', ['name'])

Expected Output: {'rows': []}

Explanation: Edge case: selecting from an empty table should return an empty result set.

Input: ({}, 'orders', ['id'])

Expected Output: {'error': "table 'orders' not found"}

Explanation: If the table does not exist, return an error.

Input: ({'users': {'columns': ['id', 'name'], 'rows': [{'id': 1, 'name': 'Alice'}]}}, 'users', ['name', 'email'])

Expected Output: {'error': "unknown columns: ['email']"}

Explanation: Selecting a column not present in the schema is invalid.

Hints

  1. Validate the table name and all requested columns before scanning rows.
  2. Build a brand-new dictionary for each row so the output contains only the requested fields.

Part 2: Select with Membership Filtering

Extend the in-memory query engine to support a single membership filter. Given a table name, a list of columns to project, a filter column, and a collection of allowed values, return the projected rows whose filter-column value is in the allowed collection. Preserve input row order. If the table or any referenced column does not exist, or if the allowed-values argument is not a valid collection, return an error object.

Constraints

  • 1 <= number of tables <= 100
  • 0 <= number of rows in a table <= 10^4
  • 0 <= number of allowed values <= 10^4
  • The filter column and all selected columns must exist in the table schema
  • Membership checking should be efficient for large allowed-value collections

Examples

Input: ({'users': {'columns': ['id', 'name', 'age'], 'rows': [{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 2, 'name': 'Bob', 'age': 25}, {'id': 3, 'name': 'Carol', 'age': 40}]}}, 'users', ['id', 'name'], 'name', ['Alice', 'Bob'])

Expected Output: {'rows': [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]}

Explanation: Only rows whose name is Alice or Bob should be returned.

Input: ({'users': {'columns': ['id', 'name'], 'rows': [{'id': 1, 'name': 'Alice'}]}}, 'users', ['name'], 'name', [])

Expected Output: {'rows': []}

Explanation: Edge case: an empty allowed-value collection means no row can match.

Input: ({'users': {'columns': ['id', 'name'], 'rows': [{'id': 1, 'name': 'Alice'}]}}, 'users', ['name'], 'city', ['Seattle'])

Expected Output: {'error': "unknown filter column: 'city'"}

Explanation: The filter column must exist in the table schema.

Input: ({}, 'users', ['name'], 'name', ['Alice'])

Expected Output: {'error': "table 'users' not found"}

Explanation: Missing table should return an error.

Hints

  1. Convert the allowed-values collection into a set before scanning rows.
  2. Filter rows first, then project only the requested columns for the rows that match.

Part 3: Select with General Filtering

Implement a more flexible in-memory query engine that supports multiple filters combined with AND. Each filter is usually a descriptor dictionary of the form {'column': ..., 'op': ..., 'value': ...}, where op is one of '==', '!=', '>', '>=', '<', '<=', or 'in'. Return the selected columns from rows that satisfy every filter. For Python callers, you may additionally support callable row predicates, but the official test cases use only descriptor dictionaries so the inputs remain valid Python literals. Return an error object for missing tables, missing columns, malformed filters, or unsupported operators.

Constraints

  • 1 <= number of tables <= 100
  • 0 <= number of rows in a table <= 10^4
  • 0 <= number of filters <= 20
  • Supported operators are '==', '!=', '>', '>=', '<', '<=', and 'in'
  • For valid ordered comparisons, the filter value type is compatible with the column values being compared

Examples

Input: ({'users': {'columns': ['id', 'name', 'age', 'city'], 'rows': [{'id': 1, 'name': 'Alice', 'age': 31, 'city': 'Seattle'}, {'id': 2, 'name': 'Bob', 'age': 28, 'city': 'Seattle'}, {'id': 3, 'name': 'Carol', 'age': 35, 'city': 'Portland'}, {'id': 4, 'name': 'Dave', 'age': 40, 'city': 'Seattle'}]}}, 'users', ['name'], [{'column': 'age', 'op': '>', 'value': 30}, {'column': 'city', 'op': '==', 'value': 'Seattle'}])

Expected Output: {'rows': [{'name': 'Alice'}, {'name': 'Dave'}]}

Explanation: Both filters must pass, so only users older than 30 and living in Seattle remain.

Input: ({'users': {'columns': ['id', 'name', 'age', 'city'], 'rows': [{'id': 1, 'name': 'Alice', 'age': 31, 'city': 'Seattle'}, {'id': 2, 'name': 'Bob', 'age': 28, 'city': 'Seattle'}, {'id': 3, 'name': 'Carol', 'age': 35, 'city': 'Portland'}, {'id': 4, 'name': 'Dave', 'age': 40, 'city': 'Seattle'}]}}, 'users', ['id'], [])

Expected Output: {'rows': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}

Explanation: Edge case: no filters means all rows are returned after projection.

Input: ({'users': {'columns': ['id', 'name', 'age'], 'rows': [{'id': 1, 'name': 'Alice', 'age': 31}, {'id': 2, 'name': 'Bob', 'age': 28}]}}, 'users', ['name'], [{'column': 'age', 'op': '>', 'value': 100}])

Expected Output: {'rows': []}

Explanation: If no row satisfies the filters, return an empty result set.

Input: ({'users': {'columns': ['id', 'name'], 'rows': [{'id': 1, 'name': 'Alice'}]}}, 'users', ['name'], [{'column': 'country', 'op': '==', 'value': 'US'}])

Expected Output: {'error': "unknown filter column: 'country'"}

Explanation: Every filter column must exist in the table schema.

Input: ({'users': {'columns': ['id', 'name'], 'rows': [{'id': 1, 'name': 'Alice'}]}}, 'users', ['name'], [{'column': 'name', 'op': 'contains', 'value': 'A'}])

Expected Output: {'error': "unsupported operator: 'contains'"}

Explanation: Unsupported filter operators should be rejected.

Hints

  1. Normalize and validate all filters before scanning the rows.
  2. Evaluate filters with short-circuit logic: as soon as one filter fails for a row, skip the remaining filters.
Last updated: Jun 6, 2026

Related Coding Questions

  • Design an In-Memory Database - Valon (medium)

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.