PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates proficiency in designing and implementing mutable in-memory data structures, field-level CRUD and scanning, timestamped operations with TTL semantics, and backup/restore snapshot management, focusing on data structures, temporal reasoning, and state management in the Coding & Algorithms domain.

  • easy
  • xAI
  • Coding & Algorithms
  • Software Engineer

Implement an in-memory database with TTL and backup

Company: xAI

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: easy

Interview Round: Take-home Project

## In-Memory Database (Levels 1–4: TTL and Backup/Restore) Implement an in-memory database that stores **records** identified by a **string key**. Each record contains multiple **string field → string value** pairs. You must support a set of operations that progressively add features. ### Data model - `key` is a string. - Each `key` maps to a set of fields. - Each `field` maps to a `value` (both strings). If an operation refers to a missing `key` or `field`, treat it as absent. > Assumption to make outputs well-defined (typical for OAs): > - `get*` returns `""` (empty string) when absent. > - `delete*` returns `true` if something was deleted, else `false`. > - `scan*` returns an empty list when nothing matches. --- ## Level 1: Basic CRUD on fields Implement: - `set(key, field, value)` - `get(key, field) -> string` - `delete(key, field) -> bool` `set` inserts or overwrites the field’s value. --- ## Level 2: Read-only listing Implement: - `scan(key) -> list[string]` - `scan_by_prefix(key, prefix) -> list[string]` Return format: - Each returned element is formatted as `"field(value)"`. - Results are sorted **lexicographically by `field`**. - `scan_by_prefix` returns only fields whose name starts with `prefix`. --- ## Level 3: Timestamped operations + TTL Add timestamped variants of the above operations. Tests will use **either** timestamped APIs **or** non-timestamped APIs, but **never mix them**. All timestamped operations accept an integer `timestamp`. Implement: - `set_at(key, field, value, timestamp)` - `set_at_with_ttl(key, field, value, timestamp, ttl)` - `get_at(key, field, timestamp) -> string` - `delete_at(key, field, timestamp) -> bool` - `scan_at(key, timestamp) -> list[string]` - `scan_by_prefix_at(key, prefix, timestamp) -> list[string]` TTL semantics: - `set_at_with_ttl` makes the field valid over the half-open interval: - **valid in** `[timestamp, timestamp + ttl)` - Expired fields must **not** appear in `get_at`, `scan_at`, or prefix scans. - Time always moves forward: timestamps provided to operations are **non-decreasing**. --- ## Level 4: Backup and restore Implement: - `backup(timestamp)` - `restore(timestamp, timestamp_to_restore)` Backup requirements: - `backup(t)` stores a snapshot of the database state at time `t`. - For fields with TTL, the backup must capture **remaining TTL** at backup time (i.e., how much lifetime is left at `t`). Restore requirements: - `restore(now, timestamp_to_restore)` restores the database from the **latest** backup whose backup time is **≤ `timestamp_to_restore`**. - After restoring at current time `now`, TTL expiration must be **recalculated** based on remaining TTL stored in the backup: - If a field had remaining TTL `r` in the backup, then after restore at time `now` it should expire at `now + r`. - Fields that were already expired at the moment of backup should not be present in that backup. Your implementation should correctly handle overwrites, deletions, scans, TTL expiry, and backup/restore interactions under the monotonic-time guarantee.

Quick Answer: This question evaluates proficiency in designing and implementing mutable in-memory data structures, field-level CRUD and scanning, timestamped operations with TTL semantics, and backup/restore snapshot management, focusing on data structures, temporal reasoning, and state management in the Coding & Algorithms domain.

Part 1: Basic CRUD on In-Memory Database Fields

You are given a list of database queries for an in-memory store. Each record is identified by a string key, and each key stores string field -> string value pairs. Support three operations: 'SET', 'GET', and 'DELETE'. 'SET' inserts or overwrites a field and produces no output. 'GET' returns the stored value, or '' if the key or field does not exist. 'DELETE' removes a field and returns True if something was deleted, otherwise False. Return the outputs of all 'GET' and 'DELETE' queries in order.

Constraints

  • 0 <= len(queries) <= 100000
  • Each key, field, and value is a string of length 1 to 100
  • Only the operations 'SET', 'GET', and 'DELETE' appear
  • Average O(1) hash-map operations may be assumed

Examples

Input: [['SET', 'user1', 'name', 'alice'], ['GET', 'user1', 'name'], ['SET', 'user1', 'name', 'bob'], ['GET', 'user1', 'name'], ['DELETE', 'user1', 'name'], ['GET', 'user1', 'name'], ['DELETE', 'user1', 'name']]

Expected Output: ['alice', 'bob', True, '', False]

Explanation: The value is overwritten from 'alice' to 'bob'. The first delete succeeds, and the second delete fails because the field is already gone.

Input: [['GET', 'missing', 'x'], ['DELETE', 'missing', 'x']]

Expected Output: ['', False]

Explanation: Missing keys and fields are treated as absent.

Input: [['SET', 'a', 'x', '1'], ['SET', 'a', 'y', '2'], ['DELETE', 'a', 'x'], ['GET', 'a', 'x'], ['GET', 'a', 'y']]

Expected Output: [True, '', '2']

Explanation: Deleting one field does not affect other fields under the same key.

Input: [['SET', 'k', 'f', 'v']]

Expected Output: []

Explanation: A single 'SET' query produces no output.

Solution

def solution(queries):
    db = {}
    out = []

    for q in queries:
        op = q[0]

        if op == 'SET':
            key, field, value = q[1], q[2], q[3]
            db.setdefault(key, {})[field] = value

        elif op == 'GET':
            key, field = q[1], q[2]
            out.append(db.get(key, {}).get(field, ''))

        else:  # DELETE
            key, field = q[1], q[2]
            if key in db and field in db[key]:
                del db[key][field]
                if not db[key]:
                    del db[key]
                out.append(True)
            else:
                out.append(False)

    return out

Time complexity: O(Q) average, where Q is the number of queries.. Space complexity: O(T), where T is the number of stored fields..

Hints

  1. A nested dictionary works well: one map from key to another map of field to value.
  2. After deleting a field, you may remove the key entirely if it has no fields left.

Part 2: Sorted Scan and Prefix Scan in an In-Memory Database

Extend the in-memory database to support listing fields. Each record is identified by a string key and stores string field -> string value pairs. Support 'SET', 'GET', 'DELETE', 'SCAN', and 'SCAN_BY_PREFIX'. 'SCAN' returns all fields for a key formatted as 'field(value)', sorted lexicographically by field name. 'SCAN_BY_PREFIX' does the same, but only for fields whose names start with the given prefix. Missing keys or unmatched prefixes return an empty list. Return the outputs of all queries except 'SET'.

Constraints

  • 0 <= len(queries) <= 100000
  • Each key, field, value, and prefix is a string of length 0 to 100
  • Results for scan operations must be sorted lexicographically by field name
  • Average O(1) hash-map operations may be assumed

Examples

Input: [['SET', 'user', 'b', '2'], ['SET', 'user', 'a', '1'], ['SCAN', 'user'], ['SCAN_BY_PREFIX', 'user', 'a']]

Expected Output: [['a(1)', 'b(2)'], ['a(1)']]

Explanation: Fields are returned in lexicographic order, and the prefix scan keeps only fields starting with 'a'.

Input: [['SET', 'k', 'name', 'Ann'], ['DELETE', 'k', 'age'], ['GET', 'k', 'age'], ['SCAN_BY_PREFIX', 'k', 'z'], ['SCAN', 'missing']]

Expected Output: [False, '', [], []]

Explanation: Deleting a missing field returns False. A missing field returns '', and scans with no matches return empty lists.

Input: [['SET', 'p', 'ab', '1'], ['SET', 'p', 'aa', '2'], ['SET', 'p', 'ab', '3'], ['DELETE', 'p', 'aa'], ['SCAN', 'p'], ['SCAN_BY_PREFIX', 'p', 'a']]

Expected Output: [True, ['ab(3)'], ['ab(3)']]

Explanation: Overwriting 'ab' updates its value, and deleting 'aa' removes it from both scans.

Input: [['SET', 'x', 'field', 'v'], ['SCAN_BY_PREFIX', 'x', 'field']]

Expected Output: [['field(v)']]

Explanation: A single field that matches the prefix is returned in the required format.

Solution

def solution(queries):
    db = {}
    out = []

    def format_scan(key, prefix=None):
        record = db.get(key, {})
        fields = [field for field in record if prefix is None or field.startswith(prefix)]
        fields.sort()
        return [f'{field}({record[field]})' for field in fields]

    for q in queries:
        op = q[0]

        if op == 'SET':
            key, field, value = q[1], q[2], q[3]
            db.setdefault(key, {})[field] = value

        elif op == 'GET':
            key, field = q[1], q[2]
            out.append(db.get(key, {}).get(field, ''))

        elif op == 'DELETE':
            key, field = q[1], q[2]
            if key in db and field in db[key]:
                del db[key][field]
                if not db[key]:
                    del db[key]
                out.append(True)
            else:
                out.append(False)

        elif op == 'SCAN':
            key = q[1]
            out.append(format_scan(key))

        else:  # SCAN_BY_PREFIX
            key, prefix = q[1], q[2]
            out.append(format_scan(key, prefix))

    return out

Time complexity: SET/GET/DELETE are O(1) average. Each SCAN or SCAN_BY_PREFIX is O(F log F), where F is the number of fields under the queried key.. Space complexity: O(T), where T is the number of stored fields..

Hints

  1. Keep the same nested dictionary structure from basic CRUD.
  2. For a scan, collect matching field names first, sort them, then build the 'field(value)' strings.

Part 3: Timestamped In-Memory Database with TTL

Implement a timestamped in-memory database with optional TTL per field. Each key stores string field -> string value pairs. Support only the timestamped operations: 'SET_AT', 'SET_AT_WITH_TTL', 'GET_AT', 'DELETE_AT', 'SCAN_AT', and 'SCAN_BY_PREFIX_AT'. A field set with TTL at time t and ttl d is valid during the half-open interval [t, t + d). Expired fields must behave as absent in gets, deletes, and scans. Timestamps provided to operations are non-decreasing. Return outputs for 'GET_AT', 'DELETE_AT', 'SCAN_AT', and 'SCAN_BY_PREFIX_AT' in order.

Constraints

  • 0 <= len(queries) <= 100000
  • All timestamps are decimal strings representing integers in [0, 10^9]
  • All ttl values are positive decimal strings representing integers in [1, 10^9]
  • The current timestamps in the queries are non-decreasing

Examples

Input: [['SET_AT_WITH_TTL', 'k', 'a', 'x', '10', '5'], ['GET_AT', 'k', 'a', '10'], ['GET_AT', 'k', 'a', '14'], ['GET_AT', 'k', 'a', '15'], ['DELETE_AT', 'k', 'a', '15']]

Expected Output: ['x', 'x', '', False]

Explanation: The field is valid at times 10 through 14, but not at 15 because the TTL interval is [10, 15).

Input: [['SET_AT', 'u', 'name', 'Ann', '1'], ['SET_AT_WITH_TTL', 'u', 'age', '20', '2', '3'], ['SCAN_AT', 'u', '3'], ['SET_AT', 'u', 'age', '21', '4'], ['GET_AT', 'u', 'age', '6'], ['SCAN_BY_PREFIX_AT', 'u', 'a', '6']]

Expected Output: [['age(20)', 'name(Ann)'], '21', ['age(21)']]

Explanation: The TTL version of 'age' is alive at time 3. It is overwritten at time 4 by a non-TTL value, so it still exists at time 6.

Input: [['SET_AT_WITH_TTL', 'x', 'f', '1', '5', '2'], ['DELETE_AT', 'x', 'f', '6'], ['GET_AT', 'x', 'f', '6'], ['SET_AT_WITH_TTL', 'x', 'g', '2', '8', '1'], ['DELETE_AT', 'x', 'g', '9'], ['SCAN_AT', 'x', '9']]

Expected Output: [True, '', False, []]

Explanation: The first delete happens while 'f' is still alive, so it succeeds. The second delete happens exactly at expiry time for 'g', so it fails.

Input: [['SCAN_AT', 'missing', '100'], ['GET_AT', 'missing', 'a', '100']]

Expected Output: [[], '']

Explanation: Scanning or reading a missing key returns an empty list or empty string.

Solution

def solution(queries):
    db = {}
    out = []

    def purge_key(key, t):
        record = db.get(key)
        if record is None:
            return
        dead = [field for field, (_, expire_at) in record.items() if expire_at is not None and t >= expire_at]
        for field in dead:
            del record[field]
        if not record:
            del db[key]

    def scan_list(key, prefix=None):
        record = db.get(key, {})
        fields = [field for field in record if prefix is None or field.startswith(prefix)]
        fields.sort()
        return [f'{field}({record[field][0]})' for field in fields]

    for q in queries:
        op = q[0]

        if op == 'SET_AT':
            key, field, value, t = q[1], q[2], q[3], int(q[4])
            purge_key(key, t)
            db.setdefault(key, {})[field] = [value, None]

        elif op == 'SET_AT_WITH_TTL':
            key, field, value, t, ttl = q[1], q[2], q[3], int(q[4]), int(q[5])
            purge_key(key, t)
            db.setdefault(key, {})[field] = [value, t + ttl]

        elif op == 'GET_AT':
            key, field, t = q[1], q[2], int(q[3])
            purge_key(key, t)
            record = db.get(key, {})
            out.append(record[field][0] if field in record else '')

        elif op == 'DELETE_AT':
            key, field, t = q[1], q[2], int(q[3])
            purge_key(key, t)
            if key in db and field in db[key]:
                del db[key][field]
                if not db[key]:
                    del db[key]
                out.append(True)
            else:
                out.append(False)

        elif op == 'SCAN_AT':
            key, t = q[1], int(q[2])
            purge_key(key, t)
            out.append(scan_list(key))

        else:  # SCAN_BY_PREFIX_AT
            key, prefix, t = q[1], q[2], int(q[3])
            purge_key(key, t)
            out.append(scan_list(key, prefix))

    return out

Time complexity: Point operations are O(1) average plus lazy cleanup on the touched key. Each scan is O(F log F), where F is the number of live fields under the queried key.. Space complexity: O(T), where T is the number of live stored fields..

Hints

  1. Store an absolute expiration time like expire_at = timestamp + ttl instead of storing ttl directly.
  2. Because time never goes backward, you can lazily remove expired fields whenever a key is touched.

Part 4: Backup and Restore in a Timestamped Database with TTL

Implement a timestamped in-memory database with TTL, backups, and restore. Support the same timestamped operations as before plus 'BACKUP' and 'RESTORE'. 'BACKUP' stores a snapshot of the live database state at that time. For TTL fields, the snapshot must store remaining TTL, not the original expiration time. 'RESTORE' chooses the latest backup whose backup time is <= timestamp_to_restore, restores that snapshot at the current time 'now', and recalculates TTL so a field with remaining TTL r expires at now + r. Expired fields must not appear in backups. If no eligible backup exists, restore to an empty database. 'BACKUP' and 'RESTORE' produce no direct output. Return outputs for 'GET_AT', 'DELETE_AT', 'SCAN_AT', and 'SCAN_BY_PREFIX_AT' in order.

Constraints

  • 0 <= len(queries) <= 100000
  • All timestamps, ttl values, and restore arguments are decimal strings representing integers in [0, 10^9]
  • All ttl values are positive
  • The current time arguments for SET/GET/DELETE/SCAN/BACKUP and RESTORE-now are non-decreasing
  • timestamp_to_restore may point to any past time

Examples

Input: [['SET_AT_WITH_TTL', 'doc', 'a', '1', '10', '5'], ['BACKUP', '12'], ['GET_AT', 'doc', 'a', '13'], ['RESTORE', '20', '12'], ['SCAN_AT', 'doc', '22'], ['GET_AT', 'doc', 'a', '23']]

Expected Output: ['1', ['a(1)'], '']

Explanation: At backup time 12, the field has 3 units of TTL left. After restoring at time 20, it is alive at 22 and expires at 23.

Input: [['SET_AT', 'k', 'x', 'A', '1'], ['BACKUP', '2'], ['SET_AT', 'k', 'x', 'B', '3'], ['BACKUP', '4'], ['DELETE_AT', 'k', 'x', '5'], ['RESTORE', '6', '3'], ['GET_AT', 'k', 'x', '6'], ['RESTORE', '7', '100'], ['GET_AT', 'k', 'x', '7']]

Expected Output: [True, 'A', 'B']

Explanation: Restoring to time 3 uses the backup from time 2, so the value is 'A'. Restoring to time 100 uses the later backup from time 4, so the value is 'B'.

Input: [['SET_AT', 'p', 'aa', '1', '1'], ['SET_AT_WITH_TTL', 'p', 'ab', '2', '2', '3'], ['BACKUP', '5'], ['RESTORE', '8', '5'], ['SCAN_BY_PREFIX_AT', 'p', 'a', '8']]

Expected Output: [['aa(1)']]

Explanation: Field 'ab' expires exactly at time 5, so it is not included in the backup taken at time 5.

Input: [['SET_AT', 'u', 'f', 'v', '1'], ['BACKUP', '3'], ['DELETE_AT', 'u', 'f', '4'], ['RESTORE', '10', '2'], ['GET_AT', 'u', 'f', '10'], ['SCAN_AT', 'u', '10']]

Expected Output: [True, '', []]

Explanation: There is no backup with time <= 2, so the restore loads an empty database.

Solution

from bisect import bisect_right

def solution(queries):
    db = {}
    backups = []
    backup_times = []
    out = []

    def purge_key(key, t):
        record = db.get(key)
        if record is None:
            return
        dead = [field for field, (_, expire_at) in record.items() if expire_at is not None and t >= expire_at]
        for field in dead:
            del record[field]
        if not record:
            del db[key]

    def scan_list(key, prefix=None):
        record = db.get(key, {})
        fields = [field for field in record if prefix is None or field.startswith(prefix)]
        fields.sort()
        return [f'{field}({record[field][0]})' for field in fields]

    def build_snapshot(t):
        snapshot = {}
        live_db = {}
        for key, record in db.items():
            snap_record = {}
            live_record = {}
            for field, (value, expire_at) in record.items():
                if expire_at is None or t < expire_at:
                    remaining = None if expire_at is None else expire_at - t
                    snap_record[field] = [value, remaining]
                    live_record[field] = [value, expire_at]
            if snap_record:
                snapshot[key] = snap_record
            if live_record:
                live_db[key] = live_record
        return snapshot, live_db

    for q in queries:
        op = q[0]

        if op == 'SET_AT':
            key, field, value, t = q[1], q[2], q[3], int(q[4])
            purge_key(key, t)
            db.setdefault(key, {})[field] = [value, None]

        elif op == 'SET_AT_WITH_TTL':
            key, field, value, t, ttl = q[1], q[2], q[3], int(q[4]), int(q[5])
            purge_key(key, t)
            db.setdefault(key, {})[field] = [value, t + ttl]

        elif op == 'GET_AT':
            key, field, t = q[1], q[2], int(q[3])
            purge_key(key, t)
            record = db.get(key, {})
            out.append(record[field][0] if field in record else '')

        elif op == 'DELETE_AT':
            key, field, t = q[1], q[2], int(q[3])
            purge_key(key, t)
            if key in db and field in db[key]:
                del db[key][field]
                if not db[key]:
                    del db[key]
                out.append(True)
            else:
                out.append(False)

        elif op == 'SCAN_AT':
            key, t = q[1], int(q[2])
            purge_key(key, t)
            out.append(scan_list(key))

        elif op == 'SCAN_BY_PREFIX_AT':
            key, prefix, t = q[1], q[2], int(q[3])
            purge_key(key, t)
            out.append(scan_list(key, prefix))

        elif op == 'BACKUP':
            t = int(q[1])
            snapshot, db = build_snapshot(t)
            backups.append(snapshot)
            backup_times.append(t)

        else:  # RESTORE
            now, target = int(q[1]), int(q[2])
            idx = bisect_right(backup_times, target) - 1
            if idx < 0:
                db = {}
            else:
                snapshot = backups[idx]
                restored = {}
                for key, record in snapshot.items():
                    restored[key] = {}
                    for field, (value, remaining) in record.items():
                        expire_at = None if remaining is None else now + remaining
                        restored[key][field] = [value, expire_at]
                db = restored

    return out

Time complexity: Point operations are O(1) average plus lazy cleanup on the touched key; each scan is O(F log F); each BACKUP is O(N); each RESTORE is O(L + log K), where F is fields under one key, N is the number of live fields in the whole database, L is the number of fields in the restored snapshot, and K is the number of backups.. Space complexity: O(N + S), where N is the current number of live fields and S is the total number of fields stored across all backups..

Hints

  1. At backup time t, store remaining_ttl = expire_at - t for live TTL fields.
  2. Since backups are taken in time order, you can keep them in a list and use binary search to find the latest backup with time <= timestamp_to_restore.
Last updated: Jun 12, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Flatten and unflatten nested Python structures - xAI (nan)
  • Compute dasher pay from order events - xAI (nan)
  • Compute total active time per Twitter Space - xAI (medium)
  • Design a Recoverable Iterator - xAI (medium)
  • Implement Distributed Matrix Multiplication - xAI (hard)