Design Tic-Tac-Toe and QPS data structures
Company: Databricks
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates data-structure and API design skills, covering state management and efficient update/query algorithms for a generalized Tic-Tac-Toe with uniform-random move selection, as well as time-series aggregation and streaming-log techniques for computing QPS from non-decreasing timestamps.
Part 1: Generalized Tic-Tac-Toe Move Engine
Constraints
- 1 <= rows, cols <= 1000
- 1 <= k <= max(rows, cols)
- 0 <= len(moves) <= rows * cols
- Each player value is intended to be 1 or 2, but invalid values should return -1 for that move
Examples
Input: (3, 3, 3, [(0, 0, 1), (1, 0, 2), (0, 1, 1), (1, 1, 2), (0, 2, 1)])
Expected Output: [0, 0, 0, 0, 1]
Explanation: Player 1 wins horizontally on the top row with the last move.
Input: (3, 4, 3, [(0, 1, 2), (0, 0, 1), (1, 1, 1), (2, 2, 1)])
Expected Output: [0, 0, 0, 1]
Explanation: Player 1 forms a main diagonal of length 3.
Input: (2, 2, 2, [(0, 0, 1), (0, 0, 2), (2, 1, 1)])
Expected Output: [0, -1, -1]
Explanation: The second move uses an occupied cell, and the third is out of bounds.
Input: (2, 3, 1, [(1, 2, 2), (0, 0, 1)])
Expected Output: [2, -1]
Explanation: With k = 1, the first valid move wins immediately, so later moves are invalid.
Solution
def solution(rows, cols, k, moves):
board = {}
results = []
game_over = False
directions = ((1, 0), (0, 1), (1, 1), (1, -1))
for row, col, player in moves:
if game_over or player not in (1, 2) or not (0 <= row < rows and 0 <= col < cols) or (row, col) in board:
results.append(-1)
continue
board[(row, col)] = player
won = False
for dr, dc in directions:
count = 1
r, c = row + dr, col + dc
while count < k and 0 <= r < rows and 0 <= c < cols and board.get((r, c)) == player:
count += 1
r += dr
c += dc
r, c = row - dr, col - dc
while count < k and 0 <= r < rows and 0 <= c < cols and board.get((r, c)) == player:
count += 1
r -= dr
c -= dc
if count >= k:
won = True
break
if won:
game_over = True
results.append(player)
else:
results.append(0)
return results
Time complexity: O(M * k), where M is the number of moves. Space complexity: O(M).
Hints
- Only the most recent move can create a new win, so you never need to rescan the whole board.
- For each move, check four direction pairs: horizontal, vertical, main diagonal, and anti-diagonal.
Part 2: Random Legal Move Selector for Generalized Tic-Tac-Toe
Constraints
- 1 <= rows, cols
- 1 <= rows * cols <= 200000
- 1 <= k <= max(rows, cols)
- 0 <= len(operations) <= 200000
Examples
Input: (2, 2, 2, [('ai', 1, 2), ('move', 0, 1, 1), ('ai', 2, 1)])
Expected Output: [(1, 0), 0, (1, 0)]
Explanation: Initially the 3rd empty cell in row-major order is `(1, 0)`. After occupying `(0, 1)`, the 2nd empty cell is still `(1, 0)`.
Input: (1, 3, 1, [('ai', 1, 1), ('move', 0, 2, 2), ('ai', 1, 0), ('move', 0, 1, 1)])
Expected Output: [(0, 1), 2, (-1, -1), -1]
Explanation: The move by player 2 wins immediately because `k = 1`, so later AI and move operations are invalid.
Input: (1, 2, 2, [('move', 0, 0, 1), ('move', 0, 0, 2), ('move', 0, 1, 2), ('ai', 1, 0)])
Expected Output: [0, -1, 0, (-1, -1)]
Explanation: The board becomes full after the third operation, so the AI has no legal move.
Input: (2, 3, 3, [('move', 0, 0, 1), ('ai', 2, 5), ('ai', 2, 4)])
Expected Output: [0, (-1, -1), (1, 2)]
Explanation: After one move there are 5 empty cells, so rank 5 is invalid but rank 4 points to the last empty cell in row-major order.
Solution
def solution(rows, cols, k, operations):
class Fenwick:
def __init__(self, n):
self.n = n
self.bit = [0] * (n + 1)
for i in range(1, n + 1):
self.bit[i] = i & -i
def add(self, idx, delta):
while idx <= self.n:
self.bit[idx] += delta
idx += idx & -idx
def kth(self, k):
idx = 0
bit_mask = 1 << self.n.bit_length()
while bit_mask:
nxt = idx + bit_mask
if nxt <= self.n and self.bit[nxt] < k:
idx = nxt
k -= self.bit[nxt]
bit_mask >>= 1
return idx + 1
def check_win(row, col, player):
for dr, dc in directions:
count = 1
r, c = row + dr, col + dc
while count < k and 0 <= r < rows and 0 <= c < cols and board.get((r, c)) == player:
count += 1
r += dr
c += dc
r, c = row - dr, col - dc
while count < k and 0 <= r < rows and 0 <= c < cols and board.get((r, c)) == player:
count += 1
r -= dr
c -= dc
if count >= k:
return True
return False
n = rows * cols
fenwick = Fenwick(n)
total_empty = n
board = {}
directions = ((1, 0), (0, 1), (1, 1), (1, -1))
game_over = False
results = []
for op in operations:
if op[0] == 'move':
_, row, col, player = op
if game_over or player not in (1, 2) or not (0 <= row < rows and 0 <= col < cols) or (row, col) in board:
results.append(-1)
continue
board[(row, col)] = player
fenwick.add(row * cols + col + 1, -1)
total_empty -= 1
if check_win(row, col, player):
game_over = True
results.append(player)
else:
results.append(0)
elif op[0] == 'ai':
_, player, rank = op
if game_over or player not in (1, 2) or total_empty == 0 or not (0 <= rank < total_empty):
results.append((-1, -1))
continue
idx = fenwick.kth(rank + 1) - 1
results.append((idx // cols, idx % cols))
else:
results.append(-1)
return results
Time complexity: Each 'move' is O(k + log(rows * cols)); each 'ai' query is O(log(rows * cols)). Space complexity: O(rows * cols + P), where P is the number of played cells.
Hints
- If `rank` tells you which empty cell to choose, the AI query becomes a k-th available-position query.
- A Fenwick tree or segment tree can track which cells are still empty in row-major order.
Part 3: Online Exact QPS Tracker
Constraints
- Up to 10^6 operations
- Record timestamps are non-decreasing
- Timestamps fit in a signed 64-bit integer
Examples
Input: ([('record', 1), ('record', 1), ('record', 3), ('getQPS', 1, 3), ('getQPS', 2, 2)])
Expected Output: [1.0, 0.0]
Explanation: There are 3 requests in `[1, 3]` over 3 seconds, and none in `[2, 2]`.
Input: ([('record', 10), ('record', 13), ('record', 13), ('getQPS', 10, 13), ('getQPS', 11, 12)])
Expected Output: [0.75, 0.0]
Explanation: The first query sees 3 requests over 4 seconds.
Input: ([('getQPS', 5, 7), ('record', 6), ('getQPS', 6, 6)])
Expected Output: [0.0, 1.0]
Explanation: The first query is on an empty log. After recording one request at time 6, QPS over `[6, 6]` is 1.0.
Input: ([('record', 2), ('getQPS', 5, 3), ('getQPS', 0, 2)])
Expected Output: [0.0, 0.3333333333333333]
Explanation: An invalid time range returns 0.0. The second query has 1 request over 3 seconds.
Solution
def solution(operations):
from bisect import bisect_left, bisect_right
times = []
prefix = []
results = []
for op in operations:
if op[0] == 'record':
t = op[1]
if times and times[-1] == t:
prefix[-1] += 1
else:
times.append(t)
prefix.append((prefix[-1] if prefix else 0) + 1)
else:
_, start, end = op
if start > end or not times:
results.append(0.0)
continue
right = bisect_right(times, end) - 1
if right < 0:
results.append(0.0)
continue
left = bisect_left(times, start) - 1
count = prefix[right]
if left >= 0:
count -= prefix[left]
results.append(count / (end - start + 1))
return results
Time complexity: record: O(1) amortized, getQPS: O(log U), where U is the number of distinct recorded timestamps. Space complexity: O(U).
Hints
- Many consecutive `record` calls may share the same timestamp, so storing one count per distinct time can save space.
- Use binary search on the distinct timestamps to count how many requests fall inside a query range.
Part 4: Fast Arbitrary-Range QPS Queries from Sorted Logs
Constraints
- 0 <= len(timestamps) <= 10^6
- 0 <= len(queries) <= 10^6
- Timestamps are sorted in non-decreasing order
- Timestamps fit in a signed 64-bit integer
Examples
Input: ([1, 1, 2, 4, 10], [(1, 2), (3, 10)])
Expected Output: [1.5, 0.25]
Explanation: The first query counts 3 requests over 2 seconds; the second counts 2 requests over 8 seconds.
Input: ([], [(0, 0), (5, 7)])
Expected Output: [0.0, 0.0]
Explanation: An empty log yields zero QPS for every query.
Input: ([5, 5, 5], [(5, 5), (4, 6), (6, 10)])
Expected Output: [3.0, 1.0, 0.0]
Explanation: All requests happen at time 5.
Input: ([-3, -1, 0, 2], [(-2, 0), (-5, -4), (3, 1)])
Expected Output: [0.6666666666666666, 0.0, 0.0]
Explanation: The first range contains 2 requests over 3 seconds; the last query is an invalid interval.
Solution
def solution(timestamps, queries):
from bisect import bisect_left, bisect_right
results = []
for start, end in queries:
if start > end or not timestamps:
results.append(0.0)
continue
left = bisect_left(timestamps, start)
right = bisect_right(timestamps, end)
results.append((right - left) / (end - start + 1))
return results
Time complexity: O(Q log N), where N is the number of timestamps. Space complexity: O(1) extra space, excluding the output list.
Hints
- In a sorted array, the number of values inside a range can be found with two binary searches.
- You only need the first index >= startTime and the first index > endTime.
Part 5: Approximate QPS with Fixed-Width Time Buckets
Constraints
- 1 <= bucket_size <= 10^9
- 0 <= len(timestamps) <= 200000
- 0 <= len(queries) <= 200000
- Timestamps fit in a signed 64-bit integer
Examples
Input: (10, [0, 1, 9, 10, 11, 19], [(0, 9), (5, 14)])
Expected Output: [0.3, 0.3]
Explanation: Each 10-second bucket contains 3 requests. The second query uses 5 seconds from each of two buckets.
Input: (5, [], [(0, 4), (3, 2)])
Expected Output: [0.0, 0.0]
Explanation: No timestamps means zero approximate QPS, and an invalid interval also returns 0.0.
Input: (4, [0, 0, 0, 3, 4, 7], [(1, 4)])
Expected Output: [0.875]
Explanation: Bucket `[0, 3]` contributes 3/4 of its 4 requests and bucket `[4, 7]` contributes 1/4 of its 2 requests.
Input: (3, [-3, -2, 0, 2, 3], [(-3, -1), (0, 5)])
Expected Output: [0.6666666666666666, 0.5]
Explanation: The first query is entirely inside one bucket; the second spans two full buckets.
Solution
def solution(bucket_size, timestamps, queries):
from bisect import bisect_left, bisect_right
bucket_counts = {}
for t in timestamps:
b = t // bucket_size
bucket_counts[b] = bucket_counts.get(b, 0) + 1
bucket_ids = sorted(bucket_counts)
prefix = []
running = 0
for b in bucket_ids:
running += bucket_counts[b]
prefix.append(running)
def bucket_sum(lo, hi):
if lo > hi or not bucket_ids:
return 0
left = bisect_left(bucket_ids, lo)
right = bisect_right(bucket_ids, hi) - 1
if left > right:
return 0
total = prefix[right]
if left > 0:
total -= prefix[left - 1]
return total
def overlap_len(bucket_id, start, end):
bucket_start = bucket_id * bucket_size
bucket_end = bucket_start + bucket_size - 1
left = max(start, bucket_start)
right = min(end, bucket_end)
if left > right:
return 0
return right - left + 1
results = []
for start, end in queries:
if start > end:
results.append(0.0)
continue
duration = end - start + 1
start_bucket = start // bucket_size
end_bucket = end // bucket_size
if start_bucket == end_bucket:
approx_count = bucket_counts.get(start_bucket, 0) * (duration / bucket_size)
else:
approx_count = 0.0
approx_count += bucket_counts.get(start_bucket, 0) * (overlap_len(start_bucket, start, end) / bucket_size)
approx_count += bucket_counts.get(end_bucket, 0) * (overlap_len(end_bucket, start, end) / bucket_size)
approx_count += bucket_sum(start_bucket + 1, end_bucket - 1)
results.append(approx_count / duration)
return results
Time complexity: Preprocessing: O(N + U log U); each query: O(log U), where U is the number of non-empty buckets. Space complexity: O(U).
Hints
- Map each timestamp `t` to bucket `t // bucket_size` and count how many records fall in each bucket.
- For a partial bucket, multiply that bucket's count by `overlap_length / bucket_size`.