Solve three algorithmic OA problems
Company: DRW
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Take-home Project
This is an online assessment (Codility-style, 150 minutes) consisting of three **independent** coding problems. Solve each one separately; they do not share input or state.
---
### Part 1 — Knockout Tournament Match Counts
There are $n$ players standing in a line, indexed $1$ to $n$. Player $i$ has a distinct skill value $a[i]$, and the values $a[1..n]$ form a permutation of the integers $1$ to $n$ (so all skills are distinct).
A series of knockout rounds is played:
1. In each round, the surviving players are paired from left to right: positions $1$ vs $2$, $3$ vs $4$, and so on.
2. In each pair, the player with the **higher** skill value wins and advances; the other is eliminated and leaves the tournament.
3. After a round, the winners keep their relative left-to-right order and form the player sequence for the next round.
4. Rounds continue until exactly one player (the overall champion) remains.
Define $b[1..n]$ where $b[i]$ is the **total number of matches the original player $i$ participates in** (until they are eliminated, or until they win the tournament).
**Task:** Given $n$ and the skill array $a[1..n]$, return the array $b[1..n]$.
```hint Reframe the structure
Because $n = 2^k$, the bracket is a perfect binary tree of depth $k$. A player who survives to round $r$ (1-indexed) has played exactly $r$ matches; they get one more match each round until they lose.
```
```hint A clean way to count
You do not need to know *who* wins each round to count matches — every still-alive player plays exactly one match per round. So $b[i] = (\text{number of rounds player } i \text{ survives into})$. Simulate round by round: increment $b$ for every alive player, then keep the per-pair winner. This is $O(n \log n)$ total work, since the alive set halves each round.
```
#### What This Part Should Cover
- Recognizing that match count equals the number of rounds the player remains alive (one match per round per alive player).
- A correct round-by-round simulation (or equivalent divide-and-conquer / segment recursion) that preserves left-to-right winner ordering.
- Complexity argument: total work across all rounds is $\sum_k n/2^k = O(n)$ comparisons plus $O(n \log n)$ for the increments — comfortably within $n \le 200{,}000$.
- Correct handling of the champion (they play $k = \log_2 n$ matches, the maximum).
---
### Part 2 — Robot Path Planning with Movement Instructions
You are given an $R \times C$ grid ($3 \le R, C \le 50$). Each cell is one of:
- `#` — wall (cannot be entered),
- `.` — empty floor (can be entered),
- `*` — the robot's starting cell (also a traversable floor cell).
Guarantees: the outer border (first/last row, first/last column) is entirely walls; all floor cells form a single connected component; the robot starts on a floor cell.
You control the robot with characters `^` (up), `v` (down), `<` (left), `>` (right). Every move must keep the robot inside the grid and on a floor cell (`.` or `*`). You do **not** need to return to the start.
Construct an instruction string for each of the five variants below. For every variant the output length must be at most $100{,}000$. Any valid sequence is accepted — shortest-path optimality is **not** required. Assume each described shape admits such a sequence within the bound.
```hint One algorithm covers most of it
Variants 2–5 all ask for *full coverage* of an arbitrary (or special-case) connected region. A single technique solves all of them: a DFS over the floor cells, emitting the move character when you step into a new cell and the inverse move (`^`↔`v`, `<`↔`>`) when you backtrack. This produces an Euler tour of a spanning tree.
```
```hint Why the length bound is safe
A DFS-backtracking walk traverses each spanning-tree edge exactly twice, so its length is $2 \cdot (V-1)$ where $V$ is the number of floor cells. With border walls, $V \le 48 \times 48 = 2304$, giving at most $\approx 4606$ moves — far under $100{,}000$. Subtask 1 (border-only walk) is the one case the generic DFS does *not* solve, since it forbids interior cells; handle it separately by walking the rectangle's perimeter explicitly.
```
The variants:
1. **Subtask 1 — Rectangular border walk.** All floor cells form a solid axis-aligned rectangle. The start is at the rectangle's **top-left corner**. Visit **exactly** the border (perimeter) cells, all of them, and never step on an interior floor cell.
2. **Subtask 2 — Full rectangular coverage.** Floor cells form a solid rectangle; start anywhere inside; visit **every** floor cell at least once.
3. **Subtask 3 — Dumbbell.** Two disjoint rectangles connected by a single floor cell (a 1-cell corridor); start anywhere; visit **every** floor cell.
4. **Subtask 4 — Simple path.** The floor cells form a simple path in the orthogonal-adjacency graph (no branches, no cycles); start anywhere on it; visit **every** cell.
5. **Subtask 5 — Arbitrary connected shape.** Any connected floor region (per the global guarantees); start anywhere; visit **every** floor cell.
**Task (all subtasks):** Output one valid instruction string of `^v<>` that keeps the robot on floor cells and meets the variant's visitation requirement, with length $\le 100{,}000$. You may treat each subtask as its own test case with its own grid.
#### What This Part Should Cover
- A single coverage routine (DFS Euler tour with inverse-move backtracking) handling subtasks 2–5 uniformly, plus the realization that subtask 1 needs its own perimeter walk because interior cells are forbidden.
- A correct, explicit length argument tying $2(V-1)$ to the $100{,}000$ bound.
- Move/inverse-move bookkeeping and a `visited` set so the robot never re-enters as a "new" cell and never steps onto a wall or out of bounds.
- Handling degenerate shapes (single floor cell ⇒ empty string; a 1-wide rectangle in subtask 1 where the "perimeter" is the whole region).
---
### Part 3 — Maximum Sum of Two Numbers Without Shared Digits
You are given an array of $n$ integers $a[1..n]$ ($1 \le n \le 200{,}000$). Assume each $a[i]$ is a non-negative integer (state a reasonable upper bound, e.g. up to $10^9$, if your method needs one).
Two numbers have **no common digit** if, written in base-10 without leading zeros, their digit sets are disjoint. For example $12$ and $340$ are allowed; $15$ and $51$ are not (they share `1` and `5`).
**Task:** Choose two distinct indices $i \ne j$ such that $a[i]$ and $a[j]$ share no digit and $a[i] + a[j]$ is maximized. Output that maximum sum. If no valid pair exists, return a sentinel value (e.g. $-1$) and document it.
```hint Compress each number to a 10-bit signature
Each number's relevant information is just *which of the digits 0–9 it contains* — a 10-bit mask. There are only $2^{10} = 1024$ possible masks. Reduce the array to: for each mask, the **maximum** value having exactly that digit set.
```
```hint Pair masks by disjointness
Two numbers are compatible iff their masks AND to $0$. Naively pair all surviving masks ($\le 1024^2 \approx 10^6$ pairs) and take the best disjoint pair — or, faster, use a Sum-Over-Subsets (SOS) DP so that for each mask $m$ you instantly look up the best value whose mask is a subset of the complement $\lnot m$. Note no number has an empty mask (even `0` contains digit `0`), so a number is never disjoint with itself — the $i \ne j$ requirement is automatic across distinct masks.
```
#### What This Part Should Cover
- The 10-bit digit-mask reduction and keeping only the max value per mask ($\le 1024$ survivors) — turning a quadratic-in-$n$ search into one over a constant 1024-mask space.
- A correct disjointness test (`mask1 & mask2 == 0`) and a correct argument that the distinct-index constraint is satisfied automatically.
- Either the $O(1024^2)$ mask-pair scan or the $O(10 \cdot 2^{10})$ SOS/complement approach, with overall complexity $O(n \log(\max a) + 2^{10} \cdot 10)$.
- Defined, documented behavior when no compatible pair exists.
---
### What a Strong Answer Covers
Across all three parts, a strong submission demonstrates these cross-cutting qualities:
- **Reduction to the right invariant.** Each problem rewards spotting a structural shortcut (matches = rounds survived; coverage = spanning-tree Euler tour; digit set = 10-bit mask) rather than brute force.
- **Complexity that respects the limits.** $n \le 200{,}000$ rules out $O(n^2)$ in Parts 1 and 3; the $\le 100{,}000$ move budget in Part 2 must be justified, not assumed.
- **Edge-case discipline.** $n = 1$ (Parts 1 and 3 — Part 1 has a lone champion with 0 matches; Part 3 has no pair), single-cell or 1-wide regions (Part 2), and the "no valid pair" sentinel (Part 3).
- **Clean, testable implementations** that match the worked examples in the prompt.
### Follow-up Questions
1. **Part 1:** Instead of $b[i]$, return for each player the skill value of the opponent who eliminated them (and a sentinel for the champion). How does your simulation change?
2. **Part 2:** For subtask 5, can you guarantee a covering walk of length **strictly less** than $2(V-1)$ in general, or is that bound tight? What grid shape forces it to be tight?
3. **Part 3:** Generalize to selecting $k$ numbers (not just two) that are pairwise digit-disjoint with maximum total sum. How does the mask-DP change, and what is the complexity?
4. **Part 3:** If the "no common digit" rule were replaced by "share at most one common digit," how would you adapt the mask approach?
Quick Answer: This set evaluates algorithmic problem-solving skills including array manipulation and simulation for knockout tournament dynamics, graph traversal and constructive movement-sequence design on grids, and broader competencies in handling permutations, connectivity, and constraint-driven constructions.
Solution
## Overview
Three independent problems. Each has a structural shortcut that turns an apparently expensive task into a linear or near-linear one:
| Part | Key insight | Complexity |
|------|-------------|-----------|
| 1 — Tournament | matches played = rounds survived; every alive player plays once per round | $O(n \log n)$ time, $O(n)$ space |
| 2 — Robot coverage | full coverage = DFS Euler tour of a spanning tree (each edge twice) | $O(V)$ time, walk length $\le 2(V-1)$ |
| 3 — Disjoint-digit sum | each number → 10-bit digit mask; only 1024 masks matter | $O(n \log A + 2^{10}\cdot 10)$ |
---
## Part 1 — Knockout Tournament Match Counts
### Insight
Number of matches a player plays = number of **rounds they survive into**. In each round, *every* player still alive plays exactly one match. So we never need a per-player formula: walk round by round, and for each alive player add 1 to its counter, then keep the per-pair winner. A player stops accumulating the moment they lose; the champion accumulates $\log_2 n$ matches (the most possible).
Because the alive set halves every round, total increments are $n + n/2 + n/4 + \dots = O(n)$, and there are $\log_2 n$ rounds, so the work is $O(n \log n)$ — trivially fine for $n \le 200{,}000$.
### Algorithm
1. `alive` = list of original indices `0..n-1` in order.
2. While `len(alive) > 1`:
- For every `idx` in `alive`: `b[idx] += 1` (they all play this round).
- Build `next` by taking, from each adjacent pair, the index with the larger skill, preserving order.
- `alive = next`.
3. Return `b`. (If $n = 1$ the loop never runs and `b = [0]` — a lone player plays no match.)
### Reference implementation (Python)
```python
def tournament_matches(a):
n = len(a)
b = [0] * n
alive = list(range(n))
while len(alive) > 1:
for idx in alive:
b[idx] += 1
nxt = []
for k in range(0, len(alive), 2):
i, j = alive[k], alive[k + 1]
nxt.append(i if a[i] > a[j] else j)
alive = nxt
return b
```
### Worked example (from the prompt)
```
a = [5, 1, 3, 2, 4, 6, 8, 7] (n = 8, k = 3)
b = [3, 1, 2, 1, 1, 2, 3, 1]
```
Trace: round 1 every player gets +1 → all become 1; winners of (5,1),(3,2),(4,6),(8,7) are indices 0,2,5,6 (skills 5,3,6,8). Round 2 those four get +1 → b becomes 2 for them; winners are 0 and 6 (5 beats 3, 8 beats 6). Round 3 those two get +1 → b = 3 for indices 0 and 6. Player 6 (skill 8) is champion with 3 matches; player 0 (skill 5) lost the final, also 3 matches. Matches the expected `[3,1,2,1,1,2,3,1]`.
### Alternative: divide-and-conquer
Equivalently, recurse on the bracket. `solve(segment)` returns the winning index of that segment; every index in the segment gets `+1` (it played a match at this level), then recurse on the left and right halves and have their two sub-winners play. Same $O(n \log n)$; the iterative round simulation is simpler and cache-friendlier and is preferred here.
### Edge cases
- $n = 1$: champion plays 0 matches (`b = [0]`).
- The guarantee that $n$ is a power of two means pairing is always exact — no byes.
- Skills are a permutation, so ties never occur; the `>` test is unambiguous.
---
## Part 2 — Robot Path Planning
### The unifying idea: DFS Euler tour
Subtasks 2–5 all demand visiting **every** floor cell of a connected region. One routine covers all of them: do a depth-first search over floor cells starting at `*`. When you descend into an unvisited neighbor, emit that direction's character; when the recursion returns, emit the **inverse** move to walk back (`^`↔`v`, `<`↔`>`). The emitted string is an Euler tour of the DFS spanning tree.
**Length bound.** Every spanning-tree edge is traversed exactly twice (once down, once back), so the walk length is $2 \cdot (V - 1)$ where $V$ is the number of floor cells. With the mandatory wall border, $V \le 48 \times 48 = 2304$, so the walk is at most $\approx 4606$ moves — two orders of magnitude under the $100{,}000$ cap. The robot never steps onto a wall or out of bounds because it only ever moves into a known floor neighbor (and back along the same edge).
Subtask 1 is the **only** one this does not solve directly: it forbids interior cells, so we walk the rectangle perimeter explicitly instead.
### Generic coverage routine (subtasks 2, 3, 4, 5)
```python
DIRS = [(-1, 0, '^'), (1, 0, 'v'), (0, -1, '<'), (0, 1, '>')]
INV = {'^': 'v', 'v': '^', '<': '>', '>': '<'}
def cover_all(grid):
floor, start = set(), None
for r, row in enumerate(grid):
for c, ch in enumerate(row):
if ch in '.*':
floor.add((r, c))
if ch == '*':
start = (r, c)
visited = {start}
out = []
# Iterative DFS to avoid recursion-depth limits (V can be ~2300).
stack = [(start, iter(DIRS))]
while stack:
(r, c), it = stack[-1]
advanced = False
for dr, dc, mv in it:
nr, nc = r + dr, c + dc
if (nr, nc) in floor and (nr, nc) not in visited:
visited.add((nr, nc))
out.append(mv)
stack.append(((nr, nc), iter(DIRS)))
advanced = True
break
if not advanced:
stack.pop()
if stack: # backtrack to parent
pr, pc = stack[-1][0]
# inverse of the move that brought us from parent to (r, c)
out.append(INV[_dir_char(pr, pc, r, c)])
return ''.join(out)
def _dir_char(fr, fc, tr, tc):
for dr, dc, mv in DIRS:
if fr + dr == tr and fc + dc == tc:
return mv
```
(The recursive form is shorter and equally correct for $V \le 2304$, but the iterative stack is safest given default recursion limits.)
This single function satisfies:
- **Subtask 2** (solid rectangle, any start) — full coverage, trivially.
- **Subtask 3** (dumbbell) — the single connecting cell is just a tree edge; DFS crosses it, covers the far rectangle, and backtracks.
- **Subtask 4** (simple path) — the spanning tree *is* the path; the walk goes to one end, backtracks past start to the other end. (If you want a shorter path here you could instead walk to one endpoint then sweep to the other, length $V-1 + (\text{offset})$, but the generic tour already fits the bound.)
- **Subtask 5** (arbitrary connected) — exactly what DFS Euler tour is for.
### Subtask 1 — rectangle perimeter walk
Interior cells are forbidden, so we cannot use `cover_all`. The start is the top-left corner of an $h \times w$ floor rectangle. Walk the border clockwise:
```python
def border_walk(h, w):
# start at top-left corner; visit only perimeter cells.
if h == 1:
return '>' * (w - 1) # single row: just sweep right
if w == 1:
return 'v' * (h - 1) # single column: just sweep down
out = []
out.append('>' * (w - 1)) # top edge, left -> right
out.append('v' * (h - 1)) # right edge, top -> bottom
out.append('<' * (w - 1)) # bottom edge, right -> left
out.append('^' * (h - 2)) # left edge back up (stop one short of start)
return ''.join(out)
```
This visits every perimeter cell exactly once and never enters the interior. Length $= 2(w-1) + 2(h-1) - 1 \le 2(50+50) \approx 200$, well within bounds. Degenerate $1 \times w$ / $h \times 1$ rectangles (whose "perimeter" is the whole strip) are handled by the early returns.
### Correctness and edge cases
- **Never leaves the floor:** every emitted move corresponds to a real floor-to-floor edge.
- **Single floor cell:** `cover_all` returns the empty string (start already visited) — valid.
- **Validation harness:** simulate the output string from the start, asserting each position stays on a floor cell, and confirm the set of visited cells equals the required target set (all floor cells for 2–5; exactly the perimeter for subtask 1). On the prompt's dumbbell and path examples this yields walks of length 40 and 22 respectively, both covering all cells.
---
## Part 3 — Maximum Sum of Two Numbers Without Shared Digits
### Insight: collapse each number to a 10-bit digit mask
The only thing that matters about a number for the disjointness test is **which digits 0–9 it contains** — a 10-bit mask. There are only $2^{10} = 1024$ possible masks, so after reduction the problem size is bounded by a constant regardless of $n$.
Two numbers are compatible iff `mask1 & mask2 == 0`. For each distinct mask keep only the **maximum** value with that mask (a larger value can only help maximize the sum). Note **no number has an empty mask** — even the value `0` contains the digit `0` — so a number is never disjoint with another copy of its own mask, and the $i \ne j$ requirement is satisfied automatically whenever the two masks are disjoint.
### Two implementations
**(a) Simple $O(1024^2)$ pair scan** — clear and fast enough:
```python
def max_disjoint_pair(a):
best = [-1] * 1024
for x in a:
m = digit_mask(x)
if x > best[m]:
best[m] = x
masks = [m for m in range(1024) if best[m] >= 0]
ans = -1
for i in range(len(masks)):
for j in range(i + 1, len(masks)):
if masks[i] & masks[j] == 0:
ans = max(ans, best[masks[i]] + best[masks[j]])
return ans
def digit_mask(x):
if x == 0:
return 1 # digit '0'
m = 0
while x:
m |= 1 << (x % 10)
x //= 10
return m
```
**(b) $O(10 \cdot 2^{10})$ Sum-Over-Subsets (SOS) DP** — for each mask, precompute the best value over all *subsets* of that mask; then for each present mask $m$, the best compatible partner has a mask that is a subset of the complement $\lnot m$:
```python
def max_disjoint_pair_sos(a):
FULL = (1 << 10) - 1
best = [-1] * 1024
for x in a:
m = digit_mask(x)
if x > best[m]:
best[m] = x
sub = best[:] # sub[mask] = max value over masks ⊆ mask
for bit in range(10):
for mask in range(1024):
if mask & (1 << bit):
lo = sub[mask ^ (1 << bit)]
if lo > sub[mask]:
sub[mask] = lo
ans = -1
for m in range(1024):
if best[m] < 0:
continue
other = sub[FULL ^ m] # best value disjoint from m
if other >= 0:
ans = max(ans, best[m] + other)
return ans
```
Both return identical results; (b) is asymptotically tighter but (a) is already trivial in practice.
### Complexity
Building `best` is $O(n \cdot D)$ where $D = O(\log_{10} A)$ digits per number; the pairing is $O(1024^2)$ (a) or $O(10 \cdot 2^{10})$ (b). Total dominated by the $O(n)$ scan over the input.
### Worked examples
| Input | Output | Why |
|-------|--------|-----|
| `[12, 340]` | `352` | masks `{1,2}`, `{3,4,0}` disjoint → $12+340$ |
| `[15, 51]` | `-1` | both mask `{1,5}` → share digits, no valid pair |
| `[1, 2, 3]` | `5` | `2`+`3` disjoint and largest |
| `[9, 99, 999]` | `-1` | all contain digit `9` |
| `[90, 12, 8]` | `102` | `90` (`{9,0}`) + `12` (`{1,2}`) disjoint → $102$ |
### Edge cases
- $n = 1$: no pair → return $-1$.
- All numbers pairwise share a digit → $-1$.
- Returning a sentinel (`-1`) for "no valid pair" is explicitly documented, as the prompt requires.
---
## Answers to the Follow-up Questions
1. **Part 1 — who eliminated whom.** Keep the round simulation, but when player `i` beats player `j` in a pair, record `eliminator_skill[j] = a[i]` (or the eliminator's original index). The champion never loses, so set their entry to a sentinel (e.g. `None`/`-1`). Same $O(n \log n)$, just one extra write per losing player.
2. **Part 2 — is $2(V-1)$ tight?** It is tight for *tree-shaped* regions. A region whose floor graph is a tree (e.g. the simple-path subtask, or any shape with no cycle) has exactly $V-1$ edges, and any covering walk that returns to traverse every branch must cross some edges twice; the Euler-tour bound $2(V-1)$ is essentially optimal there (you save only by ending at a far leaf instead of returning to start, i.e. $2(V-1) - \text{depth}$). Regions with cycles can be covered in fewer than $2(V-1)$ moves because a cycle lets you avoid re-traversing one edge per independent cycle. So no, you cannot guarantee strictly less than $2(V-1)$ in general — a path/tree forces it.
3. **Part 3 — choose $k$ pairwise-disjoint numbers.** Replace the 2-number pairing with a subset DP over masks: `dp[mask]` = max total sum achievable using numbers whose **union** of digits is exactly the subset `mask`, drawing at most one value per digit-mask. Process masks in increasing order; for each, try adding the best single value whose mask is a subset of the unused digits. For a fixed $k$, augment the state with the count (`dp[mask][t]` = best sum using `t` numbers covering digit set `mask`). The mask space is still $2^{10}$, so this is $O(2^{10} \cdot 1024 \cdot k)$ in the dense form, or $O(3^{10})$ via subset-sum enumeration — constant in $n$ after the initial reduction.
4. **Part 3 — allow at most one shared digit.** Disjointness `mask1 & mask2 == 0` becomes `popcount(mask1 & mask2) <= 1`. The constant-size mask reduction still applies, but the SOS shortcut no longer fits cleanly (the allowed set is no longer "subsets of a complement"). Fall back to the $O(1024^2)$ mask-pair scan with the relaxed test `bin(m1 & m2).count('1') <= 1`, which remains a constant $\approx 10^6$ operations.