Implement staircase printing and distributed mode/median
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Onsite
Quick Answer: This set of problems evaluates implementation correctness and attention to formatting for a staircase-printing routine alongside distributed-algorithms reasoning for computing a global mode/median, testing algorithmic thinking, I/O handling, synchronization, and communication-cost trade-offs in the Coding & Algorithms domain.
Part 1: Staircase Pattern with No Trailing Spaces
Constraints
- 1 <= n <= 1000
- Each output row must use exactly one space between adjacent stars
- No trailing spaces are allowed
Examples
Input: 1
Expected Output: ['*']
Explanation: With one row, the staircase contains only a single star.
Input: 4
Expected Output: ['*', '* *', '* * *', '* * * *']
Explanation: Each row adds one more star, and stars are separated by exactly one space.
Input: 0
Expected Output: []
Explanation: A staircase with zero rows is an empty list.
Input: 2
Expected Output: ['*', '* *']
Explanation: The first row has one star and the second row has two stars separated by one space.
Solution
def solution(n):
result = []
for i in range(1, n + 1):
result.append(' '.join(['*'] * i))
return resultTime complexity: O(n^2). Space complexity: O(n^2).
Hints
- Build each row independently, then join the row items with ' ' so you never create a trailing space.
- The i-th row should contain exactly i copies of the same symbol.
Part 2: Consecutive-Integer Staircase with No Trailing Spaces
Constraints
- 1 <= n <= 1000
- Numbers start at 1 and increase by 1 each time
- No trailing spaces are allowed
Examples
Input: 1
Expected Output: ['1']
Explanation: There is only one row, so it contains just the first number.
Input: 4
Expected Output: ['1', '2 3', '4 5 6', '7 8 9 10']
Explanation: The numbers continue consecutively across rows: 1, then 2 3, then 4 5 6, then 7 8 9 10.
Input: 0
Expected Output: []
Explanation: With zero rows, the staircase is empty.
Input: 5
Expected Output: ['1', '2 3', '4 5 6', '7 8 9 10', '11 12 13 14 15']
Explanation: Each row has one more number than the previous row, and numbering continues without resetting.
Solution
def solution(n):
if n <= 0:
return []
result = []
current = 1
for row_length in range(1, n + 1):
row = []
for _ in range(row_length):
row.append(str(current))
current += 1
result.append(" ".join(row))
return resultTime complexity: O(n^2). Space complexity: O(n^2).
Hints
- Keep a running number that starts at 1 and increases every time you place a value in the staircase.
- Build each row as a list of strings, then use ' '.join(...) to format it cleanly.
Part 3: Distributed Mode and Median with Communication-Aware Strategy Selection
Constraints
- 1 <= len(partitions) <= 10
- 0 <= total number of integers <= 200000
- Each integer fits in 32-bit signed range
- If value_range is provided, it should be a tuple (lo, hi) with lo <= hi
Examples
Input: ([[1, 1, 1, 1, 2, 2], [2, 2, 2, 3, 3, 3], [1, 1, 3, 3, 3, 3]], (1, 3))
Expected Output: {'mode': 3, 'median': 2, 'strategy': 'dense_histogram', 'sent_bytes': 36, 'read_seconds': 2.4, 'network_seconds': 36, 'total_seconds': 38.4}
Explanation: Small known range makes a dense histogram cheaper than sending raw values or sparse maps.
Input: ([[1000, 1000, 1000, 1000, 1000, 1000, 1001], [1001, 1001, 1001, 1001, 1001, 1001, 1002], [1002, 1002, 1002, 1002, 1002, 1002]], None)
Expected Output: {'mode': 1001, 'median': 1001, 'strategy': 'sparse_counts', 'sent_bytes': 40, 'read_seconds': 2.8, 'network_seconds': 40, 'total_seconds': 42.8}
Explanation: The values are large and repeated, so sparse local counts are much cheaper than shipping all raw integers.
Input: ([[1, 2, 3], [4, 5], [6]], None)
Expected Output: {'mode': 1, 'median': 3.5, 'strategy': 'raw', 'sent_bytes': 24, 'read_seconds': 1.2, 'network_seconds': 24, 'total_seconds': 25.2}
Explanation: All values are unique, so sparse summaries are more expensive than just sending the raw integers.
Input: ([[], [], []], None)
Expected Output: {'mode': None, 'median': None, 'strategy': 'sparse_counts', 'sent_bytes': 0, 'read_seconds': 0.0, 'network_seconds': 0, 'total_seconds': 0.0}
Explanation: Edge case: no worker has any data.
Solution
from collections import Counter
def solution(partitions, value_range=None):
def median_from_counts(counts):
total = sum(counts.values())
if total == 0:
return None
left_pos = (total + 1) // 2
right_pos = (total + 2) // 2
running = 0
left_val = None
right_val = None
for value in sorted(counts):
running += counts[value]
if left_val is None and running >= left_pos:
left_val = value
if right_val is None and running >= right_pos:
right_val = value
break
if left_val == right_val:
return left_val
avg = (left_val + right_val) / 2
if avg.is_integer():
return int(avg)
return avg
total_counts = Counter()
max_read_bytes = 0
raw_bytes = 0
sparse_bytes = 0
dense_valid = isinstance(value_range, tuple) and len(value_range) == 2 and value_range[0] <= value_range[1]
if dense_valid:
lo, hi = value_range
range_size = hi - lo + 1
for part in partitions:
local_counts = Counter(part)
total_counts.update(local_counts)
local_bytes = len(part) * 4
raw_bytes += local_bytes
max_read_bytes = max(max_read_bytes, local_bytes)
sparse_bytes += len(local_counts) * 8
if dense_valid:
for value in local_counts:
if value < lo or value > hi:
dense_valid = False
break
candidates = [
(raw_bytes, 2, 'raw'),
(sparse_bytes, 1, 'sparse_counts')
]
if dense_valid:
dense_bytes = len(partitions) * range_size * 4
candidates.append((dense_bytes, 0, 'dense_histogram'))
sent_bytes, _, strategy = min(candidates, key=lambda item: (item[0], item[1]))
if not total_counts:
mode = None
median = None
else:
max_freq = max(total_counts.values())
mode = min(value for value, freq in total_counts.items() if freq == max_freq)
median = median_from_counts(total_counts)
read_seconds = round(max_read_bytes / 10.0, 1)
network_seconds = sent_bytes
total_seconds = round(read_seconds + network_seconds, 1)
return {
'mode': mode,
'median': median,
'strategy': strategy,
'sent_bytes': sent_bytes,
'read_seconds': read_seconds,
'network_seconds': network_seconds,
'total_seconds': total_seconds
}Time complexity: O(N + U log U), where N is the total number of integers and U is the number of distinct values globally. Space complexity: O(U).
Hints
- A global frequency map is enough to compute both the exact mode and the exact median, because the median only needs counts in sorted value order.
- Compare total bytes for all valid strategies before choosing one; the cheapest communication plan is not always the same.