Implement staircase printing and distributed mode/median
Company: Anthropic
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Onsite
Quick Answer: This set of problems evaluates implementation correctness and attention to formatting for a staircase-printing routine alongside distributed-algorithms reasoning for computing a global mode/median, testing algorithmic thinking, I/O handling, synchronization, and communication-cost trade-offs in the Coding & Algorithms domain.
Part 1: Staircase Pattern with No Trailing Spaces
Constraints
- 1 <= n <= 1000
- Each output row must use exactly one space between adjacent stars
- No trailing spaces are allowed
Examples
Input: 1
Expected Output: ['*']
Explanation: With one row, the staircase contains only a single star.
Input: 4
Expected Output: ['*', '* *', '* * *', '* * * *']
Explanation: Each row adds one more star, and stars are separated by exactly one space.
Input: 0
Expected Output: []
Explanation: A staircase with zero rows is an empty list.
Input: 2
Expected Output: ['*', '* *']
Explanation: The first row has one star and the second row has two stars separated by one space.
Solution
def solution(n):
result = []
for i in range(1, n + 1):
result.append(' '.join(['*'] * i))
return resultTime complexity: O(n^2). Space complexity: O(n^2).
Hints
- Build each row independently, then join the row items with ' ' so you never create a trailing space.
- The i-th row should contain exactly i copies of the same symbol.
Part 2: Consecutive-Integer Staircase with No Trailing Spaces
Constraints
- 1 <= n <= 1000
- Numbers start at 1 and increase by 1 each time
- No trailing spaces are allowed
Examples
Input: 1
Expected Output: ['1']
Explanation: There is only one row, so it contains just the first number.
Input: 4
Expected Output: ['1', '2 3', '4 5 6', '7 8 9 10']
Explanation: The numbers continue consecutively across rows: 1, then 2 3, then 4 5 6, then 7 8 9 10.
Input: 0
Expected Output: []
Explanation: With zero rows, the staircase is empty.
Input: 5
Expected Output: ['1', '2 3', '4 5 6', '7 8 9 10', '11 12 13 14 15']
Explanation: Each row has one more number than the previous row, and numbering continues without resetting.
Solution
def solution(n):
if n <= 0:
return []
result = []
current = 1
for row_length in range(1, n + 1):
row = []
for _ in range(row_length):
row.append(str(current))
current += 1
result.append(" ".join(row))
return resultTime complexity: O(n^2). Space complexity: O(n^2).
Hints
- Keep a running number that starts at 1 and increases every time you place a value in the staircase.
- Build each row as a list of strings, then use ' '.join(...) to format it cleanly.
Part 3: Distributed Mode and Median with Communication-Aware Strategy Selection
Constraints
- 1 <= len(partitions) <= 10
- 0 <= total number of integers <= 200000
- Each integer fits in 32-bit signed range
- If value_range is provided, it should be a tuple (lo, hi) with lo <= hi
Examples
Input: ([[1, 1, 1, 1, 2, 2], [2, 2, 2, 3, 3, 3], [1, 1, 3, 3, 3, 3]], (1, 3))
Expected Output: {'mode': 3, 'median': 2, 'strategy': 'dense_histogram', 'sent_bytes': 36, 'read_seconds': 2.4, 'network_seconds': 36, 'total_seconds': 38.4}
Explanation: Small known range makes a dense histogram cheaper than sending raw values or sparse maps.
Input: ([[1000, 1000, 1000, 1000, 1000, 1000, 1001], [1001, 1001, 1001, 1001, 1001, 1001, 1002], [1002, 1002, 1002, 1002, 1002, 1002]], None)
Expected Output: {'mode': 1001, 'median': 1001, 'strategy': 'sparse_counts', 'sent_bytes': 40, 'read_seconds': 2.8, 'network_seconds': 40, 'total_seconds': 42.8}
Explanation: The values are large and repeated, so sparse local counts are much cheaper than shipping all raw integers.
Input: ([[1, 2, 3], [4, 5], [6]], None)
Expected Output: {'mode': 1, 'median': 3.5, 'strategy': 'raw', 'sent_bytes': 24, 'read_seconds': 1.2, 'network_seconds': 24, 'total_seconds': 25.2}
Explanation: All values are unique, so sparse summaries are more expensive than just sending the raw integers.
Input: ([[], [], []], None)
Expected Output: {'mode': None, 'median': None, 'strategy': 'sparse_counts', 'sent_bytes': 0, 'read_seconds': 0.0, 'network_seconds': 0, 'total_seconds': 0.0}
Explanation: Edge case: no worker has any data.
Solution
from collections import Counter\n\ndef solution(partitions, value_range=None):\n def median_from_counts(counts):\n total = sum(counts.values())\n if total == 0:\n return None\n\n left_pos = (total + 1) // 2\n right_pos = (total + 2) // 2\n running = 0\n left_val = None\n right_val = None\n\n for value in sorted(counts):\n running += counts[value]\n if left_val is None and running >= left_pos:\n left_val = value\n if right_val is None and running >= right_pos:\n right_val = value\n break\n\n if left_val == right_val:\n return left_val\n\n avg = (left_val + right_val) / 2\n if avg.is_integer():\n return int(avg)\n return avg\n\n total_counts = Counter()\n max_read_bytes = 0\n raw_bytes = 0\n sparse_bytes = 0\n\n dense_valid = isinstance(value_range, tuple) and len(value_range) == 2 and value_range[0] <= value_range[1]\n if dense_valid:\n lo, hi = value_range\n range_size = hi - lo + 1\n\n for part in partitions:\n local_counts = Counter(part)\n total_counts.update(local_counts)\n\n local_bytes = len(part) * 4\n raw_bytes += local_bytes\n max_read_bytes = max(max_read_bytes, local_bytes)\n sparse_bytes += len(local_counts) * 8\n\n if dense_valid:\n for value in local_counts:\n if value < lo or value > hi:\n dense_valid = False\n break\n\n candidates = [\n (raw_bytes, 2, 'raw'),\n (sparse_bytes, 1, 'sparse_counts')\n ]\n\n if dense_valid:\n dense_bytes = len(partitions) * range_size * 4\n candidates.append((dense_bytes, 0, 'dense_histogram'))\n\n sent_bytes, _, strategy = min(candidates, key=lambda item: (item[0], item[1]))\n\n if not total_counts:\n mode = None\n median = None\n else:\n max_freq = max(total_counts.values())\n mode = min(value for value, freq in total_counts.items() if freq == max_freq)\n median = median_from_counts(total_counts)\n\n read_seconds = round(max_read_bytes / 10.0, 1)\n network_seconds = sent_bytes\n total_seconds = round(read_seconds + network_seconds, 1)\n\n return {\n 'mode': mode,\n 'median': median,\n 'strategy': strategy,\n 'sent_bytes': sent_bytes,\n 'read_seconds': read_seconds,\n 'network_seconds': network_seconds,\n 'total_seconds': total_seconds\n }Time complexity: O(N + U log U), where N is the total number of integers and U is the number of distinct values globally. Space complexity: O(U).
Hints
- A global frequency map is enough to compute both the exact mode and the exact median, because the median only needs counts in sorted value order.
- Compare total bytes for all valid strategies before choosing one; the cheapest communication plan is not always the same.