Build User Journey Path Summary
Company: Whatnot
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Technical Screen
Implement a function `build_summary(logs) -> str` that summarizes user journeys as a trie-like path tree.
You are given a list of action logs. Each log has the form:
`(user_id: int, time: int, action: str)`
Each user may have multiple logs. For each user, sort that user's logs by `time` in ascending order to obtain the user's action sequence. Then build a tree where each root-to-node path represents an action-prefix taken by at least one user. Each node stores the number of distinct users whose journey reached that exact prefix.
Rules:
- Logs may arrive unordered, so sort actions by time per user.
- Each user's journey contributes once to every prefix it traverses.
- Repeated actions are valid and represent distinct steps. For example, `A -> B -> B` is different from `A -> B` and should create a child `B` under the first `A -> B` node.
- Empty input should return an empty string.
- A single user with a single action should return a one-node tree.
- For deterministic output, print children in the order they are first created while inserting user journeys. Process users in ascending `user_id` order after constructing each user's sorted sequence.
- If two logs for the same user have the same timestamp, preserve their original input order.
Output format:
- Return a multiline string.
- Use one line per node.
- Print the action and count as `ACTION (COUNT)`.
- Root-level nodes have no prefix.
- Non-root nodes are prefixed by two spaces per depth level followed by `-> `.
- Print the tree using depth-first preorder traversal.
Example:
Input:
```python
logs = [
(100, 1000, "A"),
(300, 1150, "A"),
(200, 1200, "B"),
(100, 1200, "B"),
(100, 1300, "C"),
(200, 1400, "A"),
(300, 1500, "B"),
(300, 1550, "B"),
(100, 1560, "D"),
]
```
Per-user sequences after sorting by time:
```text
user 100: A -> B -> C -> D
user 200: B -> A
user 300: A -> B -> B
```
Expected output:
```text
A (2)
-> B (2)
-> C (1)
-> D (1)
-> B (1)
B (1)
-> A (1)
```
Additional test cases:
```python
build_summary([])
# ""
build_summary([(1, 100, "X")])
# "X (1)"
build_summary([
(1, 100, "A"),
(1, 200, "B"),
(2, 100, "A"),
(2, 200, "B"),
])
# "A (2)\n -> B (2)"
build_summary([
(1, 300, "C"),
(1, 100, "A"),
(1, 200, "B"),
])
# "A (1)\n -> B (1)\n -> C (1)"
```
Quick Answer: This question evaluates the ability to transform unordered user action logs into a trie-like path tree, exercising skills in grouping and time-sorting per user, maintaining stable insertion order, counting distinct users per prefix, and producing deterministic preorder string output.
Implement `solution(logs)` with the same behavior as `build_summary(logs) -> str`. You are given a list of action logs, where each log is a tuple `(user_id, time, action)`. First, group logs by user and sort each user's logs by ascending `time`. If two logs for the same user have the same timestamp, preserve their original input order. Each user's sorted actions form that user's journey. Then process users in ascending `user_id` order and insert each journey into a trie-like path tree. Every node represents one action in a prefix and stores how many distinct users reached that exact prefix. Repeated actions are valid distinct steps, so `A -> B -> B` is different from `A -> B`. For deterministic output, children must be printed in the order they were first created during insertion. Return the tree as a multiline string using depth-first preorder traversal. Print each node as `ACTION (COUNT)`. Root-level nodes have no prefix. For depth `d > 0`, prefix the line with two spaces repeated `d` times, then `-> `. If `logs` is empty, return an empty string.
Constraints
- 0 <= len(logs) <= 200000
- Each log is a tuple `(user_id, time, action)` where `user_id` and `time` are integers and `action` is a non-empty string
- If two logs belong to the same user and have the same timestamp, their relative order from the input must be preserved
- The number of created trie nodes is at most the total number of logs
Examples
Input: ([(100, 1000, 'A'), (300, 1150, 'A'), (200, 1200, 'B'), (100, 1200, 'B'), (100, 1300, 'C'), (200, 1400, 'A'), (300, 1500, 'B'), (300, 1550, 'B'), (100, 1560, 'D')],)
Expected Output: "A (2)\n -> B (2)\n -> C (1)\n -> D (1)\n -> B (1)\nB (1)\n -> A (1)"
Explanation: User 100 contributes `A -> B -> C -> D`, user 200 contributes `B -> A`, and user 300 contributes `A -> B -> B`. Counts are accumulated per prefix, and children are printed in creation order.
Input: ([],)
Expected Output: ""
Explanation: With no logs, no nodes are created, so the summary is an empty string.
Input: ([(1, 100, 'X')],)
Expected Output: "X (1)"
Explanation: A single user with one action creates a one-node tree.
Input: ([(1, 100, 'A'), (1, 200, 'B'), (2, 100, 'A'), (2, 200, 'B')],)
Expected Output: "A (2)\n -> B (2)"
Explanation: Both users follow the same journey, so the same trie path is shared and each prefix count becomes 2.
Input: ([(1, 300, 'C'), (1, 100, 'A'), (1, 200, 'B')],)
Expected Output: "A (1)\n -> B (1)\n -> C (1)"
Explanation: The user's logs arrive unordered, so they must be sorted by time before insertion.
Input: ([(1, 100, 'B'), (1, 100, 'A')],)
Expected Output: "B (1)\n -> A (1)"
Explanation: The two logs have the same timestamp for the same user, so their original input order is preserved: `B` then `A`.
Input: ([(2, 10, 'A'), (1, 10, 'B')],)
Expected Output: "B (1)\nA (1)"
Explanation: Users are inserted in ascending `user_id` order, so user 1 creates root node `B` before user 2 creates root node `A`, even though user 2 appears first in the input.
Hints
- Store each log's original index before sorting so equal timestamps for the same user stay in input order.
- After building one ordered action list per user, insert those lists into a trie and increment the count along each visited node. An insertion-ordered map/dictionary helps keep child output deterministic.