Implement chance of a personal best
Company: SoFi
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Technical Screen
You are given a simplified model for tracking race times on an obstacle course.
- A **course** has a fixed number of obstacles.
- A **run** records the time spent on each obstacle in order.
- A run may be **complete** or **incomplete**.
- A **run collection** stores many runs for the same course.
For an obstacle index `i`, the historical times available for that obstacle come from **all other runs that reached obstacle `i`**, including incomplete runs.
Implement the method `chance_of_personal_best(run)` in `RunCollection`.
### Required behavior
Given an **in-progress** run:
1. Perform **10,000 simulation trials**.
2. In each trial, for every obstacle not yet completed in the current run, randomly choose one historical time for that obstacle from the other runs in the collection that have a recorded time for that obstacle.
3. Combine the current run's existing obstacle times with the sampled remaining times.
4. Check whether the simulated total time is **less than or equal to** the current **personal best**.
5. Return the fraction of trials that are a personal best (or tie the personal best).
### Important details
- The current personal best is the **minimum total time among completed runs only**.
- When sampling future obstacle times, include times from incomplete historical runs **if they reached that obstacle**.
- Do **not** sample from the in-progress run being evaluated.
- The returned value should be a floating-point probability, and with 10,000 trials it should usually be within **0.02** of the true probability.
### Example
Suppose a course has 3 obstacles and the historical runs are:
- `[3, 3, 2]`
- `[3, 3, 3]`
The current in-progress run is `[3, 3]`.
The personal best among completed runs is `8`.
For the last obstacle, the historical times are `{2, 3}`.
So the simulated completed run is either `[3,3,2]` or `[3,3,3]`, each with equal probability.
Therefore the chance of matching or beating the personal best is about `0.5`.
### Starter structure
Assume classes similar to the following already exist:
- `Course(title, obstacle_count)`
- `Run(course)` with fields:
- `course`
- `complete`
- `obstacle_times`
- `add_obstacle_time(t)`
- `get_run_time()`
- `RunCollection(course)` with field:
- `runs`
Write the `chance_of_personal_best` method for `RunCollection`.
Quick Answer: This question evaluates a candidate's ability to perform simulation-based probabilistic estimation, sample and aggregate historical per-obstacle times, and correctly handle incomplete run records within a collection.
You are tracking race times on an obstacle course.
- A **course** has a fixed number of obstacles (`obstacle_count`).
- A **run** records the time spent on each obstacle, in order. A run may be **complete** (all obstacles recorded) or **incomplete** (in progress / aborted).
- A **run collection** stores many runs for the same course. Historical times for obstacle `i` come from **every run that reached obstacle `i`**, including incomplete runs.
You are given an **in-progress** run and must compute the probability that it ends up being a *new personal best* or *ties* the current personal best, under this model: for every obstacle the in-progress run has not yet completed, its time is drawn **uniformly at random** from the times recorded for that obstacle across all other runs in the collection that reached it. The original framing runs 10,000 simulation trials; here you should return the **exact** probability that simulation converges to.
**Inputs to `solution`:**
- `obstacle_data`: a list of runs, each a list of ints (the obstacle times for that run, in order). A run is complete iff its length equals `obstacle_count`.
- `current_times`: a list of ints — the obstacle times the in-progress run has recorded so far (a prefix; not present in `obstacle_data`).
- `obstacle_count`: int, the number of obstacles on the course.
**Rules:**
1. The **personal best** is the minimum total time among **completed runs only**.
2. For each obstacle index `i` that the in-progress run has NOT completed (i.e. `i >= len(current_times)`), the pool of candidate times is `[run[i] for run in obstacle_data if len(run) > i]`.
3. Every combination of one sampled time per remaining obstacle is equally likely. Return the fraction of combinations for which `sum(current_times) + sum(sampled) <= personal_best`.
4. Return a float in `[0.0, 1.0]`.
**Example:** course of 3 obstacles, history `[[3,3,2],[3,3,3]]`, in-progress run `[3,3]`. Personal best = 8. The last obstacle's pool is `{2,3}`, so the completed run is `[3,3,2]` (=8, beats/ties) or `[3,3,3]` (=9, no), each with probability 1/2 -> answer `0.5`.
Constraints
- 1 <= obstacle_count
- 0 <= len(current_times) <= obstacle_count (the in-progress run is a prefix)
- At least one run in obstacle_data is complete (length == obstacle_count), so the personal best is defined.
- Every remaining obstacle index is reached by at least one historical run (otherwise the probability is 0).
- All obstacle times are positive integers.
Examples
Input: ([[3, 3, 2], [3, 3, 3]], [3, 3], 3)
Expected Output: 0.5
Explanation: Personal best among completed runs is 8 ([3,3,2]). The in-progress run [3,3] has obstacle 2 left; its pool is {2,3}. Total is 8 (<=8, counts) or 9 (no), each equally likely -> 0.5.
Input: ([[3, 3, 2, 3], [3, 3, 3, 2], [5, 5, 2]], [3, 3], 4)
Expected Output: 0.8333333333333334
Explanation: Completed runs total 11 and 11, so personal best = 11. Two obstacles remain. Obstacle 2 pool = [2,3,2] (the incomplete run [5,5,2] reached it), obstacle 3 pool = [3,2]. Of the 6 equally-likely (time2,time3) pairs, all give 6 + (time2+time3) <= 11 except (3,3) -> 12; 5/6 = 0.8333.
Input: ([[3, 3, 2], [3, 3, 3]], [3, 3, 2], 3)
Expected Output: 1.0
Explanation: The in-progress run is already complete with total 8, which ties the personal best of 8. No obstacles remain, so the single (empty) combination satisfies <= 8 -> probability 1.0.
Input: ([[10, 1]], [], 2)
Expected Output: 1.0
Explanation: Only one completed run [10,1] (total 11), so personal best = 11. The in-progress run has recorded nothing, so both obstacles are sampled; the only candidates are 10 and 1, giving 11 <= 11 with certainty -> 1.0.
Input: ([[5, 5, 5], [4, 4, 4]], [9], 3)
Expected Output: 0.0
Explanation: Personal best = min(15, 12) = 12. The in-progress run has fixed time 9 with two obstacles left; each remaining pool is {5,4}, so the minimum possible remaining sum is 4+4=8, giving total >= 17 > 12. No combination can tie or beat 12 -> 0.0.
Hints
- The personal best is the minimum total over COMPLETED runs only (length == obstacle_count); incomplete runs never set the personal best, but their recorded obstacle times ARE valid samples for the obstacles they reached.
- For obstacle index i, the candidate sample times are exactly run[i] for every run whose length is greater than i.
- Instead of 10,000 random trials, compute the exact probability: every combination of one candidate per remaining obstacle is equally likely, so the answer is (# combinations with current_fixed_sum + sampled_sum <= personal_best) / (total # combinations).
- Full enumeration is pool_size ^ remaining_obstacles in the worst case. Use a DP/convolution: keep a map from achievable remaining-sum to how many combinations produce it, and extend it one obstacle at a time.