How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a hard difficulty Coding & Algorithms question, commonly asked during Onsite rounds at OpenAI.

What role is this question designed for?

This question is commonly asked for Machine Learning Engineer candidates at OpenAI during technical interviews.

Schedule Incremental Labeling Tasks | OpenAI Coding Question

Quick Overview

This question evaluates skills in designing stateful incremental schedulers, fairness and load-balancing algorithms, deterministic tie-breaking, and efficient data structures for persistent assignment tracking within the coding and algorithms domain for machine learning engineering roles.

Schedule Incremental Labeling Tasks

Company: OpenAI

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Onsite

You are building a data-labeling platform. Each day, a batch of task IDs arrives. A task ID may reappear on later days because the platform may request additional independent labels for the same underlying item. For every task in the current day, assign exactly one human labeler and one model. Implement an incremental scheduler with persistent state across days, for example: `class Scheduler:` ` def __init__(self, humans: list[str], models: list[str]): ...` ` def schedule(self, tasks: list[str]) -> list[tuple[str, str, str]]: ...` Return one `(task_id, human_id, model_id)` triple per task. Requirements: - The sets of humans and models are fixed at initialization. - Each daily batch contains distinct task IDs. - Each human can handle at most one task per day. - Models have no daily capacity limit. - The same human must never be assigned the same task ID more than once across all days. - Historical assignments remain in effect; `schedule()` must update state incrementally instead of recomputing all previous days from scratch. - The long-run distribution should stay as balanced as possible across humans and across models. To make the problem deterministic, among all valid assignments prefer the one that: 1. minimizes the maximum difference in total assignments between any two humans, 2. then minimizes the maximum difference in total assignments between any two models, 3. then uses lexicographically smaller human IDs and model IDs as tie-breakers. Assume `len(tasks) <= len(humans)` for every day. Discuss the data structures needed to support efficient incremental updates over many days.

Quick Answer: This question evaluates skills in designing stateful incremental schedulers, fairness and load-balancing algorithms, deterministic tie-breaking, and efficient data structures for persistent assignment tracking within the coding and algorithms domain for machine learning engineering roles.

Part 1: Incremental Fair Task Scheduler

You are building a labeling platform. Past assignments are fixed, and today a new batch of task IDs arrives. For each task, you must assign exactly one human labeler and one model. A schedule is valid if: 1. Each human is assigned to at most one task today. 2. A human cannot be assigned to a task they have labeled before in history. 3. Every task in new_tasks is assigned. To keep the system balanced incrementally, choose today human assignments using this rule: - Among all valid schedules, minimize the sum of historical assignment counts of the chosen humans. - If multiple schedules tie, choose the lexicographically smallest sequence of human IDs in task order. After the humans are fixed, assign models independently in task order: - For each task, choose the model with the smallest current usage count, where current means history plus the models already assigned earlier today. - Break model ties lexicographically. Return the list of assignments as tuples (task, human, model) in the same order as new_tasks. If no full schedule is possible, return an empty list.

Constraints

1 <= len(humans) <= 15
0 <= len(new_tasks) <= len(humans)
If len(new_tasks) > 0, then len(models) >= 1
0 <= len(history) <= 5000
Human IDs and model IDs are distinct strings within their own lists

Examples

Input: (['H1', 'H2', 'H3'], ['M1', 'M2'], [('t_old1', 'H1', 'M1'), ('t_old2', 'H1', 'M2'), ('t_old3', 'H2', 'M1')], ['t4', 't5'])

Expected Output: [('t4', 'H2', 'M2'), ('t5', 'H3', 'M1')]

Explanation: Historical human loads are H1=2, H2=1, H3=0, so H2 and H3 are chosen. For models, M2 is used first because it has smaller historical load, then M1 wins the tie.

Input: (['Ann', 'Bob'], ['X'], [('taskA', 'Ann', 'X')], ['taskA'])

Expected Output: [('taskA', 'Bob', 'X')]

Explanation: Ann already labeled taskA in history, so Bob must take it.

Input: (['A', 'B'], ['M1', 'M2'], [('t1', 'A', 'M1'), ('t1', 'B', 'M2')], ['t1'])

Expected Output: []

Explanation: Both humans already labeled t1 before, so no valid assignment exists.

Input: (['A'], ['M'], [], [])

Expected Output: []

Explanation: Edge case: there are no new tasks to schedule.

Input: (['H1'], ['M1'], [], ['t1', 't2'])

Expected Output: []

Explanation: A human can complete at most one task today, so two tasks cannot be assigned with only one human.

Hints

Because each human can be used at most once today, the human-assignment step is a matching-style problem.
Use the historical number of assignments as the cost of choosing a human, then use bitmask DP to find the minimum-cost valid assignment with lexicographic tie-breaking.

Part 2: Monster Team Battle with Revival

You control Team A and want to defeat Team B. Each monster is represented as [hp, attack, revive_hp]. - hp: starting health - attack: damage dealt every round - revive_hp: if greater than 0, then the first time this monster's hp becomes 0 or less, it immediately revives once with revive_hp health and continues fighting Battle rules: 1. Team B fights in the given order. 2. Team A may choose which unused monster to send whenever it needs a new active monster. 3. The two active monsters fight in repeated rounds. 4. Damage is simultaneous each round. 5. If a monster dies and still has its one revive available, it revives immediately after that round. 6. A surviving monster keeps its remaining hp and remaining revive status for the next opponent. Return the maximum number of Team A monsters still alive after all Team B monsters have been eliminated. If Team A cannot eliminate all of Team B, return -1. If the last remaining monsters on both sides die in the same round and Team B is fully eliminated, the result is 0.

Constraints

0 <= len(team_a), len(team_b) <= 8
1 <= hp, attack <= 30 for every monster
0 <= revive_hp <= 30
Team B order is fixed; Team A may choose the next unused monster whenever its current monster is gone

Examples

Input: ([[12, 3], [4, 10]], [[9, 4], [10, 3]])

Expected Output: 1

Explanation: Send [12, 3] first. It ties with the first enemy, then [4, 10] defeats the second and survives. The other order leaves 0 survivors.

Input: ([[3, 1]], [[5, 5]])

Expected Output: -1

Explanation: Your only monster dies in round 1, and the enemy still has hp left.

Input: ([[2, 2], [5, 1]], [])

Expected Output: 2

Explanation: There are no enemies, so all Team A monsters are still alive.

Input: ([[5, 5]], [[5, 5]])

Expected Output: 0

Explanation: Both monsters die in the same round, and Team B is fully eliminated.

Input: ([[10, 5]], [[8, 4]])

Expected Output: 1

Explanation: Team A defeats the enemy in 2 rounds and survives.

Hints

A state only needs to remember which Team A monsters are still unused, the current partially damaged Team A monster if any, and the current partially damaged Team B monster.
Many battle states repeat. Memoize both duel outcomes between two current monsters and the larger DP over battle states.

Quick Overview