How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at OpenAI.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at OpenAI during technical interviews.

Generate Data Labeling Schedules | OpenAI Coding Question

Quick Overview

This question evaluates combinatorial reasoning and constructive algorithm design for generating constrained assignment schedules, including ensuring each human appears a minimum number of times, preventing duplicate human-task pairs, and managing valid ID ranges.

Generate Data Labeling Schedules

Company: OpenAI

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

A data labeling platform manages tasks, AI models, and human annotators. Each entity has a 1-indexed string ID. You are given four integers: `totalTask`, `totalModel`, `totalHuman`, and `k`. - Tasks are labeled `"t1"`, `"t2"`, ..., `"t{totalTask}"`. - Models are labeled `"m1"`, `"m2"`, ..., `"m{totalModel}"`. - Human annotators are labeled `"h1"`, `"h2"`, ..., `"h{totalHuman}"`. An assignment is represented as `[task, model, human]`, meaning the specified human annotates the specified task using the specified model. A task may appear in multiple assignments, so the same task can be annotated by multiple humans and paired with multiple models. Return any valid schedule as a list of assignments such that: 1. Each human appears in at least `k` assignments. 2. The same human cannot annotate the same task more than once. 3. Every assignment uses valid task, model, and human IDs within the given ranges. If no valid schedule exists, return an empty list. Constraints for the basic version: - `1 <= totalTask, totalModel, totalHuman <= 100` - `0 <= k <= totalTask` Example 1: Input: ```text totalTask = 3, totalModel = 1, totalHuman = 2, k = 2 ``` One valid output: ```text [["t1", "m1", "h1"], ["t2", "m1", "h2"], ["t2", "m1", "h1"], ["t3", "m1", "h2"]] ``` Explanation: Each human annotates 2 distinct tasks. Human `"h1"` works on `"t1"` and `"t2"`, while `"h2"` works on `"t2"` and `"t3"`. No human annotates the same task twice. Example 2: Input: ```text totalTask = 1, totalModel = 1, totalHuman = 1, k = 1 ``` One valid output: ```text [["t1", "m1", "h1"]] ``` Example 3: Input: ```text totalTask = 2, totalModel = 3, totalHuman = 3, k = 2 ``` One valid output: ```text [["t1", "m1", "h1"], ["t2", "m1", "h2"], ["t1", "m1", "h3"], ["t2", "m1", "h1"], ["t1", "m1", "h2"], ["t2", "m1", "h3"]] ``` Follow-up: return a prefix-balanced schedule. A prefix is the first `x` assignments in the schedule for any `x >= 1`. For example, a schedule with 4 assignments has 4 prefixes: the first assignment, the first 2 assignments, the first 3 assignments, and the full schedule. The balance requirements below must hold for every prefix, not only for the final schedule. A schedule is prefix-balanced if it satisfies both conditions for every prefix: 1. Task-model balance: For each task, the number of times that task is assigned to each model must differ by at most 1, including models with zero uses. Formally, for every task, `max_count(task, model) - min_count(task, model) <= 1`. 2. Human-model balance: For each human, the number of times that human uses each model must differ by at most 1, including models with zero uses. Formally, for every human, `max_count(human, model) - min_count(human, model) <= 1`. Return any valid prefix-balanced schedule. If no such schedule exists, return an empty list. Constraints for the follow-up: - `1 <= totalTask, totalModel, totalHuman <= 10` - `0 <= k <= totalTask` Follow-up example 1: Input: ```text totalTask = 3, totalModel = 2, totalHuman = 4, k = 2 ``` One valid output: ```text [["t1", "m1", "h1"], ["t1", "m2", "h2"], ["t1", "m1", "h3"], ["t1", "m2", "h4"], ["t2", "m2", "h1"], ["t2", "m1", "h2"], ["t2", "m2", "h3"], ["t2", "m1", "h4"]] ``` Follow-up example 2: Input: ```text totalTask = 1, totalModel = 1, totalHuman = 1, k = 1 ``` One valid output: ```text [["t1", "m1", "h1"]] ``` Follow-up example 3: Input: ```text totalTask = 3, totalModel = 1, totalHuman = 3, k = 2 ``` One valid output: ```text [["t1", "m1", "h1"], ["t1", "m1", "h2"], ["t1", "m1", "h3"], ["t2", "m1", "h1"], ["t2", "m1", "h2"], ["t2", "m1", "h3"]] ```

Quick Answer: This question evaluates combinatorial reasoning and constructive algorithm design for generating constrained assignment schedules, including ensuring each human appears a minimum number of times, preventing duplicate human-task pairs, and managing valid ID ranges.

Part 1: Generate Any Valid Labeling Schedule

Constraints

1 <= totalTask, totalModel, totalHuman <= 100
0 <= k <= totalTask

Examples

Input: (3, 1, 2, 2)

Expected Output: [['t1', 'm1', 'h1'], ['t2', 'm1', 'h1'], ['t2', 'm1', 'h2'], ['t3', 'm1', 'h2']]

Explanation: Each human gets 2 distinct tasks. The reference solution rotates the starting task by human index.

Input: (1, 1, 1, 1)

Expected Output: [['t1', 'm1', 'h1']]

Explanation: Single task, single model, single human.

Input: (2, 3, 3, 2)

Expected Output: [['t1', 'm1', 'h1'], ['t2', 'm1', 'h1'], ['t2', 'm1', 'h2'], ['t1', 'm1', 'h2'], ['t1', 'm1', 'h3'], ['t2', 'm1', 'h3']]

Explanation: Even though there are 3 models, using only 'm1' is still valid.

Input: (4, 2, 3, 0)

Expected Output: []

Explanation: If k = 0, no assignments are required, so the empty schedule is valid.

Hints

You are not required to use every model. Reusing one valid model is enough.
To give each human k distinct tasks, try rotating task indices by the human number.

Part 2: Generate a Prefix-Balanced Labeling Schedule

You are given four integers totalTask, totalModel, totalHuman, and k. Tasks are named 't1' through 't{totalTask}', models are named 'm1' through 'm{totalModel}', and humans are named 'h1' through 'h{totalHuman}'. An assignment is a list [taskId, modelId, humanId]. Return a schedule that satisfies all rules from Part 1 and is also prefix-balanced. A prefix is the first x assignments for any x >= 1. For every prefix: (1) for each task, the number of times that task is assigned to each model must differ by at most 1, including models with zero uses; and (2) for each human, the number of times that human uses each model must differ by at most 1, including models with zero uses. Also, each human must appear in at least k assignments, no human-task pair may repeat, and all IDs must be valid. If no valid schedule exists, return []. Any valid answer is acceptable in general; the expected outputs below match the reference solution.

Constraints

1 <= totalTask, totalModel, totalHuman <= 10
0 <= k <= totalTask

Examples

Input: (3, 2, 4, 2)

Expected Output: [['t1', 'm1', 'h1'], ['t1', 'm2', 'h2'], ['t1', 'm1', 'h3'], ['t1', 'm2', 'h4'], ['t2', 'm2', 'h1'], ['t2', 'm1', 'h2'], ['t2', 'm2', 'h3'], ['t2', 'm1', 'h4']]

Explanation: The first task uses models in alternating order across humans, and the second round shifts the model pattern so each human stays balanced too.

Input: (1, 1, 1, 1)

Expected Output: [['t1', 'm1', 'h1']]

Explanation: With only one model, every prefix is automatically balanced.

Input: (3, 1, 3, 2)

Expected Output: [['t1', 'm1', 'h1'], ['t1', 'm1', 'h2'], ['t1', 'm1', 'h3'], ['t2', 'm1', 'h1'], ['t2', 'm1', 'h2'], ['t2', 'm1', 'h3']]

Explanation: One model makes the balance conditions trivial, so assigning humans round by round is valid.

Input: (4, 3, 2, 3)

Expected Output: [['t1', 'm1', 'h1'], ['t1', 'm2', 'h2'], ['t2', 'm2', 'h1'], ['t2', 'm3', 'h2'], ['t3', 'm3', 'h1'], ['t3', 'm1', 'h2']]

Explanation: Each human cycles through models, and each task block is also balanced across models for every prefix.

Input: (5, 3, 2, 0)

Expected Output: []

Explanation: If k = 0, the empty schedule already satisfies all requirements.

Hints

For one fixed task or one fixed human, a balanced sequence over models looks like a repeating cycle where counts stay as equal as possible.
Try building the schedule in k rounds, giving every human exactly one new task per round and choosing models in a cyclic pattern.

Quick Overview