PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/Coding & Algorithms/Glean

Simulate document assignment to indexers

Last updated: Mar 29, 2026

Quick Overview

This question evaluates simulation of parallel worker scheduling, time-based queue processing, modular assignment logic, and aggregation/ranking (top-k and percentage) competencies for tracking per-worker workloads.

  • hard
  • Glean
  • Coding & Algorithms
  • Software Engineer

Simulate document assignment to indexers

Company: Glean

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Technical Screen

You are given a system with `m` parallel indexers (workers), numbered from `0` to `m-1`. Documents arrive over time and are placed in a processing queue. Each document, when processed by an indexer, occupies that indexer exclusively for a given processing duration. Document assignment works as follows (assume 0-based indexing for documents): 1. Document `i` arrives at time `queue_time[i]` and requires `processing_time[i]` units of processing time. 2. Let `base = i % m`. First, try to assign the document to indexer `base`. 3. An indexer `j` is **available** for a document arriving at time `t` if its current processing (if any) completes no later than time `t`. 4. If indexer `base` is **available** at time `queue_time[i]`, assign the document to indexer `base`. 5. Otherwise, scan indexers in increasing order of indexer number, wrapping around modulo `m`: `(base + 1) % m`, `(base + 2) % m`, ..., `(base + m - 1) % m`. Assign the document to the **first** indexer found that is available at `queue_time[i]`. 6. If **no** indexer is available at `queue_time[i]`, then this document is **dropped** and never processed. 7. Once a document `i` is assigned to indexer `j` at time `queue_time[i]`, that indexer becomes busy until time `queue_time[i] + processing_time[i]`. You are given: - An integer `m` — the number of indexers. - An integer array `queue_time` of length `n`, where `queue_time[i]` is the time when document `i` is queued. The array is in strictly increasing order (i.e., `queue_time[i] < queue_time[i+1]`). - An integer array `processing_time` of length `n`, where `processing_time[i]` is the duration needed to process document `i`. Assume `n = len(queue_time) = len(processing_time)`. **Step 1** Design an algorithm that, using the above scheduling rules, computes: 1. The total number of documents that are processed successfully (i.e., not dropped). 2. The indexer that processed the most documents. If multiple indexers are tied for the maximum number of processed documents, you may return any one of them. **Step 2** Extend your algorithm to also compute, for a given integer `k` (where `1 \le k \le m`): 1. The set of the top `k` indexers that processed the largest number of documents. If there are ties, any order or any valid choice among tied indexers is acceptable. 2. The percentage of all documents (including those dropped) that were processed by these top `k` indexers, as a value between 0% and 100%. You should: - Describe the algorithm to perform the simulation and compute all required outputs. - Aim for an efficient solution in terms of time and space complexity. --- **Example** Input: - `m = 3` - `queue_time = [1, 2, 3, 7]` - `processing_time = [5, 4, 3, 2]` - Suppose `k = 2` for Step 2. Processing simulation: - Document 0 arrives at time 1, `base = 0`. All indexers are free. Assign to indexer 0, which will be busy until time `1 + 5 = 6`. - Document 1 arrives at time 2, `base = 1`. Indexer 1 is free. Assign to indexer 1, busy until `2 + 4 = 6`. - Document 2 arrives at time 3, `base = 2`. Indexer 2 is free. Assign to indexer 2, busy until `3 + 3 = 6`. - Document 3 arrives at time 7, `base = 0`. Indexer 0 is free (available from time 6). Assign to indexer 0, busy until `7 + 2 = 9`. All documents are successfully processed. Indexer 0 processed 2 documents; indexers 1 and 2 each processed 1 document. Output: - Documents processed successfully: `4` - Indexer that processed most documents: `0` - Top 2 indexers (any valid order with ties): e.g., `[0, 1]` or `[0, 2]` - Percentage of documents processed by these 2 indexers: `75%` (3 out of 4 documents)

Quick Answer: This question evaluates simulation of parallel worker scheduling, time-based queue processing, modular assignment logic, and aggregation/ranking (top-k and percentage) competencies for tracking per-worker workloads.

Related Interview Questions

  • Implement a Byte Pair Encoding (BPE) Tokenizer - Glean (medium)
  • Return Top Department Suggestions - Glean (medium)
  • Implement Rate-Limited Wikipedia Crawler - Glean (medium)
  • Find Earliest Train Route - Glean (medium)
  • Search Words in a Character Grid - Glean (hard)
Glean logo
Glean
Nov 19, 2025, 12:00 AM
Software Engineer
Technical Screen
Coding & Algorithms
33
0

You are given a system with m parallel indexers (workers), numbered from 0 to m-1. Documents arrive over time and are placed in a processing queue. Each document, when processed by an indexer, occupies that indexer exclusively for a given processing duration.

Document assignment works as follows (assume 0-based indexing for documents):

  1. Document i arrives at time queue_time[i] and requires processing_time[i] units of processing time.
  2. Let base = i % m . First, try to assign the document to indexer base .
  3. An indexer j is available for a document arriving at time t if its current processing (if any) completes no later than time t .
  4. If indexer base is available at time queue_time[i] , assign the document to indexer base .
  5. Otherwise, scan indexers in increasing order of indexer number, wrapping around modulo m : (base + 1) % m , (base + 2) % m , ..., (base + m - 1) % m . Assign the document to the first indexer found that is available at queue_time[i] .
  6. If no indexer is available at queue_time[i] , then this document is dropped and never processed.
  7. Once a document i is assigned to indexer j at time queue_time[i] , that indexer becomes busy until time queue_time[i] + processing_time[i] .

You are given:

  • An integer m — the number of indexers.
  • An integer array queue_time of length n , where queue_time[i] is the time when document i is queued. The array is in strictly increasing order (i.e., queue_time[i] < queue_time[i+1] ).
  • An integer array processing_time of length n , where processing_time[i] is the duration needed to process document i .

Assume n = len(queue_time) = len(processing_time).

Step 1 Design an algorithm that, using the above scheduling rules, computes:

  1. The total number of documents that are processed successfully (i.e., not dropped).
  2. The indexer that processed the most documents. If multiple indexers are tied for the maximum number of processed documents, you may return any one of them.

Step 2 Extend your algorithm to also compute, for a given integer k (where 1 \le k \le m):

  1. The set of the top k indexers that processed the largest number of documents. If there are ties, any order or any valid choice among tied indexers is acceptable.
  2. The percentage of all documents (including those dropped) that were processed by these top k indexers, as a value between 0% and 100%.

You should:

  • Describe the algorithm to perform the simulation and compute all required outputs.
  • Aim for an efficient solution in terms of time and space complexity.

Example

Input:

  • m = 3
  • queue_time = [1, 2, 3, 7]
  • processing_time = [5, 4, 3, 2]
  • Suppose k = 2 for Step 2.

Processing simulation:

  • Document 0 arrives at time 1, base = 0 . All indexers are free. Assign to indexer 0, which will be busy until time 1 + 5 = 6 .
  • Document 1 arrives at time 2, base = 1 . Indexer 1 is free. Assign to indexer 1, busy until 2 + 4 = 6 .
  • Document 2 arrives at time 3, base = 2 . Indexer 2 is free. Assign to indexer 2, busy until 3 + 3 = 6 .
  • Document 3 arrives at time 7, base = 0 . Indexer 0 is free (available from time 6). Assign to indexer 0, busy until 7 + 2 = 9 .

All documents are successfully processed. Indexer 0 processed 2 documents; indexers 1 and 2 each processed 1 document.

Output:

  • Documents processed successfully: 4
  • Indexer that processed most documents: 0
  • Top 2 indexers (any valid order with ties): e.g., [0, 1] or [0, 2]
  • Percentage of documents processed by these 2 indexers: 75% (3 out of 4 documents)

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More Glean•More Software Engineer•Glean Software Engineer•Glean Coding & Algorithms•Software Engineer Coding & Algorithms
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.