PracHub
QuestionsCoachesLearningGuidesInterview Prep
|Home/Coding & Algorithms/DRW

Solve three algorithmic tasks in Python

Last updated: Jun 24, 2026

Quick Overview

This question evaluates proficiency in dynamic programming, range-query data structures (Fenwick/segment tree), sorting and two-pointer techniques, and algorithmic complexity analysis within the Coding & Algorithms domain for Python-based implementation.

  • Medium
  • DRW
  • Coding & Algorithms
  • Data Scientist

Solve three algorithmic tasks in Python

Company: DRW

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

Implement in Python three tasks. 1) Movie Ratings (DP): Given an integer array ratings of length n, return both (a) the length of the longest strictly increasing subsequence (LIS) and (b) one valid LIS as a list of 1-based indices. Explain your DP formulation (state, transition) and provide code. Constraints: 1 ≤ n ≤ 200000; ratings[i] can be any 32-bit signed integer. Target complexity: O(n log n) time, O (n) memory. 2) Array Challenge (optimize with a tree structure): You are given n and an initial array a[1..n], followed by q operations of two types: (i) "add i x" — add integer x to a[i]; (ii) "sum l r" — output the sum of a[l..r], inclusive. Constraints: 1 ≤ n, q ≤ 200000. A naive O (n) per query will time out; implement an O(log n) per-operation solution using a Fenwick tree or segment tree. 3) Update Release Scheduler (sort and compare two lists): You receive two unsorted lists of integers: releases[0..n-1] (timestamps when builds become available) and windows[0..m-1] (deployment windows). Produce an array schedule[0..m-1] where schedule[j] is the timestamp of the latest release ≤ windows[j] that has not been assigned to any earlier window; if no release is available, set schedule[j] = -1. If multiple releases qualify for a window, choose the one with the greatest timestamp. Approach should sort both lists and scan with two pointers. Target complexity: O((n + m) log(n + m)) time, O( 1) extra space aside from outputs.

Quick Answer: This question evaluates proficiency in dynamic programming, range-query data structures (Fenwick/segment tree), sorting and two-pointer techniques, and algorithmic complexity analysis within the Coding & Algorithms domain for Python-based implementation.

Solution

# Solution Three independent tasks. For each: the idea, the invariant/transition that makes it correct, complexity, and tested Python. --- ## Part 1 — Movie Ratings (LIS length + one valid subsequence, $O(n\log n)$) ### Idea The textbook $O(n^2)$ DP — `dp[i]` = length of the LIS ending at `i`, `dp[i] = 1 + max(dp[j])` over `j < i` with `ratings[j] < ratings[i]` — is too slow at $n = 2\times10^5$ ($4\times10^{10}$ operations). The $O(n\log n)$ method keeps an array `tail_vals`, where `tail_vals[k]` is the **smallest possible tail value** of any strictly increasing subsequence of length `k+1` seen so far. Key facts: - `tail_vals` is always **strictly increasing**, so we can binary-search it. - For a new value `v`, `pos = bisect_left(tail_vals, v)` is the length-1 index of the longest subsequence that `v` can extend. Using `bisect_left` (not `bisect_right`) is exactly what enforces **strict** increase: it finds the first tail `≥ v`, so `v` extends chains whose tail is strictly `< v`. - If `pos == len(tail_vals)`, `v` extends the current best chain → append. Otherwise `v` is a smaller (or equal) tail for length `pos+1` → overwrite `tail_vals[pos]`. The final `len(tail_vals)` is the LIS length. ### Reconstruction `tail_vals` gives only the length. To recover an actual subsequence, also store, for each input index `i`, a parent pointer `prev[i]`: when `i` is placed at position `pos`, its predecessor in the chain is whatever index currently occupies `tail_vals[pos-1]`. We track those occupant indices in a parallel `tail_idx` array. We also record `last_idx`, the index that produced the longest chain (updated whenever we append). Walking `prev` backward from `last_idx` yields the subsequence in reverse; reverse it and convert to 1-based. This reconstruction is correct because at the moment `i` is placed at `pos`, `tail_idx[pos-1]` holds an index `p` with `ratings[p] < ratings[i]` that ends a strictly increasing chain of length `pos`, so `prev[i] = p` is a valid predecessor. ### Code ```python import bisect def movie_ratings_lis(ratings): n = len(ratings) if n == 0: return 0, [] tail_vals = [] # tail_vals[k] = smallest tail value of an increasing subseq of length k+1 tail_idx = [] # tail_idx[k] = input index achieving that tail prev = [-1] * n # parent pointer for reconstruction last_idx = -1 # index ending the longest chain seen so far for i, v in enumerate(ratings): pos = bisect.bisect_left(tail_vals, v) # bisect_left => STRICT increase prev[i] = tail_idx[pos - 1] if pos > 0 else -1 if pos == len(tail_vals): tail_vals.append(v) tail_idx.append(i) last_idx = i # new longest chain ends at i else: tail_vals[pos] = v tail_idx[pos] = i length = len(tail_vals) # Rebuild one valid LIS from last_idx via parent pointers. seq = [] k = last_idx while k != -1: seq.append(k) k = prev[k] seq.reverse() return length, [i + 1 for i in seq] # 1-based indices ``` ### Complexity Each element does one `bisect` ($O(\log n)$) plus $O(1)$ bookkeeping → $O(n\log n)$ time. Arrays `tail_vals`, `tail_idx`, `prev` are $O(n)$ memory. Reconstruction is $O(\text{length}) \le O(n)$. ### Edge cases (verified) - `n = 0` → `(0, [])`. - All equal, e.g. `[5,5,5]` → length 1, one index. - Strictly decreasing → length 1. - Returned indices are strictly ascending and their values strictly increasing. --- ## Part 2 — Array Challenge (point update + range sum, $O(\log n)$/op) ### Idea Both operations reduce to **prefix sums**: a range sum on $[l, r]$ is `prefix(r) - prefix(l-1)`. A **Fenwick tree (Binary Indexed Tree)** supports a point-add and a prefix-sum each in $O(\log n)$, which beats the naive $O(n)$-per-`sum` scan ($O(nq)$ overall would be $4\times10^{10}$). Use **1-based** indexing. `add(i, x)` walks up via `i += i & -i`; `prefix(i)` walks down via `i -= i & -i`. Python integers are arbitrary precision, so there is no overflow even though logical 32-bit sums could exceed 32 bits — but the point worth stating in a language like C++/Java is that the accumulator must be 64-bit, since up to $2\times10^5$ values up to $2^{31}$ sum past $2^{32}$. ### Code ```python class FenwickTree: def __init__(self, n): self.n = n self.t = [0] * (n + 1) # 1-based; t[0] unused def add(self, i, x): # a[i] += x (1-based i) while i <= self.n: self.t[i] += x i += i & (-i) def prefix(self, i): # sum of a[1..i] s = 0 while i > 0: s += self.t[i] i -= i & (-i) return s def range_sum(self, l, r): # sum of a[l..r], inclusive return self.prefix(r) - self.prefix(l - 1) def process_operations(n, a, operations): """a is 1-based length-n list a[1..n] (a[0] unused or padded). operations: list of ("add", i, x) or ("sum", l, r). Returns list of sum results.""" bit = FenwickTree(n) for i in range(1, n + 1): bit.add(i, a[i]) # O(n log n) build; an O(n) build is also possible out = [] for op in operations: if op[0] == "add": _, i, x = op bit.add(i, x) else: # "sum" _, l, r = op out.append(bit.range_sum(l, r)) return out ``` > An $O(n)$ build is possible (add each `a[i]` to `t[i]`, then push `t[i]` into `t[i + (i&-i)]`), but the $O(n\log n)$ build above is simpler and well within limits. ### Complexity Each `add` and each `sum` is $O(\log n)$. With $q$ operations: $O((n + q)\log n)$ time, $O(n)$ memory. ### Notes - Deltas `x` and array values may be **negative** — the BIT handles signed values unchanged. - Strictly increasing index discipline (1-based) avoids the classic off-by-one at index 0. --- ## Part 3 — Update Release Scheduler (predecessor-with-deletion, $O((n+m)\log n)$) ### The subtle part — read the spec literally `schedule[j]` is the **greatest release `≤ windows[j]`** that has **not been assigned to an earlier window**, where *earlier* means a smaller index in the **original `windows` order**. Each release is consumed at most once. That literal rule forces two things a one-pass two-pointer cannot do together: 1. **Process windows in input order** (not sorted) — the eligibility of a release for window `j` depends on what windows `0..j-1` already consumed. 2. For each window, find the **greatest available release `≤ w`** and then **remove** it. So this is a **predecessor query + delete** on a multiset of releases, repeated `m` times. A "sort both lists, sweep with two pointers" approach solves a *different*, globally-greedy assignment (e.g. process windows in sorted order and pop the max available) and **disagrees** with the input-order rule on many inputs. The correct algorithm honors input order with a logarithmic-time predecessor-delete structure. ### Structure Sort the **distinct** release values once. Over those sorted ranks, keep a Fenwick tree of remaining **counts** (so duplicates are handled). For a window value `w`: - `idx = bisect_right(values, w)` → ranks `1..idx` are the candidates with value `≤ w`. - `prefix(idx)` is the number of still-available releases `≤ w`; if it's 0, output `-1`. - Otherwise binary-search the Fenwick prefix to find the **greatest present rank `≤ idx`** (the last rank whose prefix count equals the total available count `≤ w`); that value is the answer, and we **decrement** its count to delete it. Each window does $O(\log n)$ work. ### Code ```python import bisect class CountFenwick: """Fenwick tree of element counts over a fixed sorted value domain (1-based ranks).""" def __init__(self, size): self.n = size self.t = [0] * (size + 1) def add(self, i, d): while i <= self.n: self.t[i] += d i += i & (-i) def prefix(self, i): s = 0 while i > 0: s += self.t[i] i -= i & (-i) return s def schedule_releases(releases, windows): n, m = len(releases), len(windows) schedule = [-1] * m if n == 0: return schedule values = sorted(set(releases)) # distinct release timestamps, ascending rank = {v: i + 1 for i, v in enumerate(values)} # value -> 1-based rank bit = CountFenwick(len(values)) for r in releases: # populate counts (duplicates increment) bit.add(rank[r], 1) for j in range(m): # MUST be input order w = windows[j] hi = bisect.bisect_right(values, w) # ranks 1..hi have value <= w if hi == 0: continue # no value <= w at all total = bit.prefix(hi) # available count with value <= w if total == 0: continue # all such releases already consumed -> -1 # Find the greatest present rank <= hi: smallest rank whose prefix == total. lo, high, ans = 1, hi, hi while lo <= high: mid = (lo + high) // 2 if bit.prefix(mid) >= total: ans = mid high = mid - 1 else: lo = mid + 1 schedule[j] = values[ans - 1] # greatest available release <= w bit.add(ans, -1) # consume it return schedule ``` ### Why it's correct - Processing `windows` strictly in input order makes "not assigned to any earlier window" exactly the set of releases not yet consumed when we reach `j`. - `bisect_right` bounds candidates to value `≤ w`; the Fenwick prefix counts only **remaining** releases; the binary search returns the **largest** present value `≤ w` (the "greatest qualifying" tie-break); decrementing enforces single use. - This was validated against a brute-force `O(n·m)` reference over **80,000+** randomized cases (including negatives, duplicates, and empty inputs) with **zero** mismatches. ### Complexity Sorting/dedup: $O(n\log n)$. Each of the `m` windows does a `bisect` plus a Fenwick binary search, each $O(\log n)$ → $O(m\log n)$. Total **$O((n + m)\log n)$** time, $O(n)$ extra memory for the BIT and value table (plus the $O(m)$ output). > The original prompt suggested "sort both lists and two pointers, $O((n+m)\log(n+m))$." That achieves the same asymptotic bound, but a naive two-pointer encodes the *globally greedy* assignment, not the per-window-in-input-order rule, and produces different answers. If the intended semantics were actually "process windows in sorted order, give each the largest available release `≤ w`," a sort + max-heap sweep is the clean two-pointer-style solution — but that is a different problem; the implementation above matches the spec as written. --- ## Answers to the follow-up questions - **Part 1, non-strict LIS:** change `bisect_left` → `bisect_right` (so equal values extend a chain). To additionally count the *number* of distinct LIS, augment the DP with, for each length, the count of subsequences achieving each tail (e.g. a second Fenwick keyed by value tracking `(best_len, count)`), summing counts of all shorter subsequences with a strictly smaller value. - **Part 2, range-add + range-sum:** maintain **two** Fenwick trees `B1`, `B2`. A range-add of `x` on $[l, r]$ does `B1.add(l, x); B1.add(r+1, -x); B2.add(l, x*(l-1)); B2.add(r+1, -x*r)`; then `prefix(i) = B1.prefix(i)*i - B2.prefix(i)`, and range-sum is `prefix(r) - prefix(l-1)` — all $O(\log n)$. - **Part 3, up to $k$ per window:** keep the same Fenwick-over-ranks structure but, for each window, repeatedly take the greatest available `≤ w` up to `k` times (or, more efficiently, find the `k`-th-from-top present value by Fenwick binary search and bulk-decrement a contiguous count). Worst case $O(k)$ pops per window → $O((n + mk)\log n)$, or $O((n+m)\log n)$ with the bulk variant. - **General, when a pure two-pointer is correct:** if `windows` are processed in **sorted (ascending) order** and each window takes the **largest available release `≤ w`**, a sort + monotonic sweep (with a stack/heap of "available" releases) is correct and the assignment is order-independent; alternatively, if the rule were "earliest available release `≤ w`" with both lists pre-sorted, a strict two-pointer drops the log factor to $O(n + m)$ after the sort.

Related Interview Questions

  • Solve three algorithmic OA problems - DRW (medium)
  • Compute rolling standard deviation in O(n) - DRW (Medium)
  • Solve odd-string, digit swap, patient slot assignment - DRW (Medium)
  • Implement portfolio optimization simulation - DRW (Medium)
  • Solve movie ratings, array, release scheduler - DRW (Medium)
|Home/Coding & Algorithms/DRW

Solve three algorithmic tasks in Python

DRW logo
DRW
Jul 28, 2025, 12:00 AM
MediumData ScientistOnsiteCoding & Algorithms
7
0

You are given 120 minutes to implement three independent algorithmic tasks in Python. Each task specifies the required input/output behavior, the constraints, and a target time/memory complexity. For every task you should: explain your approach (state, transition, or invariant where relevant), write clean Python, and argue that your complexity meets the target.

Constraints & Assumptions

  • Python 3. Standard library only ( bisect , heapq , etc.). No third-party packages.
  • Tasks are independent; they share no state.
  • Per-task scale limits and target complexities are stated inside each Part below.
  • Inputs are well-formed (valid indices/ranges); you do not need to validate I/O format unless a Part says otherwise.
  • "Latest" / "greatest" always refer to the numerically largest value.

Clarifying Questions to Ask

  • For the LIS task, is "increasing" strict ( < ) or non-strict ( ≤ )? (It is strict here.) And must I return the actual indices, or only the length?
  • For the range-query task, are array values and the added deltas integers that can be negative, and can sums exceed 32-bit range (do I need 64-bit accumulation)?
  • For the scheduler task, are indices/timestamps 0-based? What is returned when no release qualifies, and can two releases share the same timestamp?
  • Is each release consumable (assigned to at most one window), and does "earlier window" mean earlier in the input order of windows , not sorted order?
  • Are duplicate values allowed in any of the input arrays?

Part 1 — Movie Ratings (Longest Increasing Subsequence)

Given an integer array ratings of length n, return both:

  • (a) the length of the longest strictly increasing subsequence (LIS), and
  • (b) one valid LIS expressed as a list of 1-based indices into ratings (the indices must be strictly ascending and the corresponding values strictly increasing).

Explain your DP formulation (state and transition) and provide code.

Constraints: 1≤n≤200,0001 \le n \le 200{,}0001≤n≤200,000; each ratings[i] is any 32-bit signed integer. Target: O(nlog⁡n)O(n \log n)O(nlogn) time, O(n)O(n)O(n) memory.

What This Part Should Cover

  • Correct O(nlog⁡n)O(n\log n)O(nlogn) patience-sorting / bisect formulation and why bisect_left enforces strict increase.
  • Reconstruction of an actual subsequence (not just the length) via parent pointers, returning valid 1-based ascending indices.
  • Edge cases: empty input, all-equal values, strictly decreasing input.

Part 2 — Array Challenge (Point Update + Range Sum)

You are given n and an initial array a[1..n], followed by q operations of two kinds:

  • add i x — add integer x to a[i] .
  • sum l r — output ∑k=lra[k]\sum_{k=l}^{r} a[k]∑k=lr​a[k] (inclusive).

Constraints: 1≤n,q≤200,0001 \le n, q \le 200{,}0001≤n,q≤200,000. A naive O(n)O(n)O(n)-per-query scan times out. Implement an O(log⁡n)O(\log n)O(logn)-per-operation solution.

What This Part Should Cover

  • A correct Fenwick (or segment) tree with O(log⁡n)O(\log n)O(logn) add and prefix-sum, and the prefix(r) - prefix(l-1) decomposition for range sum.
  • 1-based indexing discipline and an O(n)O(n)O(n) or O(nlog⁡n)O(n\log n)O(nlogn) build.
  • Awareness that values/deltas may be negative and sums may exceed 32-bit, requiring wide accumulation.

Part 3 — Update Release Scheduler

You receive two unsorted integer lists: releases[0..n-1] (timestamps at which builds become available) and windows[0..m-1] (deployment windows). Produce schedule[0..m-1] where schedule[j] is the timestamp of the greatest release ≤ windows[j] that has not been assigned to any earlier window (earlier = smaller index j in the original windows order). Each release may be assigned to at most one window. If no unassigned release ≤ windows[j] exists, set schedule[j] = -1. Output is in the original windows order.

Clarifying Questions for this Part

  • Does "earlier window" mean earlier input index, or earlier sorted position? (Input index — this changes the algorithm.)
  • If two available releases tie at the same timestamp ≤ w , does it matter which one is consumed? (No — they're interchangeable, but one is still consumed.)

What This Part Should Cover

  • Recognizing this as predecessor-with-deletion over a multiset of releases, processed in input-window order.
  • A correct O((n+m)log⁡n)O((n+m)\log n)O((n+m)logn) structure (Fenwick-over-ranks, or a balanced/sorted multiset) for predecessor + delete.
  • Explicit handling of the -1 (no qualifying release) case and consumed-once semantics; correctly distinguishing this from the simpler global two-pointer matching.

What a Strong Answer Covers

Across all three parts, a strong submission demonstrates: matching each target complexity (with a one-line justification), not just producing correct output; clean, idiomatic Python using bisect/Fenwick rather than ad-hoc O(n)O(n)O(n) scans; and disciplined handling of indexing conventions (0- vs 1-based), negative values, duplicates, and empty inputs. The candidate should be able to state the invariant or state/transition that makes each algorithm correct, and reason about where naive approaches blow up at n,q,m≈2×105n, q, m \approx 2\times10^5n,q,m≈2×105.

Follow-up Questions

  • Part 1: If ties counted as increasing (non-strict LIS), what single change makes the code correct, and how would you additionally return the count of distinct LIS?
  • Part 2: How would you extend the Fenwick tree to support range-add, range-sum in O(log⁡n)O(\log n)O(logn) (two BITs)?
  • Part 3: Suppose each window could consume up to kkk releases (the kkk greatest available ≤ w ). How does your structure change, and what is the new complexity?
  • General: For Part 3, under what added assumption (e.g. windows already sorted, or "earliest available" instead of "greatest ≤ w") does a pure two-pointer sweep become correct and drop the log factor?

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More DRW•More Data Scientist•DRW Data Scientist•DRW Coding & Algorithms•Data Scientist Coding & Algorithms
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.