PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/Coding & Algorithms/Upstart

Solve Five OA Coding Tasks

Last updated: Jun 21, 2026

Quick Overview

This multi-part problem evaluates programming fundamentals including string manipulation, list filtering and reversal, parsing, and simple geometric computations, with emphasis on correctness, edge-case handling, and implementation efficiency.

  • medium
  • Upstart
  • Coding & Algorithms
  • Software Engineer

Solve Five OA Coding Tasks

Company: Upstart

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Take-home Project

This is an online-assessment screen consisting of **five independent coding tasks**. Each task is self-contained — there is no shared state between them — and is graded against its own hidden test suite. Implement a function for every task below. The tasks escalate in subtlety: the first three are warm-up string/list manipulations, while the last two (digit-comparison and robots-file parsing) hide most of the edge cases that decide whether you pass. --- ### Constraints & Assumptions - Each task is graded independently; partial credit is awarded per task, so attempt all five even if one stumps you. - Inputs are well-formed for their type (Part 1 receives a string, Part 2/3 receive lists, Part 4 receives non-negative integers, Part 5 receives a list of strings). You do not need to validate types. - Part 3's input list is guaranteed non-empty. Coordinates may include negative or duplicate points. - Part 4's integers are non-negative and fit in a standard 64-bit integer; `statedSum` need not equal `num1 + num2` (that is the whole point). - Part 5's lines may contain leading/trailing whitespace, blank lines, comment-like lines, and `Disallow:`/`User-agent:` directives in any interleaving. Prefix matching (`"User-agent:"`, `"Disallow:"`) is exact and case-sensitive on the directive name. - Aim for linear time in the size of each input; none of these tasks needs better than $O(n)$. ### Clarifying Questions to Ask - General: Should every function **return** its result (not print it), and are there hidden tests for empty inputs (empty string, empty list, single line / single element)? - Part 1: Should a `'.'` in the middle (or at the start/end) of a group still be replaced before the trailing `'!'` is appended, and do tabs / newlines count as whitespace separators (not just the space character)? - Part 2: Is `0` treated as even, and may the input contain negative integers (which makes the language's modulo sign-convention relevant)? - Part 3: Is the required output `[minX, minY, width, height]` rather than `[minX, minY, maxX, maxY]`, and what are `width` / `height` when all points coincide? - Part 4: When `correct` and `statedSum` have different digit counts, do I pad the shorter conceptually with leading zeros and report **every** differing position — including high-order positions that exist only in the longer value? - Part 5: Must dedup preserve first-occurrence order across the whole file, and is matching truly case-sensitive for both the directive prefixes and the URL values? --- ### Part 1 — Add drama to text Given a string `s`, return a new string in which: - every period `'.'` is replaced with an exclamation mark `'!'`, and - one additional exclamation mark is appended after each **group**, where a *group* is a maximal contiguous run of non-whitespace characters. Whitespace between groups must be preserved exactly (the kind and amount of whitespace is unchanged). Only the non-whitespace runs gain a trailing `'!'`. Example: `"hi. there"` → `"hi!! there!"` (the first group `"hi."` becomes `"hi!"` after the period swap, then gains a trailing `!` → `"hi!!"`; the second group `"there"` gains a trailing `!`). ```hint Where to start Tokenize while remembering the gaps. Split the string into alternating *non-whitespace runs* and *whitespace runs* so you can transform each run and then re-stitch them in order — a regex like `\S+` for groups (or a single left-to-right scan tracking "am I currently inside a group?") makes the boundaries explicit. ``` ```hint Edge cases to nail Decide the behavior for: leading/trailing whitespace, an empty string, a string that is all whitespace (no groups → no extra `!`), and consecutive whitespace. The `'.'`→`'!'` replacement applies to *every* period, including ones in the middle of a group, **before** you append the per-group `!`. A lone `"."` is a group and becomes `"!!"`. ``` #### What a Strong Answer Covers - Tokenization that preserves whitespace runs verbatim (kind and amount) and only mutates the non-whitespace groups. - Correct ordering of the two transforms: replace `'.'`→`'!'` across the whole group, then append exactly one trailing `'!'`. - Boundary behavior: empty string, all-whitespace string, leading/trailing/consecutive whitespace, and a lone `"."`. ### Part 2 — Filter and reverse even numbers Given a list of integers `nums`, remove every odd number and return the remaining even numbers in the **reverse** of their original relative order. Example: `[1, 2, 3, 4, 6, 5]` → `[6, 4, 2]`. ```hint Approach Two independent steps: keep only the evens, then reverse the surviving order. Watch the modulo of negatives in your language (in Python `-4 % 2 == 0`, so negatives behave; in C/Java/JS `%` can return a negative remainder, so test the result against `0` rather than against `1`, or use a bitwise parity check). Remember `0` is even. ``` #### What a Strong Answer Covers - A correct parity test that survives negative inputs (compare `% 2 == 0`, never `== 1`, or use `(x & 1) == 0`) and treats `0` as even. - The output is reversed relative to the original even-element order, not the full-list order. - Empty / all-odd inputs return an empty list. ### Part 3 — Compute a bounding rectangle Given a non-empty list of coordinates, where each coordinate is a two-element list `[x, y]`, return the four-element list `[minX, minY, width, height]` where: - `minX` is the minimum x over all points, - `minY` is the minimum y over all points, - `width = maxX - minX`, - `height = maxY - minY`. ```hint Approach A single pass tracking `minX, maxX, minY, maxY` is enough — no need to materialize separate x/y lists. The output is `[minX, minY, maxX - minX, maxY - minY]`, **not** `[minX, minY, maxX, maxY]`; a single point (or all coincident points) yields width/height of `0`. ``` #### What a Strong Answer Covers - The output **shape** is `[minX, minY, width, height]`, where width/height are spans (differences), not the max coordinates. - A single-pass extrema scan with correct initialization (the non-empty guarantee makes the first point a safe seed). - Negative, duplicate, and coincident points (zero-area rectangle) handled correctly. ### Part 4 — Find incorrect digit positions in a sum Given three non-negative integers `num1`, `num2`, and `statedSum`: - Compute the correct value `correct = num1 + num2`. - Compare `statedSum` against `correct` **digit by digit, from the ones place upward**: index `0` is the ones digit, index `1` the tens, index `2` the hundreds, and so on. - Return the list of zero-based indexes where the two digits differ (in ascending order). - The two values may have different digit lengths. Treat any missing higher-order digit (a position present in one value but past the most-significant digit of the other) as `0`, and keep comparing until **every** digit position of **both** `correct` and `statedSum` has been examined. Example: `num1 = 99`, `num2 = 1`, `statedSum = 190` → `correct = 100`. Comparing per position: index 0 → `0` vs `0` (match), index 1 → `0` vs `9` (differ), index 2 → `1` vs `1` (match). Result: `[1]`. ```hint Where to start Don't convert to strings and zip — string indexing runs most-significant-digit-first, but the index here counts from the *ones* place. Think about how to read out the ones digit, then the tens, then the hundreds, of each number directly (peel with `% 10` and `// 10`), so position `0` is the first thing you compare. ``` ```hint Loop bound The two values can have different digit counts, so ask yourself when the comparison is actually finished — stopping as soon as the *shorter* number runs out silently drops real mismatches in the leading digits of the longer one (e.g. `correct = 100` vs `statedSum = 5` differs at index 2, the hundreds place, which you only see if you don't quit early). Treat a position that exists in one number but not the other as a digit `0` there. Also sanity-check the all-zero case `correct == statedSum == 0`. ``` #### What a Strong Answer Covers - Comparison anchored at the **ones** place (index 0), not at the most-significant digit — i.e. it does not string-zip from the left. - The loop runs to the **longer** of the two values, substituting `0` for the exhausted number's missing high-order digits. - The all-equal case returns `[]`, the all-zero case returns `[]`, and results come out in ascending index order. ### Part 5 — Extract disallowed URLs for bot user agents Given `lines`, a list of strings that together form the contents of a robots-style text file (one element per line), extract disallowed URLs as follows: - The file is divided into **sections**. A new section begins at any line that starts with the exact prefix `"User-agent:"`. The remainder of that line (after the prefix), trimmed of surrounding spaces, is the section's **user-agent value**. - A section is **relevant** if its user-agent value either is exactly `"*"` **or** ends with the exact suffix `"Bot"`. - Inside a relevant section, for each line that starts with the exact prefix `"Disallow:"`, collect the URL value — the remainder after the prefix, trimmed of surrounding spaces. - A section's scope ends when the next `"User-agent:"` line begins or the file ends. Lines belonging to a non-relevant section are ignored, as is any `"Disallow:"` that appears before the first `"User-agent:"` line. - Return all collected URLs with duplicates removed, **preserving the order of first occurrence**. Matching is **case-sensitive**, so `"/A"` and `"/a"` are distinct URLs. ```hint State machine This is a one-pass scan where your state is "am I currently inside a section I care about?". Each `"User-agent:"` line opens a new section, so it's the moment to re-evaluate that state from the section's value; each `"Disallow:"` line is only interesting while the state says the current section qualifies. Be careful applying the relevance rule precisely: `"*Bot"` ends with `"Bot"` but `"Bot*"` does not, and `"*"` must match the whole value, not merely contain it. ``` ```hint Dedup with order A plain `set` removes duplicates but loses order, which the output requires. You need a way to skip a URL you've already emitted while still keeping each URL's *first* appearance in order (e.g. a `seen` set paired with a result list, or `list(dict.fromkeys(...))`). Because matching is case-sensitive, do **not** lowercase anything before comparing or storing. ``` #### What a Strong Answer Covers - A relevance flag recomputed on **every** `User-agent:` line, so a non-bot section suppresses its `Disallow:` lines and the next qualifying section re-enables harvesting. - Precise relevance matching: exact `"*"` and exact `"Bot"` **suffix**, both case-sensitive; pre-first-section `Disallow:` ignored. - Order-preserving, case-sensitive dedup of the collected URLs. --- ### What a Strong Answer Covers These dimensions span all five parts. - **Correctness on the stated examples** for all five tasks, plus the non-obvious edge cases each task hides (empty/whitespace-only input, single element, all-odd list, coincident points, length-mismatched numbers, interleaved/irrelevant robots sections). - **The two decisive edge cases** that most often separate full from partial credit: Part 4 driving the loop on the *longer* of the two values (not the shorter), and Part 5 preserving first-occurrence order while staying case-sensitive. - **Linear-time, single-pass** implementations where natural, with clear variable names and no quadratic re-scans. - **Defensive handling** of language-specific gotchas (negative modulo, off-by-one in digit indexing, regex vs. manual tokenization for groups). - **Self-tests / dry-runs** on the provided examples to catch mistakes before submitting. ### Follow-up Questions - Part 1: How would you generalize "drama" so the appended punctuation and the replaced character are configurable, and how would Unicode whitespace (full-width spaces, non-breaking spaces) change your tokenizer? - Part 4: If `num1`, `num2`, and `statedSum` were given as decimal *strings* far exceeding 64-bit range, how would your approach change, and how would you handle a comparison in a non-decimal base? - Part 5: Real `robots.txt` allows `Allow:` directives, longest-match precedence, wildcards (`*`, `$`), and grouping where consecutive `User-agent` lines share one rule block. How would you extend the parser, and how would you decide whether a given URL is crawlable for a specific agent? - General: How would you structure these five functions and their tests so the suite runs fast and each failure points unambiguously at the responsible task?

Quick Answer: This multi-part problem evaluates programming fundamentals including string manipulation, list filtering and reversal, parsing, and simple geometric computations, with emphasis on correctness, edge-case handling, and implementation efficiency.

Related Interview Questions

  • Find Maximum Eastbound City Visits and Parse CSV - Upstart (medium)
  • Implement Byte Formatting and Cafeteria Billing - Upstart (medium)
  • Implement Three Assessment Functions - Upstart (medium)
  • Solve Reported OA Coding Problems - Upstart (medium)
  • Decode an anagram sentence using vocabulary constraints - Upstart (medium)
Upstart logo
Upstart
Apr 10, 2026, 12:00 AM
Software Engineer
Take-home Project
Coding & Algorithms
9
0

This is an online-assessment screen consisting of five independent coding tasks. Each task is self-contained — there is no shared state between them — and is graded against its own hidden test suite. Implement a function for every task below.

The tasks escalate in subtlety: the first three are warm-up string/list manipulations, while the last two (digit-comparison and robots-file parsing) hide most of the edge cases that decide whether you pass.

Constraints & Assumptions

  • Each task is graded independently; partial credit is awarded per task, so attempt all five even if one stumps you.
  • Inputs are well-formed for their type (Part 1 receives a string, Part 2/3 receive lists, Part 4 receives non-negative integers, Part 5 receives a list of strings). You do not need to validate types.
  • Part 3's input list is guaranteed non-empty. Coordinates may include negative or duplicate points.
  • Part 4's integers are non-negative and fit in a standard 64-bit integer; statedSum need not equal num1 + num2 (that is the whole point).
  • Part 5's lines may contain leading/trailing whitespace, blank lines, comment-like lines, and Disallow: / User-agent: directives in any interleaving. Prefix matching ( "User-agent:" , "Disallow:" ) is exact and case-sensitive on the directive name.
  • Aim for linear time in the size of each input; none of these tasks needs better than O(n)O(n)O(n) .

Clarifying Questions to Ask

  • General: Should every function return its result (not print it), and are there hidden tests for empty inputs (empty string, empty list, single line / single element)?
  • Part 1: Should a '.' in the middle (or at the start/end) of a group still be replaced before the trailing '!' is appended, and do tabs / newlines count as whitespace separators (not just the space character)?
  • Part 2: Is 0 treated as even, and may the input contain negative integers (which makes the language's modulo sign-convention relevant)?
  • Part 3: Is the required output [minX, minY, width, height] rather than [minX, minY, maxX, maxY] , and what are width / height when all points coincide?
  • Part 4: When correct and statedSum have different digit counts, do I pad the shorter conceptually with leading zeros and report every differing position — including high-order positions that exist only in the longer value?
  • Part 5: Must dedup preserve first-occurrence order across the whole file, and is matching truly case-sensitive for both the directive prefixes and the URL values?

Part 1 — Add drama to text

Given a string s, return a new string in which:

  • every period '.' is replaced with an exclamation mark '!' , and
  • one additional exclamation mark is appended after each group , where a group is a maximal contiguous run of non-whitespace characters.

Whitespace between groups must be preserved exactly (the kind and amount of whitespace is unchanged). Only the non-whitespace runs gain a trailing '!'.

Example: "hi. there" → "hi!! there!" (the first group "hi." becomes "hi!" after the period swap, then gains a trailing ! → "hi!!"; the second group "there" gains a trailing !).

What a Strong Answer Covers

  • Tokenization that preserves whitespace runs verbatim (kind and amount) and only mutates the non-whitespace groups.
  • Correct ordering of the two transforms: replace '.' → '!' across the whole group, then append exactly one trailing '!' .
  • Boundary behavior: empty string, all-whitespace string, leading/trailing/consecutive whitespace, and a lone "." .

Part 2 — Filter and reverse even numbers

Given a list of integers nums, remove every odd number and return the remaining even numbers in the reverse of their original relative order.

Example: [1, 2, 3, 4, 6, 5] → [6, 4, 2].

What a Strong Answer Covers

  • A correct parity test that survives negative inputs (compare % 2 == 0 , never == 1 , or use (x & 1) == 0 ) and treats 0 as even.
  • The output is reversed relative to the original even-element order, not the full-list order.
  • Empty / all-odd inputs return an empty list.

Part 3 — Compute a bounding rectangle

Given a non-empty list of coordinates, where each coordinate is a two-element list [x, y], return the four-element list [minX, minY, width, height] where:

  • minX is the minimum x over all points,
  • minY is the minimum y over all points,
  • width = maxX - minX ,
  • height = maxY - minY .

What a Strong Answer Covers

  • The output shape is [minX, minY, width, height] , where width/height are spans (differences), not the max coordinates.
  • A single-pass extrema scan with correct initialization (the non-empty guarantee makes the first point a safe seed).
  • Negative, duplicate, and coincident points (zero-area rectangle) handled correctly.

Part 4 — Find incorrect digit positions in a sum

Given three non-negative integers num1, num2, and statedSum:

  • Compute the correct value correct = num1 + num2 .
  • Compare statedSum against correct digit by digit, from the ones place upward : index 0 is the ones digit, index 1 the tens, index 2 the hundreds, and so on.
  • Return the list of zero-based indexes where the two digits differ (in ascending order).
  • The two values may have different digit lengths. Treat any missing higher-order digit (a position present in one value but past the most-significant digit of the other) as 0 , and keep comparing until every digit position of both correct and statedSum has been examined.

Example: num1 = 99, num2 = 1, statedSum = 190 → correct = 100. Comparing per position: index 0 → 0 vs 0 (match), index 1 → 0 vs 9 (differ), index 2 → 1 vs 1 (match). Result: [1].

What a Strong Answer Covers

  • Comparison anchored at the ones place (index 0), not at the most-significant digit — i.e. it does not string-zip from the left.
  • The loop runs to the longer of the two values, substituting 0 for the exhausted number's missing high-order digits.
  • The all-equal case returns [] , the all-zero case returns [] , and results come out in ascending index order.

Part 5 — Extract disallowed URLs for bot user agents

Given lines, a list of strings that together form the contents of a robots-style text file (one element per line), extract disallowed URLs as follows:

  • The file is divided into sections . A new section begins at any line that starts with the exact prefix "User-agent:" . The remainder of that line (after the prefix), trimmed of surrounding spaces, is the section's user-agent value .
  • A section is relevant if its user-agent value either is exactly "*" or ends with the exact suffix "Bot" .
  • Inside a relevant section, for each line that starts with the exact prefix "Disallow:" , collect the URL value — the remainder after the prefix, trimmed of surrounding spaces.
  • A section's scope ends when the next "User-agent:" line begins or the file ends. Lines belonging to a non-relevant section are ignored, as is any "Disallow:" that appears before the first "User-agent:" line.
  • Return all collected URLs with duplicates removed, preserving the order of first occurrence . Matching is case-sensitive , so "/A" and "/a" are distinct URLs.

What a Strong Answer Covers

  • A relevance flag recomputed on every User-agent: line, so a non-bot section suppresses its Disallow: lines and the next qualifying section re-enables harvesting.
  • Precise relevance matching: exact "*" and exact "Bot" suffix , both case-sensitive; pre-first-section Disallow: ignored.
  • Order-preserving, case-sensitive dedup of the collected URLs.

What a Strong Answer Covers

These dimensions span all five parts.

  • Correctness on the stated examples for all five tasks, plus the non-obvious edge cases each task hides (empty/whitespace-only input, single element, all-odd list, coincident points, length-mismatched numbers, interleaved/irrelevant robots sections).
  • The two decisive edge cases that most often separate full from partial credit: Part 4 driving the loop on the longer of the two values (not the shorter), and Part 5 preserving first-occurrence order while staying case-sensitive.
  • Linear-time, single-pass implementations where natural, with clear variable names and no quadratic re-scans.
  • Defensive handling of language-specific gotchas (negative modulo, off-by-one in digit indexing, regex vs. manual tokenization for groups).
  • Self-tests / dry-runs on the provided examples to catch mistakes before submitting.

Follow-up Questions

  • Part 1: How would you generalize "drama" so the appended punctuation and the replaced character are configurable, and how would Unicode whitespace (full-width spaces, non-breaking spaces) change your tokenizer?
  • Part 4: If num1 , num2 , and statedSum were given as decimal strings far exceeding 64-bit range, how would your approach change, and how would you handle a comparison in a non-decimal base?
  • Part 5: Real robots.txt allows Allow: directives, longest-match precedence, wildcards ( * , $ ), and grouping where consecutive User-agent lines share one rule block. How would you extend the parser, and how would you decide whether a given URL is crawlable for a specific agent?
  • General: How would you structure these five functions and their tests so the suite runs fast and each failure points unambiguously at the responsible task?

Solution

Show

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More Upstart•More Software Engineer•Upstart Software Engineer•Upstart Coding & Algorithms•Software Engineer Coding & Algorithms
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.