Implement expression expansion to plus-only form
Company: WeRide
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Given a string expression consisting only of lowercase letters (variables), '+', '*', '(', and ')', return an equivalent expression that uses only '+' by fully distributing multiplication over addition. Concatenation denotes multiplication and there are no spaces. Maintain variable order within each product according to their left-to-right appearance, and output terms in the natural left-to-right expansion order. Part 1 (no '+' outside parentheses): Expand expressions where '+' appears only inside parentheses. Examples: (a+b)*(c+d) -> ac+ad+bc+bd; a*(b+c)*(d+e+f)*g -> abdg+abeg+abfg+acdg+aceg+acfg. Implement an O(n) time solution using O(
1) extra space beyond the output (no stack), and explain why it works. Part 2 (general case): Expand expressions that may also contain '+' outside parentheses. Example: a+b*c*(d+e+f)+k+m*(g+h)+i -> a+bcd+bce+bcf+k+mg+mh+i. Implement working code that handles nested parentheses using an appropriate data structure (e.g., a stack) to track the current multiplication result segments, targeting O(n) time and O(n) extra space beyond the output. Analyze time and space complexities and discuss key edge cases.
Quick Answer: This question evaluates understanding of expression parsing and algebraic expansion via the distributive property, along with skills in string manipulation, stack-managed state for nested parentheses, and time/space complexity analysis.
Expand Expression to Plus-Only Form (No '+' Outside Parentheses)
You are given a string `expr` consisting only of lowercase letters (each letter is a variable), `'+'`, `'*'`, `'('`, and `')'`. Concatenation denotes multiplication (there are no spaces). Return an equivalent expression that uses only `'+'`, obtained by fully distributing multiplication over addition.
**Part 1 constraint:** `'+'` appears only *inside* parentheses (never at the top level outside any parentheses). So the whole expression is a single product of factors, where each factor is either a variable or a parenthesized sub-expression.
Within each output term, keep the variables in their left-to-right order of appearance, and emit the terms in the natural left-to-right expansion order.
Examples:
- `(a+b)*(c+d)` -> `ac+ad+bc+bd`
- `a*(b+c)*(d+e+f)*g` -> `abdg+abeg+abfg+acdg+aceg+acfg`
Key insight (why O(1) extra state suffices here): because there is no top-level `'+'`, the answer is one running product. Maintain a single list of partial terms (the current multiplication result). Each time you finish reading a factor (a variable or a parenthesized group of alternatives), replace the running result with the cartesian product `[a + b for a in result for b in factor]`. No stack of separate summands is needed because the top level never branches into a sum.
Implement a method `expand(expr)` that returns the fully-distributed, plus-only string.
Constraints
- 1 <= len(expr)
- expr contains only lowercase letters 'a'-'z', '+', '*', '(', ')'
- '+' only ever appears inside parentheses (Part 1)
- Parentheses are balanced; concatenation (or '*') denotes multiplication
- Variable order within a product follows left-to-right appearance
Examples
Input: ("(a+b)*(c+d)",)
Expected Output: "ac+ad+bc+bd"
Explanation: Distribute (a+b) over (c+d): a*c, a*d, b*c, b*d.
Input: ("a*(b+c)*(d+e+f)*g",)
Expected Output: "abdg+abeg+abfg+acdg+aceg+acfg"
Explanation: Cross-product of {a} x {b,c} x {d,e,f} x {g}, variables kept in left-to-right order; 1*2*3*1 = 6 terms.
Input: ("(a+b)*c",)
Expected Output: "ac+bc"
Explanation: A sum group multiplied by a lone variable.
Input: ("a*b*c",)
Expected Output: "abc"
Explanation: Pure product, no addition: a single term.
Input: ("a",)
Expected Output: "a"
Explanation: Edge case: a single variable returns itself.
Input: ("((a+b)*c)*(d+e)",)
Expected Output: "acd+ace+bcd+bce"
Explanation: Nested parentheses: inner ((a+b)*c) -> {ac, bc}, then crossed with {d, e}.
Hints
- Each factor is either a single variable -> ["x"], or a parenthesized group -> a list of alternative sub-terms.
- Multiplying two factors is the cartesian product: for every left term and every right term, concatenate them, preserving left-then-right order.
- Keep one running list of partial terms. Read factor after factor and replace the running list with its cross-product against the new factor. No stack is needed because the top level is a single product (no '+' outside parentheses).
Expand Expression to Plus-Only Form (General Case, '+' Allowed Outside Parentheses)
Follow-up / general case. You are given a string `expr` of lowercase letters (variables), `'+'`, `'*'`, `'('`, and `')'`. Concatenation denotes multiplication (no spaces). Return an equivalent expression using only `'+'`, obtained by fully distributing multiplication over addition.
Unlike Part 1, `'+'` may now appear **outside** parentheses as well, and parentheses may be **nested**. So the top level is itself a sum of products, and a product may contain parenthesized sub-sums to any depth.
Within each output term keep variables in left-to-right order of appearance, and emit terms in the natural left-to-right expansion order.
Example:
- `a+b*c*(d+e+f)+k+m*(g+h)+i` -> `a+bcd+bce+bcf+k+mg+mh+i`
Approach (why a stack is needed now): because `'+'` can appear at the top level (and at every parenthesis depth), the expression is a tree of sums and products. Use a recursive descent parser (which is exactly a stack via the call stack, or an explicit stack of "current multiplication-result segments"): `parse_sum` reads products separated by `'+'` and concatenates their term-lists; `parse_product` multiplies factors via cartesian product; `parse_factor` reads either a variable or a parenthesized `parse_sum`. On `'('` you push a new sub-context; on `')'` you pop and fold the inner sum back in as one factor. Each character is visited once.
Implement a method `expand(expr)` returning the fully-distributed, plus-only string. This is a strict generalization of Part 1 — it also handles every Part 1 input.
Constraints
- 1 <= len(expr)
- expr contains only lowercase letters 'a'-'z', '+', '*', '(', ')'
- '+' may appear inside OR outside parentheses
- Parentheses are balanced and may be nested to arbitrary depth
- Concatenation (or '*') denotes multiplication; variable order within a product follows left-to-right appearance
Examples
Input: ("a+b*c*(d+e+f)+k+m*(g+h)+i",)
Expected Output: "a+bcd+bce+bcf+k+mg+mh+i"
Explanation: Top-level sum: a stays; b*c*(d+e+f) -> bcd,bce,bcf; k stays; m*(g+h) -> mg,mh; i stays.
Input: ("a+b",)
Expected Output: "a+b"
Explanation: Top-level '+' outside any parentheses: two single-variable terms.
Input: ("a+(b+c)*d",)
Expected Output: "a+bd+cd"
Explanation: Mix of a bare top-level term and a distributed product (b+c)*d.
Input: ("(a+b)*(c+(d+e)*f)",)
Expected Output: "ac+adf+aef+bc+bdf+bef"
Explanation: Nested parentheses: inner (c+(d+e)*f) -> {c, df, ef}, then crossed with {a, b}.
Input: ("a",)
Expected Output: "a"
Explanation: Edge case: single variable.
Input: ("x+y+z",)
Expected Output: "x+y+z"
Explanation: Pure top-level sum, no multiplication: unchanged.
Input: ("a*(b+c)+d*(e+f)",)
Expected Output: "ab+ac+de+df"
Explanation: Two products joined by a top-level '+', each distributed independently.
Hints
- Top level is now a sum of products. parse_sum() reads a product, then while the next char is '+', reads another product and appends its term-list (sums concatenate term-lists).
- parse_product() multiplies factors: start with the first factor's term-list and replace it with the cartesian product against each following factor (skip an optional '*' between factors).
- parse_factor() handles '(' by recursing into parse_sum() (this recursion IS the stack tracking nested multiplication-result segments) and consuming the matching ')'; otherwise it reads one variable. Each character is consumed once.