Implement string-based big integer multiplication
Company: Pinterest
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
##### Question
Implement `multiplyStrings(a, b)` to multiply two non-negative integers given as decimal strings and return their product as a decimal string. You may **not** convert the full inputs to a built-in big-integer or fixed-width integer type, nor use any arbitrary-precision arithmetic library.
Your solution must:
1. Correctly multiply two non-negative integers represented as decimal strings and return the product as a string.
2. Handle edge cases: `"0"` (and any product that is zero), `"1"`, inputs of very different lengths, leading zeros in the inputs, and many trailing zeros. The result must not carry spurious leading zeros (e.g. `"00"` should never be returned).
3. Scale efficiently to very large inputs (e.g. strings up to a few thousand digits, or up to ~10^5 total digits) while managing carries correctly and avoiding quadratic *memory* blowups.
4. Discuss the algorithmic options and their trade-offs: grade-school long multiplication (`O(n·m)` time), Karatsuba (`O(n^1.585)`), and FFT/NTT-based convolution (`O(n log n)`). Implement at least one approach.
5. State the time and space complexity of your implementation.
6. Explain how you would test the solution, enumerating the edge cases above.
Quick Answer: Implement multiplyStrings(a, b) to multiply two non-negative integers given as decimal strings, returning the product as a string without any big-integer or arbitrary-precision library. The solution covers grade-school O(n*m) long multiplication with linear memory, the Karatsuba and FFT/NTT scaling alternatives, correct carry and leading/trailing-zero handling, and a full edge-case test plan.
Solution
## Approach
Mimic grade-school long multiplication on the digit arrays. The key insight is that digit `i` of `a` (counting from the right) times digit `j` of `b` contributes to position `i + j` of the result, with the carry flowing into position `i + j + 1`. If `a` has `n` digits and `b` has `m` digits, the product has at most `n + m` digits, so a result buffer of size `n + m` suffices.
## Grade-school O(n·m) implementation (Python)
```python
def multiplyStrings(a: str, b: str) -> str:
# Fast path / normalization: a zero operand => product is "0".
if a == "0" or b == "0":
return "0"
n, m = len(a), len(b)
# result[k] will hold the (possibly >9) accumulated value at position k
# before a final carry-normalization pass. Index 0 = least significant.
result = [0] * (n + m)
# Iterate from the least significant digit of each operand.
for i in range(n - 1, -1, -1):
di = ord(a[i]) - 48 # ord('0') == 48
pos = n - 1 - i # how far from the right a[i] sits
for j in range(m - 1, -1, -1):
dj = ord(b[j]) - 48
result[pos + (m - 1 - j)] += di * dj
# Single carry-propagation pass, least-significant to most-significant.
carry = 0
for k in range(n + m):
cur = result[k] + carry
result[k] = cur % 10
carry = cur // 10
# carry is guaranteed 0 here because the buffer is wide enough.
# Build the string from most-significant digit, skipping leading zeros.
out = []
started = False
for k in range(n + m - 1, -1, -1):
if result[k] != 0:
started = True
if started:
out.append(chr(result[k] + 48))
return "".join(out) if out else "0"
```
### Why this is correct
- Each partial product `di * dj` (at most `9*9 = 81`) is accumulated into `result[i+j]`; deferring carry propagation to a single final pass avoids interleaving carry logic with the double loop and keeps the inner loop branch-free.
- The buffer length `n + m` is the maximum possible product length, so the final carry cannot overflow it.
- The leading-zero trim handles results like a clean power of ten and the all-zero case. Leading zeros *in the inputs* don't need special handling: they simply contribute `0 * dj = 0` and fall out naturally (though stripping input leading zeros up front is a fine optional optimization).
## Complexity
- **Time:** `O(n·m)` — the dominant double loop. The carry pass and string build are `O(n + m)`.
- **Space:** `O(n + m)` for the result buffer — linear, not quadratic. This satisfies the "avoid quadratic memory blowup" requirement.
## Scaling beyond grade-school
For inputs up to a few thousand digits, the `O(n·m)` algorithm is fast enough. As sizes grow toward 10^5+ digits, switch algorithms by trade-off:
- **Karatsuba** — recursively splits each number into high/low halves and replaces 4 sub-multiplications with 3 (`x·y = z2·B^2 + z1·B + z0` where `z1 = (x_hi+x_lo)(y_hi+y_lo) - z2 - z0`). Time `O(n^log2(3)) ≈ O(n^1.585)`. Simple to implement on chunked base-10^k limbs; good for the "large but not enormous" range.
- **FFT / NTT convolution** — treat the digit arrays as polynomials (digit = coefficient), compute their product via Fast Fourier Transform (or a Number-Theoretic Transform to stay exact in modular arithmetic), then carry-normalize. Time `O(n log n)`, the asymptotically best practical option for very large operands. Watch floating-point precision with FFT; NTT avoids it entirely.
A production implementation typically uses grade-school below a threshold (e.g. a few hundred limbs), Karatsuba above it, and FFT/NTT for the largest inputs.
## Testing strategy
- **Zeros:** `"0" * "123" == "0"`, `"0" * "0" == "0"`.
- **Identity:** `"1" * "99999" == "99999"`, `"99999" * "1" == "99999"`.
- **Different lengths:** `"9" * "99999999"`, `"123" * "4567"`.
- **Leading zeros in input:** `"007" * "08" == "56"` (no spurious leading zeros in output).
- **Trailing zeros:** `"100" * "100" == "10000"`, `"50" * "40" == "2000"`.
- **Single carry chain:** `"99" * "99" == "9801"`, `"999...9" (k nines) squared` to stress carries.
- **Cross-check:** for random inputs up to ~18 digits, compare against `str(int(a) * int(b))` (used only in the test, not the solution).
- **Scale/performance:** multiply two `100000`-digit strings and assert it completes within the time budget without `O(n^2)` memory.
Explanation
Grade-school long multiplication on digit arrays: accumulate every digit-pair product di*dj into result[i+j], defer carry propagation to one final left-to-right pass over an (n+m)-length buffer, then trim leading zeros. Time O(n*m), space O(n+m) — linear memory, no big-integer types used. Karatsuba (O(n^1.585)) and FFT/NTT convolution (O(n log n)) are the scaling upgrades for very large operands. Tests cover zero/identity operands, mismatched lengths, leading and trailing zeros, carry chains, and a large-input performance check.