SQL/Python Data Manipulation And Joins
Asked of: Software Engineer

What's being tested
These exercises test data-shaping code: parsing semi-structured inputs, validating records, joining datasets, aggregating nested values, and producing deterministic output. Stripe interviewers are probing whether you can turn messy business data into correct, maintainable Python/SQL-style transformations with clear edge-case handling.
Patterns & templates
- Hash join for CSV-style joins: build `dict[key] -> list[rows]`, then emit left rows with all matches; `O(n + m + output)` time.
- Stable sorting for deterministic output: use `sorted(rows, key=lambda r: (...))`; explicitly define tie-breakers instead of relying on incidental input order.
- Validation pipeline: parse, normalize, validate, compute, serialize; keep errors structured as `{field, message, path}` and filter `None` / empty messages.
- Nested aggregation: flatten recursive lists/dicts with DFS or stack; preserve sequence order when serializing validation failures.
- Timezone-aware schedule expansion: use `datetime`, `zoneinfo`, and locale formatting; distinguish recurring-rule expansion from notification rendering.
- String/bitmap rendering: preallocate row buffers or collect chars in lists, then `''.join(...)`; avoid repeated string concatenation in loops.
- Numeric robustness: use `Decimal` for money-like costs, validate missing/negative/invalid quantities, and separate rounding policy from computation.
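The hash-join pattern above can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the function name `hash_join`, the sample CSV data, and the choice to keep unmatched left rows (left-join semantics) are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def hash_join(left_rows, right_rows, key):
    """Left join: each left row is emitted once per matching right row."""
    # Build phase: index the right side by join key, O(m).
    index = defaultdict(list)
    for row in right_rows:
        index[row[key]].append(row)

    # Probe phase: O(n + output).
    joined = []
    for left in left_rows:
        matches = index.get(left[key])  # .get avoids creating empty buckets
        if not matches:
            # Assumption: unmatched left rows are kept without right columns.
            joined.append(dict(left))
            continue
        for right in matches:
            # Left values win if both sides share a column name.
            joined.append({**right, **left})
    return joined


# Hypothetical sample data: customer 1 has two charges, customer 3 has none.
customers_csv = "customer_id,name\n1,Ada\n2,Grace\n3,Linus\n"
charges_csv = "customer_id,amount\n1,10.00\n1,25.50\n2,7.25\n"

customers = list(csv.DictReader(io.StringIO(customers_csv)))
charges = list(csv.DictReader(io.StringIO(charges_csv)))
result = hash_join(customers, charges, "customer_id")

for row in result:
    print(row)
```

Here three customers and three charges produce four output rows, because one key matches twice and one key matches nothing. An interviewer may want inner-join semantics instead, in which case the unmatched branch would simply skip the row.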
Common pitfalls
Pitfall: Treating joins as one-to-one when keys can have multiple matches; output cardinality may exceed either input size.
Pitfall: Mixing parsing, validation, and business logic in one loop; this makes edge cases hard to test and debug.
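One way to keep those stages apart is to give each its own small function, so that each can be tested in isolation. The sketch below assumes a hypothetical `sku,quantity,unit_price` line format and half-up rounding to cents; neither comes from a specific question, and all function names are illustrative.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP


def parse(raw):
    """Parse only: split the line into named fields, no judgement yet.

    Assumes exactly three comma-separated fields per line.
    """
    sku, quantity, unit_price = (part.strip() for part in raw.split(","))
    return {"sku": sku, "quantity": quantity, "unit_price": unit_price}


def validate(record, path):
    """Validate only: return structured errors instead of raising."""
    errors = []
    if not record["sku"]:
        errors.append({"field": "sku", "message": "missing", "path": path})
    try:
        if int(record["quantity"]) < 0:
            errors.append({"field": "quantity", "message": "negative", "path": path})
    except ValueError:
        errors.append({"field": "quantity", "message": "not an integer", "path": path})
    try:
        if not Decimal(record["unit_price"]).is_finite():
            errors.append({"field": "unit_price", "message": "not finite", "path": path})
    except InvalidOperation:
        errors.append({"field": "unit_price", "message": "not a number", "path": path})
    return errors


def compute_cost(record):
    """Compute only: exact Decimal arithmetic, no rounding here."""
    return int(record["quantity"]) * Decimal(record["unit_price"])


def round_money(amount):
    """Rounding policy lives in one place (half-up to cents is an assumption)."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def run(lines):
    totals, errors = [], []
    for i, raw in enumerate(lines):
        record = parse(raw)
        problems = validate(record, path=f"lines[{i}]")
        if problems:
            errors.extend(problems)
        else:
            totals.append((record["sku"], round_money(compute_cost(record))))
    return totals, errors


totals, errors = run(["A1,3,1.005", "B2,-1,4.00", ",2,abc"])
print(totals)
for error in errors:
    print(error)
```

Because `compute_cost` returns the unrounded value, swapping the rounding rule later means changing `round_money` alone.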
Pitfall: Producing nondeterministic output because `dict` iteration, duplicate keys, or unsorted errors are not explicitly ordered.
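A short sketch of making the order explicit, under assumed data shapes: the error `path` is modeled as a tuple such as `("rows", 2)` so that row indices compare numerically, and the sample rows and errors are hypothetical.

```python
# Sorting with an explicit tie-breaker: amount descending, then name ascending.
rows = [("carol", 30), ("alice", 30), ("bob", 45)]
ranked = sorted(rows, key=lambda r: (-r[1], r[0]))

# Hypothetical validation errors, arriving in arbitrary order with a duplicate.
errors = [
    {"field": "amount", "message": "negative", "path": ("rows", 2)},
    {"field": "amount", "message": "missing", "path": ("rows", 10)},
    {"field": "currency", "message": "missing", "path": ("rows", 0)},
    {"field": "amount", "message": "missing", "path": ("rows", 0)},
    {"field": "amount", "message": "negative", "path": ("rows", 2)},  # duplicate
]


def ordered_errors(errors):
    """Deduplicate, then sort by (path, field, message) so output is repeatable."""
    # Set iteration order is not something to rely on; sorted() fixes the order.
    unique = {(e["path"], e["field"], e["message"]) for e in errors}
    return [
        {"path": path, "field": field, "message": message}
        for path, field, message in sorted(unique)
    ]


ordered = ordered_errors(errors)
print(ranked)
for error in ordered:
    print(error)
```

If paths were plain strings such as `"rows[10]"` and `"rows[2]"`, lexicographic sorting would place row 10 before row 2, which is why the sketch keeps the index as an integer.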
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
- Debug Validation Error Aggregation · Stripe · Software Engineer · Onsite · Hard
- Implement a CSV dataset join · Stripe · Software Engineer · Take-home Project · Medium
- Generate user notifications from schedules · Stripe · Software Engineer · Technical Screen · Medium
- Compute costs with validation and sorting in Python · Stripe · Software Engineer · Technical Screen · Medium
- Design payment-to-invoice matcher with priorities · Stripe · Software Engineer · Technical Screen · Medium
- Convert bitmap into ASCII characters · Stripe · Software Engineer · Onsite · Medium
Related concepts
- SQL And Python Data Manipulation · Data Manipulation (SQL/Python)
- SQL/Python Joins, Aggregations, And Window Functions · Data Manipulation (SQL/Python)
- Python/Pandas Data Manipulation · Data Manipulation (SQL/Python)
- Pandas Data Manipulation · Data Manipulation (SQL/Python)
- SQL Analytical Querying And Data Modeling · Data Manipulation (SQL/Python)
- SQL Window Functions And Analytical Querying · Data Manipulation (SQL/Python)