PracHub

Implement left join on Python lists, no packages

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to implement SQL-style left join semantics using pure Python lists and dictionaries, testing skills in hashing, handling duplicate and missing keys, and reasoning about algorithmic time and space complexity.

Company: Citadel

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Implement a left join in pure Python (no external packages, no pandas).

Input: left = list of dicts with key 'id' and arbitrary other fields; right = list of dicts with key 'id' and fields to append (field names disjoint from those in left).

Requirements:

  (1) Preserve the original order of 'left', including duplicate left rows.
  (2) Support one-to-many matches on 'right' (i.e., duplicate 'id's): emit one output row per matching right row; if there is no match, emit a single row with the right fields set to None.
  (3) Run in O(n + m) time with O(n + m) extra space by using hashing (n and m being the sizes of the two inputs); explain how you would reduce memory when m is huge (e.g., streaming or external sort).
  (4) Handle missing 'id' keys robustly.

Provide clear function signatures and tests on small examples.
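
One possible approach to the requirements above is sketched below. It is a minimal illustration under stated assumptions, not a reference solution: the name `left_join`, the `key` parameter, and the `Row` alias are hypothetical. The sketch assumes that a missing or None 'id' never matches (mirroring SQL NULL semantics) and that 'id' values are hashable. It reads `right` in a single pass and stores only the right rows whose 'id' also occurs in `left`, which is one way to limit memory when the right side is large.

```python
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]  # hypothetical alias for one record


def left_join(left: List[Row], right: Iterable[Row], key: str = "id") -> Iterator[Row]:
    """Yield left-join rows lazily, preserving the order of `left`.

    Assumptions (not stated in the question): a missing or None key
    never matches, and key values are hashable.
    """
    # Keys that can actually match; other right rows are never stored.
    wanted = {row[key] for row in left if row.get(key) is not None}

    index: Dict[Any, List[Row]] = {}     # key value -> right rows, in input order
    right_fields: Dict[str, None] = {}   # ordered set of right field names -> None
    for r in right:                      # single pass, so `right` may be a generator
        for name in r:
            if name != key:
                right_fields[name] = None
        k = r.get(key)
        if k is not None and k in wanted:
            index.setdefault(k, []).append(r)

    for l in left:
        k = l.get(key)
        matches = index.get(k, []) if k is not None else []
        if not matches:
            # No match (or no usable key): every right field is None.
            yield {**l, **right_fields}
            continue
        for r in matches:
            row = {**l, **right_fields}  # default every right field to None
            row.update((n, v) for n, v in r.items() if n != key)
            yield row


# Small example: duplicate ids on both sides, plus rows with no 'id'.
left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"},
        {"id": 1, "name": "c"}, {"name": "d"}]
right = [{"id": 1, "x": 10}, {"id": 1, "x": 11}, {"id": 3, "x": 30}, {"x": 99}]

assert list(left_join(left, right)) == [
    {"id": 1, "name": "a", "x": 10},
    {"id": 1, "name": "a", "x": 11},
    {"id": 2, "name": "b", "x": None},
    {"id": 1, "name": "c", "x": 10},
    {"id": 1, "name": "c", "x": 11},
    {"name": "d", "x": None},
]
```

Building the index takes one pass over each input, so the setup cost is O(n + m). The number of emitted rows can exceed n + m when ids repeat on both sides, which is why the sketch yields rows instead of building a full list. If even the matching right rows do not fit in memory, sorting both inputs by 'id' on disk and merging them is the usual fallback, at the cost of having to restore the original left order afterwards.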

Related Interview Questions

  • Perform EDA and diagnose data quality - Citadel (Medium)
  • Implement Left Join Using Python Dictionaries Efficiently - Citadel (Medium)
Oct 13, 2025, 9:49 PM
