PracHub

Quick Overview

This question evaluates competency in SQL-based data manipulation and data engineering concepts, including handling late-arriving data and deduplication to the latest ingested record, joining and filtering for enrichment and KYC status, and computing exposure metrics such as gross/net notional, limit utilization, and breach flags.

  • Medium
  • EY
  • Data Manipulation (SQL/Python)
  • Data Scientist

Map sources to functional dataset with SQL

Company: EY

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

You must produce a functional, consumption‑ready dataset for daily exposure monitoring. Assume “today” = 2025‑09‑01. Use the last 7 calendar days by trade_dt (inclusive), handle late‑arriving data, and deduplicate to the latest ingested record per trade_id.

Sample source tables (ASCII):

Customers
+---------+-----------+------------+---------+
| cust_id | cust_name | kyc_status | country |
+---------+-----------+------------+---------+
| 101     | Alpha LLC | PASS       | US      |
| 102     | Beta SA   | PASS       | FR      |
| 103     | Gamma AG  | REVIEW     | DE      |
+---------+-----------+------------+---------+

Accounts
+---------+---------+---------+---------------------+
| acct_id | cust_id | product | opened_at           |
+---------+---------+---------+---------------------+
| 5001    | 101     | MARGIN  | 2023-05-10 09:00:00 |
| 5002    | 102     | CASH    | 2024-11-01 10:00:00 |
+---------+---------+---------+---------------------+

Trades
+----------+---------+------------+-------------+---------------+------+----------+---------------------+
| trade_id | acct_id | trade_dt   | asset_class | notional_usd  | side | status   | ingested_at         |
+----------+---------+------------+-------------+---------------+------+----------+---------------------+
| T1       | 5001    | 2025-08-26 | EQ          | 1,000,000     | BUY  | BOOKED   | 2025-08-26 12:00:00 |
| T1       | 5001    | 2025-08-26 | EQ          | 1,000,000     | BUY  | CANCELED | 2025-08-27 08:00:00 |
| T2       | 5001    | 2025-08-28 | FI          | 2,500,000     | SELL | BOOKED   | 2025-08-28 11:30:00 |
| T3       | 5002    | 2025-08-30 | EQ          | 750,000       | BUY  | BOOKED   | 2025-09-01 02:00:00 |
+----------+---------+------------+-------------+---------------+------+----------+---------------------+

RiskLimits
+---------+-----------------+
| acct_id | daily_limit_usd |
+---------+-----------------+
| 5001    | 2,000,000       |
| 5002    | 1,000,000       |
+---------+-----------------+

Task: Write SQL to build a DailyExposure fact at grain (acct_id, trade_dt) over [2025‑08‑25, 2025‑09‑01].

Requirements:
- Deduplicate to the latest ingested row per (trade_id) before aggregation.
- Compute gross_notional (sum abs(notional_usd)), net_notional (BUY positive, SELL negative), limit_utilization = gross_notional / daily_limit_usd, and breach_flag (limit_utilization > 1.0).
- Exclude trades with status = 'CANCELED'.
- Include only accounts with Customers.kyc_status = 'PASS'.
- Make the query idempotent for daily backfills (no double counting on re‑runs).

Provide the final SELECT and explain one edge case your SQL intentionally ignores.
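
One possible approach to the task above, sketched with Python's built-in sqlite3 module so the SQL can be run against the sample rows. This is an illustration under stated assumptions, not a reference answer: the column types, the `rebuild_window` helper, and the delete-then-insert refresh pattern are all choices made for this sketch. The window uses the explicit range [2025-08-25, 2025-09-01] given in the task, which covers eight calendar dates even though the prompt also says "last 7 calendar days"; adjust `start_dt` if the seven-day reading is intended. It also assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data from the prompt; column types are assumptions made for this sketch.
conn.executescript("""
CREATE TABLE Customers (cust_id INTEGER, cust_name TEXT, kyc_status TEXT, country TEXT);
CREATE TABLE Accounts (acct_id INTEGER, cust_id INTEGER, product TEXT, opened_at TEXT);
CREATE TABLE Trades (trade_id TEXT, acct_id INTEGER, trade_dt TEXT, asset_class TEXT,
                     notional_usd REAL, side TEXT, status TEXT, ingested_at TEXT);
CREATE TABLE RiskLimits (acct_id INTEGER, daily_limit_usd REAL);
CREATE TABLE DailyExposure (acct_id INTEGER, trade_dt TEXT, gross_notional REAL,
                            net_notional REAL, limit_utilization REAL, breach_flag INTEGER,
                            PRIMARY KEY (acct_id, trade_dt));
INSERT INTO Customers VALUES (101,'Alpha LLC','PASS','US'), (102,'Beta SA','PASS','FR'),
                             (103,'Gamma AG','REVIEW','DE');
INSERT INTO Accounts VALUES (5001,101,'MARGIN','2023-05-10 09:00:00'),
                            (5002,102,'CASH','2024-11-01 10:00:00');
INSERT INTO Trades VALUES
  ('T1',5001,'2025-08-26','EQ',1000000,'BUY','BOOKED','2025-08-26 12:00:00'),
  ('T1',5001,'2025-08-26','EQ',1000000,'BUY','CANCELED','2025-08-27 08:00:00'),
  ('T2',5001,'2025-08-28','FI',2500000,'SELL','BOOKED','2025-08-28 11:30:00'),
  ('T3',5002,'2025-08-30','EQ',750000,'BUY','BOOKED','2025-09-01 02:00:00');
INSERT INTO RiskLimits VALUES (5001,2000000), (5002,1000000);
""")

DAILY_EXPOSURE_SQL = """
WITH latest AS (
  -- Rank every version of a trade; rn = 1 is the most recently ingested one.
  -- Dedup runs over all rows so a late correction is never masked by the date filter.
  SELECT t.*,
         ROW_NUMBER() OVER (PARTITION BY trade_id ORDER BY ingested_at DESC) AS rn
  FROM Trades t
)
SELECT l.acct_id,
       l.trade_dt,
       SUM(ABS(l.notional_usd)) AS gross_notional,
       SUM(CASE l.side WHEN 'BUY'  THEN  ABS(l.notional_usd)
                       WHEN 'SELL' THEN -ABS(l.notional_usd)
                       ELSE 0 END) AS net_notional,
       SUM(ABS(l.notional_usd)) / r.daily_limit_usd AS limit_utilization,
       CASE WHEN SUM(ABS(l.notional_usd)) / r.daily_limit_usd > 1.0
            THEN 1 ELSE 0 END AS breach_flag
FROM latest l
JOIN Accounts   a ON a.acct_id = l.acct_id
JOIN Customers  c ON c.cust_id = a.cust_id AND c.kyc_status = 'PASS'
JOIN RiskLimits r ON r.acct_id = l.acct_id
WHERE l.rn = 1
  AND l.status <> 'CANCELED'
  AND l.trade_dt BETWEEN :start_dt AND :end_dt
GROUP BY l.acct_id, l.trade_dt, r.daily_limit_usd
ORDER BY l.acct_id, l.trade_dt
"""

def rebuild_window(conn, start_dt, end_dt):
    """Hypothetical helper: replace the whole date window in one transaction,
    so re-running a backfill cannot double count or leave stale rows."""
    params = {"start_dt": start_dt, "end_dt": end_dt}
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "DELETE FROM DailyExposure WHERE trade_dt BETWEEN :start_dt AND :end_dt",
            params,
        )
        conn.execute("INSERT INTO DailyExposure " + DAILY_EXPOSURE_SQL, params)

# The final SELECT on its own, then two refreshes to show the re-run is harmless.
rows = conn.execute(
    DAILY_EXPOSURE_SQL, {"start_dt": "2025-08-25", "end_dt": "2025-09-01"}
).fetchall()
rebuild_window(conn, "2025-08-25", "2025-09-01")
rebuild_window(conn, "2025-08-25", "2025-09-01")
stored = conn.execute(
    "SELECT * FROM DailyExposure ORDER BY acct_id, trade_dt"
).fetchall()
for row in stored:
    print(row)
```

On the sample data this yields two rows: account 5001 on 2025-08-28 with utilization 1.25 (a breach) and account 5002 on 2025-08-30 with utilization 0.75. T1 drops out because its latest ingested version is CANCELED. One edge case this sketch deliberately ignores is a tie on ingested_at for the same trade_id, where ROW_NUMBER picks a version arbitrarily; a sequence number or load ID would be needed as a tiebreaker. The inner join on RiskLimits also silently drops accounts that have no limit configured.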

Last updated: Mar 29, 2026

Related Coding Questions

  • Design logical model and consumption - EY (Medium)
  • Architect cloud data ingestion patterns - EY (Medium)
  • Design a data platform enablement - EY (Medium)