
Architect cloud data ingestion patterns

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to design end-to-end cloud data ingestion and serving architectures, including streaming and batch patterns, CDC/event sourcing/micro-batch trade-offs, partitioning and compaction strategies, idempotency, schema evolution, PII tokenization, cost controls, and incident simulation for chaos testing.



Company: EY

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Propose a cloud data ingestion and serving pattern for streaming and batch on your preferred cloud (AWS/Azure/GCP). Choose between CDC, event sourcing, or micro‑batch for upstream systems, justify partitioning/compaction, and show how you ensure idempotency, schema evolution (e.g., optional fields), and PII tokenization. Include cost controls (storage tiering, TTL, file size targets) and an incident you would simulate in chaos testing.
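Three of the requirements named in the prompt (idempotency, schema evolution with optional fields, and PII tokenization) can be illustrated with a small, cloud-agnostic sketch. This is only one possible shape of an answer, not a reference solution. Every name here (`TOKEN_KEY`, `PII_FIELDS`, `normalize`, `upsert`, and the `id`/`version`/`email`/`country` fields) is hypothetical, the in-memory dict stands in for a real table, and the hard-coded key stands in for a secret that would normally live in a managed key store.

```python
import hashlib
import hmac

# Hypothetical demo key; an assumption for illustration only.
TOKEN_KEY = b"demo-only-key"
PII_FIELDS = {"email"}


def tokenize(value: str) -> str:
    """Deterministic keyed hash: the same input always yields the same token."""
    return hmac.new(TOKEN_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def normalize(event: dict) -> dict:
    """Tokenize PII and tolerate a missing optional field (schema evolution)."""
    row = {
        "id": event["id"],
        "version": event["version"],
        "email": event["email"],
        # Optional field assumed to be added in a later schema version;
        # older events omit it and get None.
        "country": event.get("country"),
    }
    for field in PII_FIELDS:
        row[field] = tokenize(row[field])
    return row


def upsert(table: dict, event: dict) -> None:
    """Idempotent merge keyed by id: only a strictly newer version wins."""
    row = normalize(event)
    current = table.get(row["id"])
    if current is None or row["version"] > current["version"]:
        table[row["id"]] = row


table = {}
events = [
    {"id": 1, "version": 1, "email": "a@example.com"},
    {"id": 1, "version": 2, "email": "a@example.com", "country": "DE"},
    {"id": 1, "version": 1, "email": "a@example.com"},  # late duplicate
]
# Replaying the whole batch twice simulates at-least-once delivery.
for event in events + events:
    upsert(table, event)
```

Because the merge is keyed on `id` and compares `version`, replaying the batch leaves the table unchanged, which is the property a retrying ingestion job relies on. A keyed hash is just one tokenization choice; a vault-backed reversible token would be an alternative when the original value has to be recoverable.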


Oct 13, 2025, 9:49 PM

