Consider an N-ary tree where each node runs on a separate machine and represents a distributed component. Each node knows its parent (if any) and its list of children. Implement in the Node class a method call(Node from, Message message) that, using asynchronous message passing, ultimately returns the total number of nodes in the tree to the initiating node. The protocol must use exactly two message types (REQUEST_COUNT and REPLY_COUNT), avoid double counting, protect any critical section, and ensure each node contributes exactly once. Specify: (a) the message format and fields; (b) the per-node state you maintain; (c) how a node propagates requests to children and aggregates replies; (d) how you prevent duplicate propagation when a node receives multiple requests; (e) how termination is detected at the initiator so it can return the global count; and (f) assumptions about concurrency and delivery (e.g., out-of-order messages, at-most-once vs. at-least-once) and the time/message complexity of your approach.

Design a distributed tree node counter evaluates requirements, scale assumptions, API/data design, architecture, trade-offs, failure modes, and rollout in a realistic interview setting. A strong answer states assumptions, handles edge cases, explains trade-offs, and shows how to validate the result clearly.

How do I approach System Design interview questions?

System Design questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master system design interviews.

What difficulty level is this interview question?

This is a medium difficulty System Design question, commonly asked during Technical Screen rounds at Snowflake.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Snowflake during technical interviews.

Design a distributed tree node counter | Snowflake Interview Question

Design a distributed tree node counter

Distributed Tree Node Count with Two Messages

You are given an N-ary tree of machines (one process per node). Each node knows its parent (if any) and its list of children. Nodes communicate via asynchronous message passing.

Goal: Design the per-node logic so that when any node initiates a count, exactly two message types (REQUEST_COUNT and REPLY_COUNT) are used to compute the total number of nodes in the entire tree and return it to the initiator. Each node must contribute exactly once; no double counting; protect critical sections.

Assume the runtime invokes call(Node from, Message message) on a node upon message arrival, and the initiator can trigger the protocol locally (e.g., startCount()).

Specify:

(a) Message format and fields.

(b) Per-node state you maintain.

(d) How you prevent duplicate propagation when a node receives multiple requests for the same traversal.

(e) How the initiator detects termination and returns the global count.

(f) Assumptions about concurrency and delivery (e.g., out-of-order messages, at-most-once vs. at-least-once) and the time/message complexity of your approach.

Constraints & Assumptions

Preserve the scope, facts, inputs, and requested outputs from the prompt above.
If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.

Clarifying Questions to Ask

Clarify users, core use cases, read/write patterns, scale, latency, availability, and data retention.
State explicit assumptions before making sizing or architecture decisions.
Prioritize the functional path first, then address reliability, security, observability, and rollout.

What a Strong Answer Covers

A scoped requirements summary with concrete non-goals and success metrics.
API, data model, architecture, consistency, capacity, and operations.
Reasoned trade-offs among simple and scalable designs, including bottlenecks and failure modes.
A validation, monitoring, migration, and launch plan appropriate for the risk level.

Follow-up Questions

What breaks first at 10x traffic or data volume?
How would you degrade gracefully during dependency failures?
What metrics and alerts would prove the design is healthy after launch?

Design a distributed tree node counter

Distributed Tree Node Count with Two Messages

You are given an N-ary tree of machines (one process per node). Each node knows its parent (if any) and its list of children. Nodes communicate via asynchronous message passing.

Assume the runtime invokes call(Node from, Message message) on a node upon message arrival, and the initiator can trigger the protocol locally (e.g., startCount()).

Specify:

(a) Message format and fields.

(b) Per-node state you maintain.

(d) How you prevent duplicate propagation when a node receives multiple requests for the same traversal.

(e) How the initiator detects termination and returns the global count.

(f) Assumptions about concurrency and delivery (e.g., out-of-order messages, at-most-once vs. at-least-once) and the time/message complexity of your approach.

Constraints & Assumptions

Preserve the scope, facts, inputs, and requested outputs from the prompt above.
If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.

Clarifying Questions to Ask

Clarify users, core use cases, read/write patterns, scale, latency, availability, and data retention.
State explicit assumptions before making sizing or architecture decisions.
Prioritize the functional path first, then address reliability, security, observability, and rollout.

What a Strong Answer Covers

A scoped requirements summary with concrete non-goals and success metrics.
API, data model, architecture, consistency, capacity, and operations.
Reasoned trade-offs among simple and scalable designs, including bottlenecks and failure modes.
A validation, monitoring, migration, and launch plan appropriate for the risk level.

Follow-up Questions

What breaks first at 10x traffic or data volume?
How would you degrade gracefully during dependency failures?
What metrics and alerts would prove the design is healthy after launch?

Design a distributed tree node counter

Quick Overview

Design a distributed tree node counter

Design a distributed tree node counter

Distributed Tree Node Count with Two Messages

Constraints & Assumptions

Clarifying Questions to Ask

What a Strong Answer Covers

Follow-up Questions

Submit Your Answer to Earn 20XP

Design a distributed tree node counter

Quick Overview

Design a distributed tree node counter

Design a distributed tree node counter

Distributed Tree Node Count with Two Messages

Constraints & Assumptions

Clarifying Questions to Ask

What a Strong Answer Covers

Follow-up Questions

Submit Your Answer to Earn 20XP