PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches
|Home/System Design/Nextdoor

Design KPI tables, retention, rollups, timezone

Last updated: Mar 29, 2026

Quick Overview

This question evaluates data engineering and analytical-system design competencies, including derived table schema design, ETL and incremental load strategies, time-series rollups, cohort/retention analysis, aggregation correctness, and timezone-aware event handling, and it sits in the System Design / Data Engineering domain.

  • hard
  • Nextdoor
  • System Design
  • Data Engineer

Design KPI tables, retention, rollups, timezone

Company: Nextdoor

Role: Data Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

You are building a fast KPI dashboard for **Nextagram**. The raw tables are: - **users**(user_id PK, username, created_at TIMESTAMP) - **photos**(photo_id PK, user_id, created_at TIMESTAMP) - **follows**(follower_user_id, followed_user_id, created_at TIMESTAMP) Assume raw timestamps are stored in **PST** initially. The dashboard is reported **daily** (one row per report date). Define the daily KPIs as: - **new_users**: users created on that report date - **new_users_change_vs_7d_ago**: new_users(today) − new_users(today − 7 days) - **DAU**: distinct users who performed at least one “active” action that day (define “active” as: posted a photo OR followed someone) - **DAU_change_vs_7d_ago** - **new_photos**: photos created on that report date - **new_photos_per_DAU**: new_photos / DAU - **new_photos_change_vs_7d_ago** - **followers**: total number of follow relationships as-of end of that report date (i.e., cumulative count of rows in follows with created_at <= end_of_day) - **%_users_with_0_followers**: percent of users whose follower count (inbound follows) is 0 as-of end of that date - **%_users_with_1_2_followers** - **%_users_with_3plus_followers** Tasks: **A) Data model / table design** - Design the derived table(s) that power this dashboard for a huge dataset (fast reads). - Specify table schemas (columns, grain, and keys), and what is precomputed. - List key considerations (partitioning/clustering, incremental loads, idempotency, late-arriving data, correctness vs cost, etc.). **B) Daily population strategy** - Describe how you would populate/refresh these tables every day from the raw tables (high-level ETL steps; SQL/pseudocode is acceptable). **C) Add 1-week retention** - Add a **1-week retention** metric to the dashboard: for each cohort day D (users who joined on D), what fraction are active on D+7 (using the same “active” definition as DAU). - Describe required table changes and how you would compute it efficiently. **D) Switch to weekly rollups** - The dashboard should support **weekly** rollups instead of daily. Describe what changes in the table design and computation (date grain, definitions of “change vs 7d ago”, handling partial weeks, etc.). **E) Report in a different timezone** - A New York office wants metrics reported in **EST** instead of PST. Describe what must change (timestamp storage, date boundaries, ETL windows, backfills, and how to avoid double-counting at day boundaries).

Quick Answer: This question evaluates data engineering and analytical-system design competencies, including derived table schema design, ETL and incremental load strategies, time-series rollups, cohort/retention analysis, aggregation correctness, and timezone-aware event handling, and it sits in the System Design / Data Engineering domain.

Related Interview Questions

  • Design a Scalable Job Scheduler - Nextdoor (medium)
  • Design tables for KPI dashboard - Nextdoor (hard)
  • Describe CAP, distributed systems, multithreading experience - Nextdoor (hard)
Nextdoor logo
Nextdoor
Feb 8, 2026, 12:00 AM
Data Engineer
Technical Screen
System Design
3
0

You are building a fast KPI dashboard for Nextagram. The raw tables are:

  • users (user_id PK, username, created_at TIMESTAMP)
  • photos (photo_id PK, user_id, created_at TIMESTAMP)
  • follows (follower_user_id, followed_user_id, created_at TIMESTAMP)

Assume raw timestamps are stored in PST initially.

The dashboard is reported daily (one row per report date). Define the daily KPIs as:

  • new_users : users created on that report date
  • new_users_change_vs_7d_ago : new_users(today) − new_users(today − 7 days)
  • DAU : distinct users who performed at least one “active” action that day (define “active” as: posted a photo OR followed someone)
  • DAU_change_vs_7d_ago
  • new_photos : photos created on that report date
  • new_photos_per_DAU : new_photos / DAU
  • new_photos_change_vs_7d_ago
  • followers : total number of follow relationships as-of end of that report date (i.e., cumulative count of rows in follows with created_at <= end_of_day)
  • %_users_with_0_followers : percent of users whose follower count (inbound follows) is 0 as-of end of that date
  • %_users_with_1_2_followers
  • %_users_with_3plus_followers

Tasks:

A) Data model / table design

  • Design the derived table(s) that power this dashboard for a huge dataset (fast reads).
  • Specify table schemas (columns, grain, and keys), and what is precomputed.
  • List key considerations (partitioning/clustering, incremental loads, idempotency, late-arriving data, correctness vs cost, etc.).

B) Daily population strategy

  • Describe how you would populate/refresh these tables every day from the raw tables (high-level ETL steps; SQL/pseudocode is acceptable).

C) Add 1-week retention

  • Add a 1-week retention metric to the dashboard: for each cohort day D (users who joined on D), what fraction are active on D+7 (using the same “active” definition as DAU).
  • Describe required table changes and how you would compute it efficiently.

D) Switch to weekly rollups

  • The dashboard should support weekly rollups instead of daily. Describe what changes in the table design and computation (date grain, definitions of “change vs 7d ago”, handling partial weeks, etc.).

E) Report in a different timezone

  • A New York office wants metrics reported in EST instead of PST. Describe what must change (timestamp storage, date boundaries, ETL windows, backfills, and how to avoid double-counting at day boundaries).

Solution

Show

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More System Design•More Nextdoor•More Data Engineer•Nextdoor Data Engineer•Nextdoor System Design•Data Engineer System Design
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.