##### Question

Design a weather data platform that retrieves best-effort hourly dumps from an external Weather Service API for approximately 2,000 locations across the U.S. and exposes an API to return the **current temperature** for a requested location.

**Hard requirement:** data served must be at most **10 minutes stale** relative to the provider's hourly publish time.

The system should:

1. Define a **high-level architecture**: ingestion, storage, cache, read API, and scheduler/orchestrator.
2. Design the **API contract**: e.g. `GET /temperature?locationId=...` (and optionally `lat,lon`), the response schema, error cases, freshness metadata, and versioning. Support lookup by internal `id` and by nearest covered location for `lat,lon`.
3. Choose a **data model and storage** (including the location schema, TTL/retention, indexing, and partitioning strategy).
4. Specify the **ingestion pipeline**: detecting a new provider publish, fan-out, retries with backoff, idempotent ingestion, deduplication, and backfill of missed hours, while respecting provider **rate limits**.
5. Lay out the **caching and invalidation plan** (TTLs, write-through, event-driven invalidation, optional CDN) that meets the 10-minute freshness SLA, plus the API-side freshness gate.
6. Cover **scaling and partitioning** for traffic spikes, and provide rough **QPS and storage estimates**.
7. Describe **deployment, cost, and security** considerations.
8. Describe the **test and observability** strategy: monitoring/alerting on freshness SLOs, failure handling for upstream slowness / partial failures / outages, and graceful degradation.
9. Discuss **multi-region availability and disaster recovery** (RPO/RTO).
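
One way to make the hard requirement concrete is to write the API-side freshness gate as a predicate. The sketch below is only one possible reading of the requirement: it assumes the provider publishes at the top of every UTC hour, and that the previous hour's reading may still be served for up to 10 minutes after a new publish. The names `FRESHNESS_BUDGET`, `latest_publish_time`, and `is_fresh` are hypothetical and do not come from the question.

```python
from datetime import datetime, timedelta, timezone

# The 10-minute budget comes from the hard requirement in the question.
FRESHNESS_BUDGET = timedelta(minutes=10)


def latest_publish_time(now: datetime) -> datetime:
    """Assumption: the provider publishes at the top of every hour."""
    return now.replace(minute=0, second=0, microsecond=0)


def is_fresh(observation_hour: datetime, now: datetime) -> bool:
    """Return True if serving data for `observation_hour` at `now` meets the SLA."""
    expected = latest_publish_time(now)
    if observation_hour >= expected:
        # We already hold the newest publish.
        return True
    # Otherwise only the immediately preceding hour is acceptable,
    # and only while we are still inside the 10-minute budget.
    is_previous_hour = observation_hour == expected - timedelta(hours=1)
    return is_previous_hour and (now - expected) <= FRESHNESS_BUDGET


two_pm = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)
one_pm = two_pm - timedelta(hours=1)

print(is_fresh(one_pm, two_pm + timedelta(minutes=5)))   # inside the budget
print(is_fresh(one_pm, two_pm + timedelta(minutes=11)))  # budget exceeded
```

Under this reading, a read path that fails the gate would have to fall back to a degraded response (for example, an explicit staleness flag or an error) rather than silently serving old data; which of those is right is part of what the question asks the candidate to decide.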

A HubSpot onsite system-design question: build a weather data platform that ingests hourly dumps for ~2,000 U.S. locations and serves the current temperature with an at-most-10-minute freshness SLA. It tests ingestion pipelines, idempotent retries and backfill, caching and invalidation, storage modeling, rate-limit handling, scaling, and observability on freshness SLOs.
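
To illustrate what "idempotent retries and backfill" can mean in practice, here is a minimal in-memory sketch. It assumes readings are keyed by `(location_id, observation_hour)`, so that replaying the same provider record is a no-op, and that gaps can be found by scanning the expected hourly slots. The `ReadingStore` class and its methods are hypothetical illustrations, not part of the question or of any real library.

```python
from datetime import datetime, timedelta, timezone


class ReadingStore:
    """Toy store keyed by (location_id, observation_hour); a stand-in for a real database."""

    def __init__(self) -> None:
        self._rows: dict[tuple[str, datetime], float] = {}

    def upsert(self, location_id: str, hour: datetime, temp_c: float) -> bool:
        """Write a reading; return True only if the key was not seen before.

        Retrying the same record overwrites it with the same value,
        so duplicate deliveries do not create duplicate rows.
        """
        key = (location_id, hour)
        is_new = key not in self._rows
        self._rows[key] = temp_c
        return is_new

    def missing_hours(self, location_id: str, start: datetime, end: datetime) -> list[datetime]:
        """List hourly slots in [start, end] with no reading: candidates for backfill."""
        missing = []
        hour = start
        while hour <= end:
            if (location_id, hour) not in self._rows:
                missing.append(hour)
            hour += timedelta(hours=1)
        return missing


store = ReadingStore()
h0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

store.upsert("loc-1", h0, 21.5)                       # first delivery
store.upsert("loc-1", h0, 21.5)                       # retry of the same record
store.upsert("loc-1", h0 + timedelta(hours=2), 19.0)  # the 13:00 hour was missed

print(store.missing_hours("loc-1", h0, h0 + timedelta(hours=2)))
```

In a real design the same idea would usually be expressed as a unique key or conditional write in the chosen datastore, with the backfill scan bounded so that it stays within the provider's rate limits.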

What difficulty level is this interview question?

This is a hard-difficulty system design question, commonly asked during onsite rounds at HubSpot.

What role is this question designed for?

This question is commonly asked of Software Engineer candidates at HubSpot during technical interviews.

Design a near-real-time weather API | HubSpot Interview Question
