Assume you are asked to design the backend for a news aggregation and feed service.
Requirements:
- The system pulls articles from multiple third-party news providers via their HTTP APIs (not by crawling raw HTML).
- Each article belongs to one or more categories (e.g., sports, politics). Users can "follow" categories, similar to following friends, and then see a feed of recent articles from the categories they follow.
- Users can read articles, and the system should record which articles have been read.
Design the system and focus on:
- The high-level architecture for ingesting news from external APIs, storing it, and serving user feeds.
- The data schema for key entities (sources, articles, categories, user subscriptions, read history, etc.).
- The end-to-end flow when a user opens the app and reads their news feed.
- How you would ensure scalability, reasonable freshness of news, and resilience to external API failures.
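
As a starting point for the schema and feed-flow parts of the question, a minimal in-memory sketch in Python might look like the following. It is only an illustration under stated assumptions, not a reference answer: the names `Article`, `FeedStore`, `ingest`, `follow`, `mark_read`, and `feed` are hypothetical, the dictionaries stand in for database tables, and the `fetch` callable stands in for a provider's HTTP API client. Deduplicating on `(source_id, external_id)` and skipping a provider whose fetch raises are design assumptions, not requirements stated above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Iterable

ArticleKey = tuple[str, str]  # (source_id, external_id)


@dataclass(frozen=True)
class Article:
    source_id: str              # which provider the article came from
    external_id: str            # the provider's own ID, used for deduplication
    title: str
    categories: frozenset[str]  # an article belongs to one or more categories
    published_at: datetime

    @property
    def key(self) -> ArticleKey:
        return (self.source_id, self.external_id)


@dataclass
class FeedStore:
    """In-memory stand-in for the articles, subscriptions, and read-history tables."""
    articles: dict[ArticleKey, Article] = field(default_factory=dict)
    follows: dict[str, set[str]] = field(default_factory=dict)      # user_id -> categories
    reads: dict[str, set[ArticleKey]] = field(default_factory=dict)  # user_id -> read articles

    def ingest(self, fetch: Callable[[], Iterable[Article]]) -> int:
        """Store one batch from a provider and return how many articles were new.

        A provider whose fetch raises is skipped, so one failing API does not
        block ingestion from the others (assumption: retries happen elsewhere).
        """
        try:
            batch = list(fetch())
        except Exception:
            return 0
        added = 0
        for article in batch:
            if article.key not in self.articles:
                self.articles[article.key] = article
                added += 1
        return added

    def follow(self, user_id: str, category: str) -> None:
        self.follows.setdefault(user_id, set()).add(category)

    def mark_read(self, user_id: str, article: Article) -> None:
        self.reads.setdefault(user_id, set()).add(article.key)

    def feed(self, user_id: str, limit: int = 20) -> list[tuple[Article, bool]]:
        """Return (article, already_read) pairs for followed categories, newest first."""
        followed = self.follows.get(user_id, set())
        read = self.reads.get(user_id, set())
        matching = [a for a in self.articles.values() if a.categories & followed]
        matching.sort(key=lambda a: a.published_at, reverse=True)
        return [(a, a.key in read) for a in matching[:limit]]
```

This sketch computes the feed at read time by scanning all stored articles, which is simple but would not scale. A fuller answer would weigh that against precomputing per-category or per-user feeds, and would add caching, scheduled polling, and retry with backoff around the provider calls.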