Design Search And Web Crawling Systems
Company: Meta
Role: Software Engineer
Category: System Design
Difficulty: Medium
Interview Round: Technical Screen
Design the following two large-scale systems.
1. Social-network search: Design a Facebook-like search product that lets users search for people, pages, groups, and posts. The system should support low-latency queries, relevance ranking, personalization, privacy enforcement, and reasonably fresh results as new content is created or updated.
2. Distributed web crawler: Design a web crawling platform that can run on roughly 10,000 machines. The system should discover and fetch web pages at scale, deduplicate URLs and page content, respect per-site crawling limits, and use distributed caching where appropriate. Support the design with estimates of network throughput, storage, and machine count.
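To make the crawler requirements on URL deduplication and per-site limits concrete, here is a minimal single-process sketch of a URL frontier. Everything in it is an illustrative assumption rather than a prescribed design: the names `normalize_url` and `Frontier`, the fixed per-host delay `min_delay_s`, and the in-memory `seen` set. In a real deployment the seen-set would more likely be sharded across machines or approximated with a Bloom filter. Content deduplication (for example, hashing page bodies) is not covered here.

```python
import heapq
from collections import deque
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Canonicalize a URL so trivially different spellings dedupe together."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    default_port = {"http": 80, "https": 443}.get(scheme)
    # Drop the port when it is absent or is the scheme's default.
    netloc = host if parts.port in (None, default_port) else f"{host}:{parts.port}"
    path = parts.path or "/"
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


class Frontier:
    """Hypothetical frontier: dedupes URLs and spaces out fetches per host."""

    def __init__(self, min_delay_s=1.0):
        self.min_delay_s = min_delay_s  # assumed fixed politeness delay
        self.seen = set()               # normalized URLs already enqueued
        self.queues = {}                # host -> deque of pending URLs
        self.ready = []                 # min-heap of (earliest_fetch_time, host)
        self.next_allowed = {}          # host -> earliest time of next fetch

    def add(self, url, now=0.0):
        """Enqueue a URL; return False if it was already seen."""
        url = normalize_url(url)
        if url in self.seen:
            return False
        self.seen.add(url)
        host = urlsplit(url).netloc
        if host not in self.queues:
            self.queues[host] = deque()
            start = max(now, self.next_allowed.get(host, now))
            heapq.heappush(self.ready, (start, host))
        self.queues[host].append(url)
        return True

    def next_url(self, now):
        """Return a URL whose host may be fetched at `now`, else None."""
        if not self.ready or self.ready[0][0] > now:
            return None
        _, host = heapq.heappop(self.ready)
        queue = self.queues[host]
        url = queue.popleft()
        self.next_allowed[host] = now + self.min_delay_s
        if queue:
            heapq.heappush(self.ready, (self.next_allowed[host], host))
        else:
            del self.queues[host]
        return url
```

One common way to scale this idea to many machines is to partition hosts by hash so that each host's queue and delay timer live on exactly one worker. That is a design option to discuss, not something the question mandates.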
Quick Answer: This question evaluates system-design competencies such as scalable search architecture, indexing and ranking, personalization and privacy enforcement, freshness, large-scale crawling, URL and content deduplication, and distributed resource estimation.
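For the resource-estimation part, a back-of-envelope calculation can be sketched as below. Only the 10,000-machine figure comes from the question. The per-machine fetch rate and average page size are placeholder assumptions chosen for illustration and should be replaced with numbers you can justify in the interview. `estimate_crawl` and `CrawlEstimate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CrawlEstimate:
    pages_per_second: float
    ingress_gbps_total: float
    ingress_mbps_per_machine: float
    raw_storage_tb_per_day: float


def estimate_crawl(machines, pages_per_machine_per_sec, avg_page_bytes):
    """Rough fleet-wide throughput and storage, ignoring compression,
    retries, protocol overhead, and replication."""
    pages_per_second = machines * pages_per_machine_per_sec
    bytes_per_second = pages_per_second * avg_page_bytes
    return CrawlEstimate(
        pages_per_second=pages_per_second,
        ingress_gbps_total=bytes_per_second * 8 / 1e9,
        ingress_mbps_per_machine=pages_per_machine_per_sec * avg_page_bytes * 8 / 1e6,
        raw_storage_tb_per_day=bytes_per_second * 86_400 / 1e12,
    )


# Assumed inputs: 10 pages/s per machine and 100 KB per page are
# illustrative guesses, not figures from the question.
estimate = estimate_crawl(
    machines=10_000,
    pages_per_machine_per_sec=10,
    avg_page_bytes=100_000,
)
print(estimate)
```

Under these assumed inputs the fleet fetches 100,000 pages per second, pulls about 80 Gbps in aggregate (8 Mbps per machine), and writes about 864 TB of uncompressed page data per day. The point is the shape of the arithmetic, since each input can easily be off by an order of magnitude.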