PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/System Design/OpenAI

Design a distributed crossword solving service

Last updated: Mar 29, 2026

Quick Overview

This question evaluates distributed systems and large-scale backend architecture skills, including task decomposition, scheduling, dynamic load balancing, fault tolerance, and state management for parallel search workloads in the System Design domain, and focuses on the practical application of architectural design and distributed coordination rather than purely algorithmic implementation. It is commonly asked to assess an engineer's ability to design APIs, orchestrators, worker fleets, and storage strategies that scale and remain correct under failures, demonstrating reasoning about task granularity, work stealing, and job completion detection for expensive combinatorial search problems.

  • hard
  • OpenAI
  • System Design
  • Software Engineer

Design a distributed crossword solving service

Company: OpenAI

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

You are asked to design a backend service that can solve crossword puzzles at scale. A single **puzzle instance** consists of: - A crossword **board** (grid) with numbered Across/Down slots and black squares. - A **dictionary** of valid words, each with a short definition, for example: - `TIRE`: "A wheel on a car" - `RATS`: "Small rodents" - `BATH`: "A tub to clean in" - `BE`: "To, or not to" The system should fill the crossword board with valid words from the dictionary that satisfy all crossing constraints. The final output for a puzzle is a list of filled clues, for example: - `1 Down: A wheel on a car` - `2 Across: Small rodents` - `3 Across: To, or not to` - `3 Down: A tub to clean in` You can assume there exists some backtracking / constraint-satisfaction algorithm that, given a board and dictionary, can search the solution space and determine one or more valid fillings. However, for large boards this search becomes extremely expensive, and a single machine/core doing a naive brute-force search will not finish in acceptable time. Design a **distributed service** that can solve very large crossword puzzles by **splitting the search work across many worker machines**. In particular, address: 1. **API and basic flow** - How clients submit a puzzle (board + dictionary) and retrieve results. - Whether solving should be synchronous or asynchronous. 2. **High-level architecture** - Main components: API layer, coordinator/orchestrator, task queues, worker services, storage. - Where the crossword-solving algorithm runs and how it interacts with the rest of the system. 3. **Job and task scheduling** - How you break down a single large puzzle-solving job into smaller **tasks** that can be executed in parallel. - How tasks represent parts of the search space (e.g., partial board assignments, subsets of words, subsets of constraints). - How workers pick up, execute, and generate new tasks. 4. **Dynamic load balancing and task splitting** - Some tasks will quickly hit a dead end in the search and finish almost immediately. - Other tasks may explode into a huge search subtree and run for a long time. - Describe how you: - Avoid having some machines idle while others are stuck on huge tasks. - Dynamically split heavy tasks into smaller ones and redistribute them. - Potentially use work stealing or other techniques to balance the load. 5. **Scalability and performance** - How you scale to many concurrent puzzles and very large boards. - How you choose task granularity to balance overhead versus parallelism. 6. **Fault tolerance and correctness** - How you handle worker failures, retries, and ensure that tasks are not lost. - How you avoid or handle duplicate task execution. - How you detect that a puzzle is fully solved or unsolvable and finalize the job. 7. **Data and state management** - Where you store puzzle definitions, dictionaries, intermediate states/partial solutions, and final solutions. - How you track the progress of each puzzle-solving job. Clearly explain your design assumptions, trade-offs, and any heuristics you would use to make this practical in production. Focus on the **distributed job scheduling and task management** aspects rather than the detailed crossword-solving algorithm itself.

Quick Answer: This question evaluates distributed systems and large-scale backend architecture skills, including task decomposition, scheduling, dynamic load balancing, fault tolerance, and state management for parallel search workloads in the System Design domain, and focuses on the practical application of architectural design and distributed coordination rather than purely algorithmic implementation. It is commonly asked to assess an engineer's ability to design APIs, orchestrators, worker fleets, and storage strategies that scale and remain correct under failures, demonstrating reasoning about task granularity, work stealing, and job completion detection for expensive combinatorial search problems.

Related Interview Questions

  • Design Video Generation Orchestration - OpenAI (medium)
  • Design CI/CD Build Caching - OpenAI
  • Design an Instagram-like Feed System - OpenAI (medium)
  • Design Online Chess Matchmaking - OpenAI (hard)
  • Design Android MVVM API Architecture - OpenAI (medium)
OpenAI logo
OpenAI
Oct 17, 2025, 12:00 AM
Software Engineer
Technical Screen
System Design
22
0
Loading...

You are asked to design a backend service that can solve crossword puzzles at scale.

A single puzzle instance consists of:

  • A crossword board (grid) with numbered Across/Down slots and black squares.
  • A dictionary of valid words, each with a short definition, for example:
    • TIRE : "A wheel on a car"
    • RATS : "Small rodents"
    • BATH : "A tub to clean in"
    • BE : "To, or not to"

The system should fill the crossword board with valid words from the dictionary that satisfy all crossing constraints. The final output for a puzzle is a list of filled clues, for example:

  • 1 Down: A wheel on a car
  • 2 Across: Small rodents
  • 3 Across: To, or not to
  • 3 Down: A tub to clean in

You can assume there exists some backtracking / constraint-satisfaction algorithm that, given a board and dictionary, can search the solution space and determine one or more valid fillings. However, for large boards this search becomes extremely expensive, and a single machine/core doing a naive brute-force search will not finish in acceptable time.

Design a distributed service that can solve very large crossword puzzles by splitting the search work across many worker machines. In particular, address:

  1. API and basic flow
    • How clients submit a puzzle (board + dictionary) and retrieve results.
    • Whether solving should be synchronous or asynchronous.
  2. High-level architecture
    • Main components: API layer, coordinator/orchestrator, task queues, worker services, storage.
    • Where the crossword-solving algorithm runs and how it interacts with the rest of the system.
  3. Job and task scheduling
    • How you break down a single large puzzle-solving job into smaller tasks that can be executed in parallel.
    • How tasks represent parts of the search space (e.g., partial board assignments, subsets of words, subsets of constraints).
    • How workers pick up, execute, and generate new tasks.
  4. Dynamic load balancing and task splitting
    • Some tasks will quickly hit a dead end in the search and finish almost immediately.
    • Other tasks may explode into a huge search subtree and run for a long time.
    • Describe how you:
      • Avoid having some machines idle while others are stuck on huge tasks.
      • Dynamically split heavy tasks into smaller ones and redistribute them.
      • Potentially use work stealing or other techniques to balance the load.
  5. Scalability and performance
    • How you scale to many concurrent puzzles and very large boards.
    • How you choose task granularity to balance overhead versus parallelism.
  6. Fault tolerance and correctness
    • How you handle worker failures, retries, and ensure that tasks are not lost.
    • How you avoid or handle duplicate task execution.
    • How you detect that a puzzle is fully solved or unsolvable and finalize the job.
  7. Data and state management
    • Where you store puzzle definitions, dictionaries, intermediate states/partial solutions, and final solutions.
    • How you track the progress of each puzzle-solving job.

Clearly explain your design assumptions, trade-offs, and any heuristics you would use to make this practical in production. Focus on the distributed job scheduling and task management aspects rather than the detailed crossword-solving algorithm itself.

Solution

Show

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More System Design•More OpenAI•More Software Engineer•OpenAI Software Engineer•OpenAI System Design•Software Engineer System Design
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.