Design a task system with assignments
Company: Instacart
Role: Software Engineer
Category: Software Engineering Fundamentals
Difficulty: medium
Interview Round: Technical Screen
You are implementing an **in-memory task management system** (no database required — everything lives in process memory and is rebuilt on restart). The system needs basic CRUD operations plus a ranking/sorting capability for listing tasks.
The defining requirement is that a single **Task** can be assigned to **multiple people**, and assignment-specific information (who, when) must not live on the `Task` itself. You will model the "task assigned to a person" relationship as a first-class **Assignment** concept.
This is a low-level / object-oriented design exercise: the interviewer cares about your class boundaries, your in-memory data structures, the interfaces of your operations, and how you reason about deletion semantics, ordering, and production hardening — not about a specific language or framework.
### Constraints & Assumptions
- Pure in-memory; no persistence layer, no external store.
- A `Task` can have **0..N** assignments; an assignee can be on many tasks.
- Single process. Assume single-threaded for the core design unless the interviewer asks about concurrency (Part C).
- Scale is modest (thousands of tasks, not millions) — favor clarity, but be ready to discuss what changes at larger N.
- IDs are server-generated and unique (UUID or a monotonic counter).
### Clarifying Questions to Ask
Ask these up front to scope the whole problem before writing any code:
- Is `status` (e.g. OPEN/DONE) a property of the **task as a whole**, or **per assignee** (one person finished, another hasn't)? This decides where `status` lives.
- When a task is deleted, should its assignments be **cascade-deleted**, should deletion be **blocked** while assignments exist, or should it be a **soft delete** preserving history?
- What is the primary **ranking key** for listing tasks (priority? due/end time? creation order? a composite?), and is the order **ascending or descending**?
- Can the same person be assigned to the same task **twice** (e.g. two time windows), or is `(taskId, assigneeId)` unique?
- Are we ranking **tasks** globally, or "tasks for a given assignee"? They can require different indexes.
- What should happen on **invalid input** — e.g. `startTime > endTime`, or creating an assignment for a non-existent task?
### Part A — Core data model
Design the classes / data structures. `Task` should hold only **general, task-wide metadata** (`taskId`, `title`, `description`, `createdAt`, `priority`, etc.). Each **Assignment** represents "this task assigned to this person" and holds assignment-specific fields (`assignmentId`, `taskId`, `assigneeId`, `startTime`, `endTime`, and any other per-assignee facts). Explain which fields belong on `Task` vs `Assignment` and **why**, and how you will **link** assignments to their task (IDs vs object references) and what in-memory indexes you keep.
```hint Where to start
Run the "would this field be wrong if the task had two assignees?" test on each field. `title`/`priority` are shared → `Task`. `assigneeId`/`startTime`/`endTime` differ per person → `Assignment`. `status` depends on the clarifying answer.
```
```hint Data structures
Keep entity maps flat (keyed by ID) and add **reverse indexes** so that "assignments for a task" and "tasks for a person" are O(1) lookups rather than O(N) scans. Link by ID, not object reference — references make deletes fragile.
```
#### What This Part Should Cover
- A clean rationale for the split: shared/task-wide facts on `Task`, per-assignee facts on `Assignment`, with the explicit "two assignees" argument for why folding them into `Task` breaks.
- A concrete in-memory layout: ID-keyed maps for the entities **plus** at least one reverse index so that "assignments for a task" and "tasks for a person" are not O(N) scans.
- Correct placement of the ambiguous fields (especially `status`) justified against the clarifying answer.
### Part B — Operations (CRUD + ranking)
Specify and implement (at least at interface level) the operations: create/read/update/delete **tasks**; create/update/delete **assignments**; and **list tasks** with a ranking/ordering requirement (sort by priority, due/end time, created time, or a composite key). Then resolve the two decisions the prompt forces: **what happens when you delete a task that still has assignments**, and **what your ranking key is and whether ties are stable**.
```hint Operation signatures
Make each mutation return enough to chain (`createTask(...) -> taskId`, `createAssignment(taskId, assigneeId, start, end) -> assignmentId`). Validate referential integrity in `createAssignment` (the `taskId` must exist) before inserting.
```
```hint Deletion
Deleting a task must also clean up the reverse indexes for every child assignment, or you leak dangling `assignmentId`s. Pick one policy (cascade / restrict / soft-delete) and apply it consistently — don't half-delete.
```
```hint Ranking & ties
A plain comparator over the task list is fine at this scale: `sort by priority DESC, then endTime ASC`. Make ordering **total** by appending a deterministic final tiebreaker (e.g. `taskId`) so equal-key tasks have a stable, repeatable order. If they ask for frequent "top-K", reach for a heap/priority queue.
```
#### What This Part Should Cover
- A complete, named set of CRUD interfaces for both entities, with read paths that use the reverse indexes (list-by-task, list-by-assignee).
- A **stated** delete policy with the index cleanup it implies, and an acknowledgment of the trade-off (cascade is simplest; soft-delete preserves audit).
- A **total, deterministic** ranking: an explicit composite key, an explicit direction, and a tiebreaker that makes ties stable; plus handling of missing fields (e.g. tasks with no due date).
### Part C — Production / refactor discussion
If this were production code, how would you refactor it? Discuss **layering** (API / service / storage), **validation and invariants**, **error handling and contracts**, **testability**, **concurrency**, and how the design **evolves** as requirements change.
```hint Seams
The single biggest refactor is hiding the maps behind a `Repository`/store interface so the service layer never touches storage directly — that one seam buys you a future DB swap, mockable tests, and a place to put locking.
```
#### What This Part Should Cover
- A three-layer split (API/controller → service/business rules → repository/store) with the store interface as the swappable seam toward a real database.
- Concrete invariants and validation (`startTime <= endTime`, task-exists-before-assignment, optional `(taskId, assigneeId)` uniqueness) and a typed error contract (`NotFound` / `ValidationError` / `Conflict`).
- A thread-safety story (per-store or per-task locking, or concurrent maps with care around compound read-modify-write) and a testability plan (unit tests for ranking and delete semantics, contract tests for the repository).
- An evolvability argument showing the model absorbs likely changes (per-assignee completion, reassignment history/audit, new filter/search indexes) without reshaping `Task`.
### Follow-up Questions
- Suppose listing tasks by priority becomes a hot path and N grows large. How do you avoid re-sorting the whole collection on every call — what index would you maintain, and what does each mutation now cost?
- A new requirement: tasks have **dependencies** (task B can't start until task A is done). How does your model change, and how would you detect a dependency cycle?
- The interviewer adds **per-assignee completion** and then asks for "tasks where everyone has completed." How is that query served efficiently from your data structures?
- How would you support **reassignment with full history** (who was assigned, when, and why it changed) without losing the current "active assignments" fast path?
Quick Answer: This question evaluates data modeling, API/interface design, and system-design competencies—focusing on entity relationships (Task versus Assignment), CRUD semantics, and ranking/sorting behavior in an in-memory task management system.