System Design: Minimal ChatGPT-like Service With Reusable Presets
Context
Design a multi-tenant conversational AI service that supports reusable presets (system prompts) at user and team scope. A preset can be versioned, shared, approved, applied to a chat session, and soft-deleted. The service should expose public APIs, manage session state within context windows, and operate reliably and cost-effectively at scale.
Requirements
Specify the following:
-
Core components and their responsibilities.
-
Data model for presets:
-
Fields and metadata.
-
Ownership and scope (user, team, org, public).
-
Versioning and soft delete.
-
How presets are referenced in a chat session.
-
Public APIs:
-
CRUD for presets, list and search, apply-to-session, clone and share.
-
Request and response examples.
-
Auth and RBAC considerations.
-
Session state handling and context-window management:
-
How messages and presets combine into prompts.
-
Truncation and summarization strategy.
-
Storage choices for messages and presets, indexing, and caching.
-
Scalability plan:
-
Stateless workers, queueing and backpressure, autoscaling.
-
Latency and SLO targets.
-
Cost controls.
-
Multi-tenant isolation, rate limiting, and quota.
-
Safety and privacy controls:
-
PII redaction, preset approval flow.
-
Audit logging and observability.
-
Failure modes and fallbacks:
-
Missing preset, model timeout, partial outage.
-
Extensibility:
-
A B testing different presets, preset libraries, shareable links.
-
Capacity estimates for 100k DAU.
-
Sequence of calls for:
-
(a) Create a preset.
-
(b) Send a message using a preset.