PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/ML System Design/Airtable

Design NL-to-Formula assistant for Airtable

Last updated: Jun 15, 2026

Quick Overview

Airtable software-engineer onsite ML system design question: design an assistant that turns natural-language requests into the product's formula language, grounded in the live table schema, and optionally applies them to the base via the Airtable API. It tests LLM orchestration with structured outputs / a typed IR, schema grounding and type checking, ambiguity handling, static + dynamic formula validation, prompt-injection safety, confirmation-gated writes, latency trade-offs, and offline/online evaluation.

  • medium
  • Airtable
  • ML System Design
  • Software Engineer

Design NL-to-Formula assistant for Airtable

Company: Airtable

Role: Software Engineer

Category: ML System Design

Difficulty: medium

Interview Round: Onsite

##### Question Design an ML-powered assistant for a no-code table product like Airtable that converts a user's **natural-language request** into a **formula expression** in the product's formula language, grounded in the live table's schema. You are given an **Airtable API key** (read/write access to a base/table) and an **LLM API key** (e.g., Claude) you can call. Users type requests such as: - “Add 2 to `salary`.” - “If `Status` is "active", set `bonus` to 10% of `salary`.” - “Create a column that is `salary` * 1.1.” - “If `Status` is "Won" and `Amount` > 1000, return `Amount` * 0.1, else 0.” - “Extract the domain from the `Email` field.” - “Label a row "Late" if `Due Date` is before today and `Completed` is false.” Walk through the design covering: 1. **End-to-end architecture.** Components and data flow across the client/UI, an orchestrator API, the LLM layer, the Airtable connector, validators/tools, and observability. 2. **NL → formula generation.** Prompting and tool/structured-output strategy that grounds generation in the actual field names and types. Discuss using a typed intermediate representation (IR) that you compile deterministically into the formula string to cut down syntax errors. 3. **Output contract.** Beyond the proposed formula, return a short **explanation** and optionally a few **test cases / example evaluations** the user can verify. 4. **Schema grounding and type checking.** Pass field names and types (number, currency, date, bool, single-select, text, plus null/blank) into the model; forbid invented fields and functions; require references to match the real schema. 5. **Ambiguity handling.** Ask clarifying questions when the target field is unclear (e.g. `Base Salary` vs `Salary (USD)`), the business logic is ambiguous (what does “late” mean — timezone, inclusivity?), or the output type is unclear. Support multi-turn refinement. 6. **Validation before any write.** Static checks (parse to AST, grammar validation, referenced-field existence, operator/argument type-checking) and dynamic checks (sandbox-evaluate on a small sample of rows to catch divide-by-zero, null/blank handling, runtime errors). On failure, feed the error back to the LLM for a bounded repair loop. 7. **Applying changes to the live sheet via the Airtable API.** Support two execution modes: (a) **create/update a formula field** (the preferred, non-destructive path), and (b) **materialize computed values** into a target field when formula fields aren't allowed. Show a preview/diff on sample rows and require user confirmation before applying to the full dataset. 8. **Safety.** Treat user text as untrusted; never hand the model credentials — the model only requests actions a server-side tool executes. Defend against prompt injection and data exfiltration, redact PII in samples, enforce a capability/operation whitelist, and gate destructive actions (overwriting non-formula fields, bulk updates above a threshold, schema changes) behind explicit confirmation. 9. **Latency and reliability.** Target an interactive P95 of ~2–3s: cache schema summaries by schema hash, use a small/fast model on the first pass and escalate to a larger model on failure, cap repair iterations, and stream the explanation while the formula finalizes. For writes, use idempotency keys, exponential backoff on 429/5xx, batched writes, and partial-failure reporting (failed record IDs). 10. **Evaluation.** Offline: a labeled set of (schema, request) → expected formula scored by AST equivalence and execution accuracy on test rows, plus validation pass-rate. Online: acceptance/insert-without-edit rate, edit distance from suggestion to final, clarification rate, preview-to-apply conversion, turns-to-success, and post-apply undo/correction rate. 11. **Edge cases and failure modes.** Hallucinated fields/functions, wrong operator precedence, null/blank values (`IF({Salary}, {Salary}+2, BLANK())`), type mismatches, locale/percent/currency parsing (“10%” → 0.1), timezone handling, and schema changing between preview and apply (prefer field IDs over names).

Quick Answer: Airtable software-engineer onsite ML system design question: design an assistant that turns natural-language requests into the product's formula language, grounded in the live table schema, and optionally applies them to the base via the Airtable API. It tests LLM orchestration with structured outputs / a typed IR, schema grounding and type checking, ambiguity handling, static + dynamic formula validation, prompt-injection safety, confirmation-gated writes, latency trade-offs, and offline/online evaluation.

Airtable logo
Airtable
Nov 4, 2025, 12:00 AM
Software Engineer
Onsite
ML System Design
56
0
Question

Design an ML-powered assistant for a no-code table product like Airtable that converts a user's natural-language request into a formula expression in the product's formula language, grounded in the live table's schema. You are given an Airtable API key (read/write access to a base/table) and an LLM API key (e.g., Claude) you can call.

Users type requests such as:

  • “Add 2 to salary .”
  • “If Status is "active", set bonus to 10% of salary .”
  • “Create a column that is salary * 1.1.”
  • “If Status is "Won" and Amount > 1000, return Amount * 0.1, else 0.”
  • “Extract the domain from the Email field.”
  • “Label a row "Late" if Due Date is before today and Completed is false.”

Walk through the design covering:

  1. End-to-end architecture. Components and data flow across the client/UI, an orchestrator API, the LLM layer, the Airtable connector, validators/tools, and observability.
  2. NL → formula generation. Prompting and tool/structured-output strategy that grounds generation in the actual field names and types. Discuss using a typed intermediate representation (IR) that you compile deterministically into the formula string to cut down syntax errors.
  3. Output contract. Beyond the proposed formula, return a short explanation and optionally a few test cases / example evaluations the user can verify.
  4. Schema grounding and type checking. Pass field names and types (number, currency, date, bool, single-select, text, plus null/blank) into the model; forbid invented fields and functions; require references to match the real schema.
  5. Ambiguity handling. Ask clarifying questions when the target field is unclear (e.g. Base Salary vs Salary (USD) ), the business logic is ambiguous (what does “late” mean — timezone, inclusivity?), or the output type is unclear. Support multi-turn refinement.
  6. Validation before any write. Static checks (parse to AST, grammar validation, referenced-field existence, operator/argument type-checking) and dynamic checks (sandbox-evaluate on a small sample of rows to catch divide-by-zero, null/blank handling, runtime errors). On failure, feed the error back to the LLM for a bounded repair loop.
  7. Applying changes to the live sheet via the Airtable API. Support two execution modes: (a) create/update a formula field (the preferred, non-destructive path), and (b) materialize computed values into a target field when formula fields aren't allowed. Show a preview/diff on sample rows and require user confirmation before applying to the full dataset.
  8. Safety. Treat user text as untrusted; never hand the model credentials — the model only requests actions a server-side tool executes. Defend against prompt injection and data exfiltration, redact PII in samples, enforce a capability/operation whitelist, and gate destructive actions (overwriting non-formula fields, bulk updates above a threshold, schema changes) behind explicit confirmation.
  9. Latency and reliability. Target an interactive P95 of ~2–3s: cache schema summaries by schema hash, use a small/fast model on the first pass and escalate to a larger model on failure, cap repair iterations, and stream the explanation while the formula finalizes. For writes, use idempotency keys, exponential backoff on 429/5xx, batched writes, and partial-failure reporting (failed record IDs).
  10. Evaluation. Offline: a labeled set of (schema, request) → expected formula scored by AST equivalence and execution accuracy on test rows, plus validation pass-rate. Online: acceptance/insert-without-edit rate, edit distance from suggestion to final, clarification rate, preview-to-apply conversion, turns-to-success, and post-apply undo/correction rate.
  11. Edge cases and failure modes. Hallucinated fields/functions, wrong operator precedence, null/blank values ( IF({Salary}, {Salary}+2, BLANK()) ), type mismatches, locale/percent/currency parsing (“10%” → 0.1), timezone handling, and schema changing between preview and apply (prefer field IDs over names).

Solution

Show

Submit Your Answer

Sign in to leave a comment

Loading comments...

Browse More Questions

More ML System Design•More Airtable•More Software Engineer•Airtable Software Engineer•Airtable ML System Design•Software Engineer ML System Design
PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.