
The schema artifact

The contract

The schema artifact is the single directory (or, once bundled, the single file) your team authors, reviews, and checks into version control. It is the context the model uses to generate SQL, and it is yours to evolve like any other piece of source code.

A schema artifact is a folder with one required file and a few optional companions:

my-app.schema/
  schema.json         # required: physical structure
  tables/
    users.md          # optional: per-table descriptions, aliases, concepts
    orders.md
  concepts.md         # optional: cross-table business concepts
  schema.lock.json    # optional: bundle integrity

Only schema.json is required. An artifact with no markdown is valid; you can add enrichment incrementally.
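The layout rule above can be checked mechanically. The following is a minimal sketch, not part of AskDB: `check_artifact` is a hypothetical helper, and it assumes that `concepts.md` and `schema.lock.json` sit at the top level of the directory next to `schema.json`.

```python
from pathlib import Path


def check_artifact(root: Path) -> dict:
    """Summarize a schema artifact directory (hypothetical helper, not an AskDB API)."""
    # schema.json is the only required file.
    if not (root / "schema.json").is_file():
        raise FileNotFoundError(f"{root}: schema.json is required")

    # Everything else is optional enrichment or integrity metadata.
    tables_dir = root / "tables"
    table_docs = sorted(p.name for p in tables_dir.glob("*.md")) if tables_dir.is_dir() else []
    return {
        "table_docs": table_docs,
        "has_concepts": (root / "concepts.md").is_file(),
        "has_lock": (root / "schema.lock.json").is_file(),
    }
```

Calling `check_artifact(Path("my-app.schema"))` on a directory that contains only `schema.json` returns empty and false values rather than failing, which matches the idea that enrichment can be added incrementally.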

When you need one portable file (for distributing to a service, an agent, or a build artifact), bundle the directory:

npx askdb bundle my-app.schema --out my-app.schema.bundle.json

schema.json holds the structural facts: stable IDs, physical table and column names, types, nullability, primary keys, relationships, and sensitivity flags.

{
  "version": 2,
  "schemaId": "my-app",
  "tables": [
    {
      "id": "table:users",
      "name": "users",
      "columns": [
        {
          "id": "table:users#email",
          "name": "email",
          "type": "text",
          "nullable": false,
          "sensitive": true
        }
      ]
    }
  ]
}

This file is the physical layer. It is regenerated by askdb introspect, so you don’t hand-edit it.
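Because schema.json is plain JSON, its structural facts are easy to inspect with ordinary tooling. As a minimal sketch, the snippet below lists the stable IDs of columns flagged sensitive in the example above. The `sensitive_columns` helper is hypothetical, and treating a missing `sensitive` key as "not sensitive" is an assumption, not documented behavior.

```python
import json

# The example schema.json from above, inlined so the sketch is self-contained.
SCHEMA_JSON = """
{
  "version": 2,
  "schemaId": "my-app",
  "tables": [
    {
      "id": "table:users",
      "name": "users",
      "columns": [
        {"id": "table:users#email", "name": "email", "type": "text",
         "nullable": false, "sensitive": true}
      ]
    }
  ]
}
"""


def sensitive_columns(schema: dict) -> list:
    """Return the stable IDs of every column flagged sensitive (hypothetical helper)."""
    return [
        column["id"]
        for table in schema.get("tables", [])
        for column in table.get("columns", [])
        if column.get("sensitive", False)  # assumption: absent means not sensitive
    ]


schema = json.loads(SCHEMA_JSON)
print(sensitive_columns(schema))  # prints ['table:users#email']
```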

Per-table markdown adds the meaning the model needs to generate good SQL: descriptions, aliases, common query language, example questions, and column-level notes.

---
id: table:orders
name: orders
schemaId: my-app
aliases: [purchases, sales]
columns:
  - id: table:orders#status
    enum: [pending, paid, shipped, cancelled]
    description: Order lifecycle state.
---
# Table: orders
Customer purchase orders. One row per submitted order.
## Common query language
- "sales" usually means paid orders
- "active customers" excludes status = cancelled

This layer survives re-introspection. When your database changes, askdb introspect updates schema.json and leaves your markdown intact.
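A per-table file like the one above has two parts: front matter between `---` lines, carrying IDs and column notes, and a markdown body. The sketch below separates them using only the standard library. `split_front_matter` is a hypothetical helper, not an AskDB function, and parsing the front matter itself would need a YAML library, which is left out here.

```python
def split_front_matter(text: str) -> tuple:
    """Split a per-table markdown file into (front_matter, body).

    Assumes the front matter is delimited by lines containing only '---'.
    Returns an empty front matter when the file has none.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    # No closing delimiter: treat the whole file as body.
    return "", text


doc = """---
id: table:orders
name: orders
---
# Table: orders
Customer purchase orders. One row per submitted order.
"""

front, body = split_front_matter(doc)
print(front)
print(body)
```

Keeping the two parts separable is what lets tooling read the IDs in the front matter without touching the prose your team wrote in the body.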

The model never sees the directory you author. At generation time, AskDB merges the physical layer and the enrichment into a single compact, model-facing text: a DDL-style description in which each table is rendered as a block and your enrichment (descriptions, aliases, and “common query language”) is folded in as inline comments on the tables and columns it describes. Sensitive fields are tagged or withheld at this stage, according to your mode.

So the two layers you maintain separately on disk (schema.json for structure, markdown for meaning) are assembled into one annotated schema description the model reads. That repackaged text, plus the question, is the entire prompt; your rows, credentials, and query results are never included. When a schema is too large to send whole, the same rendering is applied to only the relevant slice retrieved for the question (see RAG for large schemas).
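To make the merge concrete, here is a rough sketch of how a physical table entry and its enrichment might be folded into one DDL-style block. This is an illustration only: the exact text AskDB produces is not specified here, and `render_table`, the shape of `notes`, the `withhold_sensitive` flag, and the physical entry for `orders` are all assumptions made for the example.

```python
def render_table(table: dict, notes: dict, withhold_sensitive: bool = False) -> str:
    """Render one table as a DDL-style block with enrichment as inline comments.

    `notes` is a hypothetical enrichment shape:
    {"description": str, "aliases": [str], "columns": {column_id: str}}.
    """
    lines = []
    if notes.get("description"):
        lines.append(f"-- {notes['description']}")
    if notes.get("aliases"):
        lines.append(f"-- aliases: {', '.join(notes['aliases'])}")
    lines.append(f"CREATE TABLE {table['name']} (")

    rendered = []
    for col in table["columns"]:
        sensitive = col.get("sensitive", False)
        if sensitive and withhold_sensitive:
            continue  # withheld: the column never reaches the prompt
        decl = f"  {col['name']} {col['type']}"
        if not col.get("nullable", True):
            decl += " NOT NULL"
        comments = ["sensitive"] if sensitive else []
        # Enrichment is looked up by stable ID, not by physical name.
        description = notes.get("columns", {}).get(col["id"])
        if description:
            comments.append(description)
        rendered.append((decl, comments))

    for i, (decl, comments) in enumerate(rendered):
        comma = "," if i < len(rendered) - 1 else ""
        suffix = f"  -- {'; '.join(comments)}" if comments else ""
        lines.append(f"{decl}{comma}{suffix}")
    lines.append(");")
    return "\n".join(lines)


# Hypothetical physical entry for orders; the enrichment mirrors orders.md.
orders_table = {
    "id": "table:orders",
    "name": "orders",
    "columns": [
        {"id": "table:orders#id", "name": "id", "type": "integer", "nullable": False},
        {"id": "table:orders#status", "name": "status", "type": "text", "nullable": False},
    ],
}
orders_notes = {
    "description": "Customer purchase orders. One row per submitted order.",
    "aliases": ["purchases", "sales"],
    "columns": {"table:orders#status": "Order lifecycle state."},
}

print(render_table(orders_table, orders_notes))
```

The design point this illustrates is that the model-facing text is derived, never authored: both inputs stay reviewable on disk, and the rendering step is the one place where sensitive columns are tagged or dropped.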

Every table, column, and concept has a stable ID. Renaming a column in the database doesn’t break the enrichment that references it: the name changes, but the ID stays the same.

| ID | Refers to |
| --- | --- |
| table:users | A table |
| table:users#email | A column |
| concept:customer | A cross-table business concept |
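The three ID shapes above follow a simple pattern: a kind prefix, a colon, a name, and for columns a `#` followed by the column segment. A small sketch of a parser for that pattern follows. `StableId` and `parse_id` are hypothetical names, and the grammar is inferred from these three examples rather than taken from a specification.

```python
from typing import NamedTuple, Optional


class StableId(NamedTuple):
    kind: str               # "table" or "concept"
    name: str               # ID segment after the prefix, e.g. "users"
    column: Optional[str]   # column segment after "#", or None


def parse_id(raw: str) -> StableId:
    """Parse an ID such as 'table:users#email' (grammar inferred from the examples)."""
    kind, colon, rest = raw.partition(":")
    if not colon or kind not in ("table", "concept"):
        raise ValueError(f"unrecognized ID: {raw!r}")
    name, hash_mark, column = rest.partition("#")
    if not name or (hash_mark and (kind != "table" or not column)):
        raise ValueError(f"malformed ID: {raw!r}")
    return StableId(kind, name, column if hash_mark else None)


# Enrichment is keyed by ID, so a physical rename leaves the lookup intact.
renamed = {"id": "table:users#email", "name": "email_address"}  # hypothetical rename
notes = {"table:users#email": "Login email for the account."}   # hypothetical note
print(parse_id(renamed["id"]))
print(notes[renamed["id"]])
```

Note that the segments inside an ID are identifiers, not necessarily the current physical names: after the hypothetical rename above, the column is called `email_address` while its ID still ends in `#email`.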

The schema artifact is a deliberate choice. The alternative — a prompt template buried in application code — is opaque, hard to review, and tangled with the rest of your codebase.

A versionable file means:

  • Your data team can review changes in a PR like any other code.
  • You can see, line by line, what context the model receives.
  • Enrichment evolves at its own pace, decoupled from app deploys.
  • The same artifact powers the CLI, the library, and the HTTP API — one source of truth.