Nodal Context is the open-source, interview-built context layer for analytics agents. It builds that context with your analyst — one domain at a time — and writes it to a git repo your team reviews by PR. Apache-2.0. Runs on your stack with Claude Code, Codex, or Cursor.
Open source · Apache-2.0 · coming soon
The obvious approach, and what most tools do, is to ingest your warehouse, dbt, BI layer, and query history and auto-generate the context. Teams that have measured this found that auto-generated context doesn't hold up as a source of truth.
Anthropic's own data team reported that auto-generating metric definitions from raw tables and query logs "encoded the very ambiguities we were trying to eliminate" — and was net-negative on evals versus a smaller, human-curated layer. They also gave an agent grep access to thousands of prior queries; accuracy moved less than a point. The information was present, the agent saw it, and it still didn't resolve the question to the right entity.
Their conclusion: generate the documentation with the model, but have a human own the definition. That's exactly what the interview does.
Scraped from schema and query logs. Encodes the same ambiguities it was meant to remove. Nobody owns it, so nobody trusts it.
The model drafts; the analyst owns every definition. We auto-extract schema and dbt as a draft to correct — but the analyst's confirmations, not the extraction, are what we trust.
Three of these you already have in your stack. The fourth — what your metrics actually mean — is the one that decides whether the rest works, and the one Nodal Context builds.
Snowflake, BigQuery, or Redshift — with a dbt project sitting on top. The system of record the agent queries against.
Column-level flow from raw tables through transformations to dashboards — so the agent knows where every number actually comes from.
dbt project, DAG pipelines (Airflow, Dagster), and scripts repo — the queries your team has already written are the ground truth.
What metrics actually mean — the piece most teams skip. This is what the interview captures, with your analyst owning every definition.
The skill runs a structured interview with the analyst and writes the answers down. It asks one question at a time, in business language, so a working analyst answers questions instead of filling in a blank YAML file.
Auto-extract schema and dbt into a draft the analyst reacts to — so they're correcting, not staring at a blank page. Nothing unconfirmed is trusted.
What the business does, how it makes money, and the handful of terms that get misunderstood — captured with the meaning the analyst confirms.
"List the dashboards your team maintains." Each cluster becomes a domain the agent should know — with its tables, grain, and business context.
Disambiguate the terms that map to data values — "provider" as an individual clinician versus a care-provider company — so the agent routes to the right one.
"Where would an obvious query give a plausibly wrong answer?" The silent-failure modes only the analyst knows — the piece that decides whether the rest works.
Answer sample questions with context off and on, against the live warehouse, and confirm against the dashboard — so the analyst sees the context work before committing to the next domain.
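The draft-then-confirm rule in the steps above can be sketched in a few lines of Python. This is an illustration only, not Nodal Context's implementation: the `Definition` class, the `Status` values, and the `confirm` and `trusted` helpers are hypothetical names, and "X" and "Y" stand in for real meanings.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Status(Enum):
    DRAFT = "draft"          # auto-extracted, not yet reviewed by the analyst
    CONFIRMED = "confirmed"  # the analyst has signed off


@dataclass(frozen=True)
class Definition:
    term: str
    meaning: str
    provenance: str          # hypothetical values: "schema", "dbt", "interview"
    status: Status = Status.DRAFT


def confirm(defn, corrected_meaning=None):
    """Mark a draft as confirmed, optionally replacing its meaning."""
    return replace(
        defn,
        meaning=corrected_meaning or defn.meaning,
        provenance="interview",
        status=Status.CONFIRMED,
    )


def trusted(definitions):
    """Return only the definitions an analyst has confirmed."""
    return [d for d in definitions if d.status is Status.CONFIRMED]


# Two auto-extracted drafts; the analyst corrects and confirms only the first.
drafts = [
    Definition("active client", "Y", "schema"),
    Definition("provider", "value of provider_id", "dbt"),
]
reviewed = [confirm(drafts[0], "X"), drafts[1]]

for d in trusted(reviewed):
    print(d.term, "->", d.meaning, f"({d.provenance})")
```

The unconfirmed "provider" draft stays in the list for the analyst to react to later, but it is filtered out of anything the agent would read.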
The interview writes the Analytics Context Format (ACF): git-friendly YAML and Markdown your team reviews by PR. The file the agent actually reads is written for an LLM — explicit routing, not prose.
Rules take the form IF … DO NOT … use …, covering routing, grain, exclusions, and wrong-answer modes.
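As a rough illustration of explicit routing written for an LLM rather than as prose, the sketch below renders one rule in the IF … DO NOT … use … shape. The ACF schema is not shown here, so the `RoutingRule` fields and the table names `clinicians` and `care_provider_companies` are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingRule:
    condition: str  # when the rule applies
    avoid: str      # the plausible-but-wrong target (hypothetical table name)
    use: str        # the target the analyst confirmed (hypothetical table name)

    def render(self) -> str:
        """Render the rule as one explicit instruction line."""
        return f"IF {self.condition} DO NOT use {self.avoid}; use {self.use}."


rule = RoutingRule(
    condition='the question says "provider" and means an individual clinician,',
    avoid="care_provider_companies",
    use="clinicians",
)
print(rule.render())
```

Keeping the wrong target next to the right one is the point of the shape: the agent is told which obvious choice to skip, not just which table exists.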
Every disambiguation the analyst makes in the interview — "active client means X, not Y" — is simultaneously a context entry and a labeled eval pair. One confirmation, two assets.
question: "What's our collection rate for Payer X last quarter?"
intent: collection rate on adjudicated claims; "Payer X" resolves state-specifically (TX vs FL), not aggregated; exclude sessions < 45 days old.
provenance: interview · status: confirmed
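The "one confirmation, two assets" idea can be sketched as a single record projected two ways. The field names follow the example above; the `Confirmation` class, the two helper functions, and the output dictionary keys are hypothetical, not part of ACF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confirmation:
    question: str
    intent: str
    provenance: str = "interview"
    status: str = "confirmed"


def to_context_entry(c):
    """What the agent reads: the confirmed meaning and where it came from."""
    return {"intent": c.intent, "provenance": c.provenance, "status": c.status}


def to_eval_pair(c):
    """What the harness scores against: a question and its expected intent."""
    return {"question": c.question, "expected_intent": c.intent}


c = Confirmation(
    question="What's our collection rate for Payer X last quarter?",
    intent=(
        'collection rate on adjudicated claims; "Payer X" resolves '
        "state-specifically (TX vs FL), not aggregated; "
        "exclude sessions < 45 days old."
    ),
)
entry, pair = to_context_entry(c), to_eval_pair(c)
print(entry["status"], "|", pair["question"])
```

Because both outputs derive from the same record, the context entry and the eval label cannot drift apart.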
A format-agnostic harness runs your agent three ways — context off, context on, and against ground truth — and reads off the delta. It accepts ACF, dbt docs, or raw markdown. Bring whatever context you already have.
34 confirmed seeds from the interview
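A minimal sketch of the off/on comparison, assuming the context is passed to the agent as opaque text (which is what lets ACF, dbt docs, or raw markdown be swapped in). The `Agent` signature, `accuracy`, `context_delta`, and the toy agent are all hypothetical stand-ins, not the harness's actual interface.

```python
from typing import Callable, Optional

# Hypothetical agent interface: (question, context text or None) -> answer.
Agent = Callable[[str, Optional[str]], str]


def accuracy(agent, cases, context):
    """Fraction of (question, ground_truth) cases the agent answers correctly."""
    if not cases:
        raise ValueError("need at least one eval case")
    correct = sum(1 for question, truth in cases if agent(question, context) == truth)
    return correct / len(cases)


def context_delta(agent, cases, context):
    """Score the agent with context off and on, then report the difference."""
    off = accuracy(agent, cases, None)
    on = accuracy(agent, cases, context)
    return {"off": off, "on": on, "delta": on - off}


def toy_agent(question, context):
    # Stand-in agent: picks the wrong table unless the context names the right one.
    if context and "clinicians" in context:
        return "clinicians"
    return "care_provider_companies"


cases = [("How many providers saw patients last month?", "clinicians")]
result = context_delta(toy_agent, cases, 'IF "provider" means a clinician, use clinicians.')
print(result)
```

In practice the cases would come from the confirmed interview seeds, and the equality check would be replaced by whatever comparison against ground truth suits the question type.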