Is your AI analytics actually answering correctly today?

Nodal tests your self-service analytics system end-to-end against your real data, dbt project, and business context — using test cases built automatically from work your team has already done.

See Nodal evaluations in action

How Nodal builds a test suite from your BI dashboards, dbt project, and business-context repo — then runs headless agents in parallel against ground truth, surfacing pass, fail, drift, cost, and root cause.

Nobody runs evals on AI analytics.

Eval tools score models on public benchmarks in isolation. But production AI analytics depends on the agent, your dbt project, your warehouse, and your business context working together — the same way users hit it on Monday morning. Most teams know this matters, but it never gets done.

  • Writing 100 test cases by hand is the part nobody does — so nobody runs evals.
  • Stale or changing documentation silently breaks self-service analytics.
  • Non-technical users ask under-specified questions; nobody sees what was asked, what was assumed, or where the system guessed.

The unlock isn't a better model. It's the combination of a sophisticated eval suite and observability.

Evaluations first. Then observability. Then alignment.

For Data Leaders

Prove your AI analytics is working — every day

An eval suite built from artifacts your team already maintains, re-running on every change.

  • Test cases generated from BI dashboards, dbt tests and metrics, and the business-context repo — no hand-written cases.
  • Re-runs on every schema migration, dbt change, or doc update.
  • Drift attributed to the specific commit that caused it, with affected questions listed.
  • A defensible answer to "is it working?" — backed by a test run from this morning.
Drift Detected — 4 questions affected

Since your last dbt update, accuracy on customer segmentation questions dropped from 87% to 61%.

The 3 questions most affected all involve the "enterprise" account definition.

Root cause

dim_accounts.account_tier definition changed in commit a3f8c2d (April 7, 2026)

Proposed fix

Update the "enterprise" entity definition in business-context to match the new account_tier values.

PR business-context#42 — Update enterprise account definition to align with dim_accounts.account_tier refactor Ready for review
Review PR View affected questions View dbt diff
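Under the hood, a generated test case is just a question paired with a ground-truth answer pulled from an artifact your team already trusts. A minimal sketch of the idea, with hypothetical names and a stubbed agent (this is illustrative, not Nodal's actual API):

```python
# Illustrative sketch: a test case derived from a BI dashboard tile,
# evaluated against the tile's value as ground truth.
from dataclasses import dataclass

@dataclass
class TestCase:
    question: str        # the natural-language question a user would ask
    source: str          # the artifact the case was generated from
    ground_truth: float  # the expected answer, taken from the dashboard

def run_case(case: TestCase, agent) -> dict:
    """Ask the agent the question and compare against ground truth."""
    answer = agent(case.question)
    passed = abs(answer - case.ground_truth) < 1e-6
    return {"question": case.question, "answer": answer, "passed": passed}

# Hypothetical case generated from a revenue dashboard tile.
case = TestCase(
    question="What was Q1 enterprise revenue?",
    source="dashboards/revenue_overview",
    ground_truth=1_240_000.0,
)

# Stub agent standing in for the real analytics system.
result = run_case(case, agent=lambda q: 1_240_000.0)
print(result["passed"])  # prints True
```

Re-running a suite of cases like this on every dbt or doc change is what turns "is it working?" into a yes/no answer with a timestamp.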
Observability

Catch under-specified questions before they get wrong answers

Non-technical users don't ask fully specified questions. Nodal makes the gaps visible before SQL runs.

  • Question reframed with defaults from your documentation; assumptions shown in brackets the user can change.
  • Confidence score from auditable signals — entity resolution, schema grounding, doc coverage, context freshness.
  • You approve the interpretation, not the SQL.
  • Every under-specified question becomes a signal — and a candidate test case for the eval suite.
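One way to make a confidence score auditable is to keep it a plain weighted combination of named signals, so every number can be traced back to a cause. A sketch under assumed signal names and weights (hypothetical, not Nodal's actual scoring model):

```python
# Illustrative sketch: combining auditable signals into one confidence score.
# Signal names and weights are hypothetical.
SIGNAL_WEIGHTS = {
    "entity_resolution": 0.35,  # did named entities map to known columns?
    "schema_grounding": 0.30,   # do the referenced tables/columns exist?
    "doc_coverage": 0.20,       # are the metrics involved documented?
    "context_freshness": 0.15,  # how recently were those docs updated?
}

def confidence(signals: dict[str, float]) -> float:
    """Weighted sum of per-signal scores, each in [0, 1]."""
    return sum(SIGNAL_WEIGHTS[name] * signals[name] for name in SIGNAL_WEIGHTS)

score = confidence({
    "entity_resolution": 1.0,   # "enterprise" resolved to account_tier
    "schema_grounding": 1.0,
    "doc_coverage": 0.5,        # metric only partially documented
    "context_freshness": 0.4,   # docs last touched a year ago
})
```

Because each signal is inspectable, a low score points at a specific gap (here, stale and partial docs) rather than an opaque model judgment.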
For Data Teams

Cost-benefit analysis for every piece of your context

The eval suite is also a measurement tool — for the docs you maintain and the models you pay for.

  • Failed test cases point at the specific docs and definitions causing wrong answers — a prioritized list grounded in what actually broke.
  • Ablation tests on each context source: drop a data dictionary, a Notion page, a glossary entry — measure the answer-quality delta against the token-cost delta.
  • Model trade-off tests: swap Claude for a cheaper model, Codex for Gemini — read off pass rate vs. cost per run.
  • Cost optimization stops being a guess.
Documentation health report
67% of answered questions relied on dbt column descriptions
23% used Confluence documentation — but 40% of those pages hadn't been updated in over a year
15% lower consistency on questions grounded in stale docs
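The ablation loop behind numbers like these is simple: re-run the suite with one context source removed at a time, then compare pass rate and token cost against the full-context baseline. A sketch with a stubbed harness and made-up numbers (in practice each run would hit the real agent):

```python
# Illustrative sketch: ablating one context source at a time and comparing
# pass rate and token cost against the full-context baseline.
def run_suite(context_sources: set[str]) -> tuple[float, int]:
    """Return (pass_rate, token_cost) for a set of context sources.
    Stubbed lookup table; numbers are for illustration only."""
    results = {
        frozenset({"dbt_docs", "glossary", "notion"}): (0.87, 52_000),
        frozenset({"dbt_docs", "glossary"}): (0.86, 38_000),
        frozenset({"dbt_docs", "notion"}): (0.71, 47_000),
        frozenset({"glossary", "notion"}): (0.55, 44_000),
    }
    return results[frozenset(context_sources)]

baseline_sources = {"dbt_docs", "glossary", "notion"}
base_pass, base_cost = run_suite(baseline_sources)

for source in sorted(baseline_sources):
    pass_rate, cost = run_suite(baseline_sources - {source})
    print(f"drop {source}: pass {pass_rate - base_pass:+.2f}, "
          f"tokens {cost - base_cost:+,}")
```

With these stubbed numbers, dropping the Notion pages costs one point of pass rate but saves 14,000 tokens per run: exactly the kind of trade-off the report above is meant to surface.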

Don't have self-service AI analytics yet?

Before evals matter, four pieces have to be in place. The video walks through each — and why the one most teams skip is the one that decides whether everything else works.

  • Data warehouse — Snowflake, BigQuery, Redshift.
  • dbt project for lineage, models, and tests.
  • Optional scripts repo for harder analyses.
  • Business-context layer for what metrics actually mean — the piece most teams skip.
Talk to us about getting started

Stop guessing whether your AI analytics is correct.

Request Demo