How the assessment works

The scoring is designed to be defensible in front of a board: every number traces to measured metadata, and anything we could not measure says so instead of pretending.

The 6C framework

Assessments score data against six factors, adapted from the Snowflake Labs AI-Ready Data Framework (Apache 2.0) and aligned at the characteristic level with ISO/IEC 5259, the international standard series for data quality in analytics and machine learning.

Clean

Accurate, complete, free of errors that compromise AI output. Assessing this factor requires inspecting data values, so a metadata-only scan reports its requirements as not assessable rather than guessing.

Contextual

Meaning is explicit: documentation, identifiers, types, declared relationships. No tribal knowledge required for AI to interpret the schema.

Consumable

Right format and access path for the workload: partitioning, indexing, embeddings, search optimization.

Current

Freshness enforced by infrastructure: change detection, update cadence, versioning.

Correlated

Traceable from source to every AI decision: provenance, lineage, audit.

Compliant

Governed: classification, masking, row-level security, ownership, retention, consent.

Requirements and profiles

The factors break down into 50 distinct requirements. Each AI workload profile (Estate Scan, RAG, Agents, Training, Feature Serving) selects the requirements that matter for that workload and sets thresholds appropriate to its stakes: a training pipeline needs near-perfect uniqueness; a portfolio triage does not.
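As a rough illustration of how a profile might select requirements and set its own thresholds, consider the sketch below. The requirement ids (`uniqueness`, `column_docs`) and every threshold value are hypothetical, chosen only to show the shape of the idea; they are not the product's actual configuration.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    name: str
    # requirement id -> minimum measured value needed to pass (0.0 to 1.0)
    thresholds: dict[str, float]


# Hypothetical requirement ids and thresholds, for illustration only.
TRAINING = Profile("Training", {"uniqueness": 0.99, "column_docs": 0.80})
ESTATE_SCAN = Profile("Estate Scan", {"column_docs": 0.50})


def evaluate(profile: Profile, measured: dict[str, float]) -> dict[str, bool]:
    """Check only the requirements the profile selects, against its thresholds."""
    return {
        req: measured[req] >= minimum
        for req, minimum in profile.thresholds.items()
        if req in measured
    }


measured = {"uniqueness": 0.97, "column_docs": 0.60}
training_result = evaluate(TRAINING, measured)
estate_result = evaluate(ESTATE_SCAN, measured)
```

With the same measurements, the stricter profile fails both of its requirements while the triage profile ignores uniqueness entirely and passes on documentation.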

Each requirement check runs real metadata SQL against your warehouse and returns the measured value plus the diagnostic numbers behind it. The report shows those numbers, so every score is traceable to facts like "15 of 31 columns documented".
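A minimal sketch of what such a check could look like, using an in-memory SQLite table as a stand-in for a warehouse catalog view. The table name `column_catalog`, its columns, and the result fields are assumptions made for this example, not the SQL the product runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for a warehouse catalog view; table and column names are hypothetical.
conn.execute("CREATE TABLE column_catalog (column_name TEXT, comment TEXT)")
rows = [(f"col_{i}", "described" if i < 15 else None) for i in range(31)]
conn.executemany("INSERT INTO column_catalog VALUES (?, ?)", rows)

# COUNT(comment) skips NULLs, so it counts only documented columns.
documented, total = conn.execute(
    "SELECT COUNT(comment), COUNT(*) FROM column_catalog"
).fetchone()

# Keep the diagnostic numbers next to the measured value so the score stays traceable.
result = {
    "requirement": "column_documentation",
    "value": documented / total,
    "evidence": f"{documented} of {total} columns documented",
}
```

The query reads only catalog metadata (names and comments), never row values, which matches the metadata-only approach described above.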

Honest scoring: not assessable is an answer

Every requirement resolves to passed, failed, or not assessable. Not assessable has three causes: the requirement needs actual data values (all Clean checks; we deliberately never sample data), it lives in a governance catalog outside the warehouse, or the platform's catalog does not expose the signal.
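One way to model the three outcomes and the three causes is sketched below. The enum and function names are illustrative assumptions, not the product's internal types.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    PASSED = "passed"
    FAILED = "failed"
    NOT_ASSESSABLE = "not assessable"


class Reason(Enum):
    NEEDS_DATA_VALUES = "requires inspecting data values"
    EXTERNAL_CATALOG = "lives in a governance catalog outside the warehouse"
    SIGNAL_NOT_EXPOSED = "platform catalog does not expose the signal"


def resolve(
    value: Optional[float],
    threshold: float,
    reason: Optional[Reason] = None,
) -> tuple[Status, Optional[Reason]]:
    """A missing measurement never becomes a pass or a fail."""
    if value is None:
        return Status.NOT_ASSESSABLE, reason
    return (Status.PASSED if value >= threshold else Status.FAILED), None


clean_check = resolve(None, 0.9, Reason.NEEDS_DATA_VALUES)
docs_check = resolve(0.95, 0.9)
```

Carrying the reason alongside the status lets a report say why a requirement could not be measured, rather than only that it could not.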

Scores are renormalized over the requirements that were actually measured. The report states how many requirements were measured, and the radar shows only measured factors. When a heavily weighted factor could not be measured at all, the maturity label is capped: we will not certify data as AI-ready while a factor that carries that much weight is unverified. The verdict says exactly what still needs independent verification.
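The renormalization and the cap can be sketched as follows. The factor weights, the example scores, and the `heavy` cutoff of 0.25 are all hypothetical values picked for the illustration; the real weights depend on the workload profile.

```python
from typing import Optional

# Hypothetical factor weights; real weights depend on the workload profile.
WEIGHTS = {
    "Clean": 0.30, "Contextual": 0.20, "Consumable": 0.10,
    "Current": 0.10, "Correlated": 0.10, "Compliant": 0.20,
}


def overall_score(factor_scores: dict[str, Optional[float]]) -> float:
    """Weighted mean over measured factors only (None means not assessable)."""
    measured = {f: s for f, s in factor_scores.items() if s is not None}
    total_weight = sum(WEIGHTS[f] for f in measured)
    if total_weight == 0:
        raise ValueError("no factor was measured")
    return sum(WEIGHTS[f] * s for f, s in measured.items()) / total_weight


def label_is_capped(
    factor_scores: dict[str, Optional[float]], heavy: float = 0.25
) -> bool:
    """True when a factor at or above the assumed 'heavy' weight is unmeasured."""
    return any(s is None and WEIGHTS[f] >= heavy for f, s in factor_scores.items())


scores = {"Clean": None, "Contextual": 0.8, "Consumable": 0.6,
          "Current": 0.7, "Correlated": 0.5, "Compliant": 0.9}
score = overall_score(scores)
capped = label_is_capped(scores)
```

In this example Clean is unmeasured, so the remaining weights (0.70 in total) are rescaled and the score comes out near 0.74. Because Clean carries the largest assumed weight, the cap flag is also set, so a good-looking number alone would not earn the top labels.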

Maturity and the verdict

Maturity levels

Data-Unaware, Data-Aware, AI-Ready, AI-Optimized: a coarse label derived from the overall score and capped by the unmeasured-factor rule.

Verdict: ready

Safe for production use of the assessed workload.

Verdict: conditional

Safe for a contained pilot or limited scope only. The verdict names what is safe to ship today and the specific gaps gating production.

Verdict: not ready

Foundational gaps block even a pilot. The gaps are listed with their measured numbers.

AI recommendations

A large language model turns the scored results into a readiness narrative for a data leader: an executive summary, strengths, critical gaps, and prioritized recommendations with estimated score gains calibrated to the factor weights. Every claim must cite the measured numbers; generic advice is prohibited by the prompt. The model receives assessment results including schema identifiers (that is what makes recommendations name the exact column), never data values or credentials.

Re-scans of an unchanged schema reuse the previous analysis instead of regenerating it, and the report labels that reuse explicitly.
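The source does not say how an unchanged schema is detected. One plausible approach, shown here purely as an assumption, is to fingerprint the schema metadata and key the stored analysis on that fingerprint. The function names and the `fake_generate` stand-in for the model call are hypothetical.

```python
import hashlib
import json

_cache: dict[str, str] = {}


def schema_fingerprint(schema: dict) -> str:
    """Stable hash of schema metadata; key order does not affect the result."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def get_analysis(schema: dict, generate) -> tuple[str, bool]:
    """Return (analysis, reused). `generate` is called only on a cache miss."""
    key = schema_fingerprint(schema)
    if key in _cache:
        return _cache[key], True
    _cache[key] = generate(schema)
    return _cache[key], False


calls = []


def fake_generate(schema: dict) -> str:
    """Stand-in for the language-model call; records each invocation."""
    calls.append(1)
    return f"narrative for {len(schema['tables'])} tables"


schema = {"tables": {"orders": ["id", "customer_id"]}}
first = get_analysis(schema, fake_generate)
second = get_analysis(schema, fake_generate)
```

Returning the `reused` flag with the analysis is what would let a report label the reuse explicitly.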

Estate view

With two or more assessed schemas (Pro and above), the estate view aggregates across domains: score distribution, systemic gaps that recur across schemas, PII spread, a per-business-function rollup, and a cross-domain data-model graph inferred from shared join keys. Inferred relationships are framed as hypotheses to confirm, never asserted facts.
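To make the idea of inferring relationships from shared join keys concrete, here is a deliberately simple sketch. It assumes join keys can be recognized by an `_id` suffix and matched by exact column name; both are simplifications for the example, and the actual inference method is not described in this document.

```python
from collections import defaultdict
from itertools import combinations


def infer_links(schemas: dict[str, set[str]]) -> list[dict]:
    """Propose schema-to-schema links from key-like column names shared across schemas."""
    owners = defaultdict(set)
    for schema, columns in schemas.items():
        for col in columns:
            if col.endswith("_id"):  # assumed naming convention for join keys
                owners[col].add(schema)

    links = []
    for col, names in sorted(owners.items()):
        for a, b in combinations(sorted(names), 2):
            # Every inferred edge is marked as a hypothesis, not a confirmed fact.
            links.append({"from": a, "to": b, "key": col, "status": "hypothesis"})
    return links


schemas = {
    "sales": {"order_id", "customer_id", "amount"},
    "support": {"ticket_id", "customer_id"},
    "billing": {"invoice_id", "order_id"},
}
links = infer_links(schemas)
```

Tagging each edge with a `hypothesis` status mirrors the framing above: a shared column name suggests a relationship but does not prove one.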

Questions?

Email support@intellibricks.app. See also our security practices and the exact SQL we run.