RAG vs TAMR+
Vector similarity treats "shall ensure compliance" and "may consider compliance" as identical. In regulatory AI, that's an audit failure. Here's what we built instead.
Architecture Comparison
Why Vector Similarity Fails for Regulation
Standard RAG retrieves by similarity. TAMR+ retrieves by regulatory structure — entity relationships, causal chains, and multi-hop traversal.
5 Things RAG Gets Wrong
Standard retrieval-augmented generation was not designed for regulatory compliance.
"shall ensure" ≈ "may consider"
In embedding space, mandatory obligations and optional guidance are nearly identical vectors.
TAMR+ uses causal density signals to separate legal obligations from discretionary guidance.
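TAMR+'s causal-density scoring is not spelled out here, so the toy classifier below is only an illustration of the underlying idea: surface modality markers cleanly separate obligations from guidance in exactly the cases where embedding vectors blur them. The marker lists and function name are invented for this sketch.

```python
import re

# Deontic markers: mandatory language vs. discretionary language.
# These lists are illustrative, not TAMR+'s actual signal.
MANDATORY = re.compile(r"\b(shall|must|is required to)\b", re.IGNORECASE)
DISCRETIONARY = re.compile(r"\b(may|should|is encouraged to)\b", re.IGNORECASE)

def modality(clause: str) -> str:
    """Classify a regulatory clause by its deontic markers."""
    if MANDATORY.search(clause):
        return "obligation"
    if DISCRETIONARY.search(clause):
        return "guidance"
    return "unknown"

print(modality("Providers shall ensure compliance"))  # → obligation
print(modality("Providers may consider compliance"))  # → guidance
```

A nearest-neighbor search over embeddings would score these two clauses as near-duplicates; a lexical deontic signal keeps them apart.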
Single-Hop Retrieval
Regulations reference other regulations. Article 9 requires Article 13 transparency. RAG finds one chunk.
TAMR+ follows the chain — up to 3 hops with decay weighting [1.0, 0.5, 0.25].
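As a sketch of what decay-weighted traversal could look like (the graph format and function names are illustrative, not TAMR+'s API), a breadth-first walk over a reference graph applies the [1.0, 0.5, 0.25] weights per hop, with the seed chunk assumed to carry the first weight:

```python
from collections import deque

# Decay weights from the article: seed at full weight, each hop halves it.
DECAY = [1.0, 0.5, 0.25]

def expand_hops(seed_ids, references, max_hops=3):
    """Breadth-first traversal of a regulation reference graph.

    seed_ids:   chunk ids returned by the initial retrieval
    references: dict mapping a chunk id to the chunk ids it cites
    Returns {chunk_id: weight}, keeping the highest weight on revisit.
    """
    weights = {cid: DECAY[0] for cid in seed_ids}
    frontier = deque((cid, 0) for cid in seed_ids)
    while frontier:
        cid, hop = frontier.popleft()
        if hop + 1 >= max_hops:
            continue  # traversal depth exhausted
        for ref in references.get(cid, ()):
            w = DECAY[hop + 1]
            if weights.get(ref, 0.0) < w:
                weights[ref] = w
                frontier.append((ref, hop + 1))
    return weights

# Article 9 cites Article 13; Article 13 cites Article 51.
refs = {"art9": ["art13"], "art13": ["art51"]}
print(expand_hops(["art9"], refs))
# → {'art9': 1.0, 'art13': 0.5, 'art51': 0.25}
```

The decay keeps distant hops from drowning out the directly retrieved text: a chunk two references away contributes at most a quarter of a direct hit.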
Opaque Scores
RAG gives you an answer with no confidence signal. How much should you trust it?
TAMR+ provides 5-dimension TRACE scores mapped to specific EU AI Act articles.
No Audit Trail
Article 51 requires technical documentation. RAG pipelines are stateless — no lineage.
TAMR+ produces SHA-256 chained audit trails for every retrieval decision.
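A minimal sketch of such a hash chain, assuming a JSON log where each entry commits to the SHA-256 of its predecessor (the field names are illustrative, not TAMR+'s record schema):

```python
import hashlib
import json

def append_entry(chain, decision):
    """Append a retrieval decision to a hash-chained audit log.

    Each entry stores the SHA-256 of the previous entry, so tampering
    with any earlier record invalidates every hash after it.
    """
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"decision": decision, "prev": prev_hash}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    chain.append({**body, "hash": digest})
    return chain

def verify(chain):
    """Recompute every hash; True only if the chain is untampered."""
    prev = "0" * 64
    for entry in chain:
        body = {"decision": entry["decision"], "prev": entry["prev"]}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if entry["prev"] != prev or recomputed != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, {"query": "Art. 9 obligations", "chunks": ["art9", "art13"]})
append_entry(log, {"query": "transparency", "chunks": ["art13"]})
print(verify(log))  # → True
log[0]["decision"]["chunks"] = ["art9"]  # tamper with an early record
print(verify(log))  # → False
```

Because no LLM sits in the loop, each log entry is deterministic: replaying the same query against the same workspace reproduces the same decisions and the same hashes.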
Expensive at Scale
Most RAG systems cost $0.50–$12.00 per complex query, with LLM calls at every stage.
TAMR+ runs at $0.03/workspace. Zero LLM calls during retrieval. 207ms latency.
Benchmark Results
250 regulatory questions across 4 domains. Open-source. Apache 2.0.
| System | EU-RegQA | MedRegQA | FinRegQA | CrimNet | Avg |
|---|---|---|---|---|---|
| TAMR+ v2.3 (3-hop) | 0.74 | 0.69 | 0.66 | 0.63 | 0.680 |
| TAMR+ v2.3 (1-hop) | 0.67 | 0.63 | 0.61 | 0.59 | 0.625 |
| GraphCompliance | 0.554 | — | — | — | 0.554 |
| Vector-only RAG | 0.41 | 0.38 | 0.39 | 0.36 | 0.385 |
Ablation: removing any single component degrades performance by 6-27%. Vector-only scores 38.8% below full pipeline (p<0.001).
The Gap Is the Product
A 67% score with full gap attribution tells a compliance officer exactly what to fix. A 95% score with no explanation tells them nothing.
- Small document workspace. Fix: add more regulatory sources.
- LLM fills gaps from training data. Fix: add domain-specific documents.
- Formatting over evidence. Fix: improve citation density.
- Regulatory vocabulary precision. Fix: glossary expansion.
- Irreducible system floor (3% system-wide). Disclosed per Art. 13.
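As a rough sketch of how additive gap attribution works, the distance from a perfect score decomposes into named, actionable pieces plus the 3% floor. Only the 3% floor and the idea of per-category attribution come from the text; the category keys and per-category numbers below are invented for the example.

```python
def attribute_gaps(score, gaps, floor=0.03):
    """Explain 1 - score as named category gaps plus a fixed system floor.

    Returns a human-readable report and the unexplained residual
    (0.0 when the attribution fully accounts for the missing points).
    """
    residual = round(1.0 - score - floor - sum(gaps.values()), 6)
    report = {cat: f"{g:.0%}" for cat, g in gaps.items()}
    report["system floor"] = f"{floor:.0%}"
    return report, residual

# Hypothetical breakdown of a 67% score (numbers invented).
gaps = {
    "workspace coverage": 0.12,    # fix: add more regulatory sources
    "training-data fill": 0.08,    # fix: domain-specific documents
    "citation density": 0.06,      # fix: improve citation density
    "vocabulary precision": 0.04,  # fix: glossary expansion
}
report, residual = attribute_gaps(0.67, gaps)
print(report)
print(residual)  # → 0.0 when the gaps fully explain the score
```

The point of the additive form is that each category maps to one remediation: a compliance officer reads the largest entry first and acts on its fix.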
Framework Comparison
How TAMR+ compares to existing evaluation and retrieval frameworks.
| Feature | TAMR+ | RAGAS | DeepEval | COMPL-AI | GraphComp. |
|---|---|---|---|---|---|
| Gap Attribution | 5 categories | No | No | No | No |
| Predictive Gaps | Yes | No | No | No | No |
| Formula-Based (No ML) | Yes | No | No | Partial | Partial |
| EU AI Act Mapping | 8/8 articles | 0/8 | 0/8 | 3/8 | 0/8 |
| Cross-Domain | 4 domains | N/A | N/A | 1 | 1 |
| Audit Trail | Yes (Art. 51) | No | No | No | No |
| Production Deployed | Yes | N/A | N/A | No | No |
| Open Benchmark | 250 Qs | No | No | No | No |
Run Your System Against Our Benchmark
250 regulatory questions, 4 domains, Apache 2.0. Download the benchmark, run your RAG system, and compare.