PMAT
Zero-configuration AI context generation for any codebase
What is PMAT?
PMAT (Pragmatic Multi-language Agent Toolkit) provides everything needed to analyze code quality and generate AI-ready context:
- Context Generation - Deep analysis for Claude, GPT, and other LLMs
- Technical Debt Grading - A+ through F scoring with 6 orthogonal metrics
- Mutation Testing - Test suite quality validation (85%+ kill rate)
- Repository Scoring - Quantitative health assessment (0-211 scale)
- Git History RAG - Semantic search across commit history with Reciprocal Rank Fusion (RRF)
- Semantic Search - Natural language code discovery
- Compliance Checks - CB-500 Rust and CB-600 Lua best practices detection
- MCP Integration - 19 tools for Claude Code, Cline, and AI agents
- Quality Gates - Pre-commit hooks, CI/CD integration
- 18+ Languages - Rust, TypeScript, Python, Go, Java, C/C++, Lua, and more
Part of the PAIML Stack, following Toyota Way quality principles (Jidoka, Genchi Genbutsu, Kaizen).
Annotated Code Search
pmat query "cache invalidation" --churn --duplicates --entropy --faults
Every result includes TDG grade, Big-O complexity, git churn, code clones, pattern diversity, fault annotations, call graph, and syntax-highlighted source.
Getting Started
Add to your system:
# Install from crates.io
cargo install pmat
# Or from source (latest)
git clone https://github.com/paiml/paiml-mcp-agent-toolkit
cd paiml-mcp-agent-toolkit && cargo install --path server
Basic Usage
# Generate AI-ready context
pmat context --output context.md --format llm-optimized
# Analyze code complexity
pmat analyze complexity
# Grade technical debt (A+ through F)
pmat analyze tdg
# Score repository health
pmat repo-score .
# Run mutation testing
pmat mutate --target src/
MCP Server Mode
# Start MCP server for Claude Code, Cline, etc.
pmat mcp
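For orientation, MCP clients and servers exchange JSON-RPC 2.0 messages. The sketch below builds the `tools/list` request a client might send after the initialization handshake. It assumes the stdio transport with newline-delimited JSON, which this README does not state, and the helper name `jsonrpc_request` is hypothetical.

```python
import json

def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request as one newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"

# Ask the server which tools it exposes (sent after the MCP initialize handshake).
list_tools = jsonrpc_request(1, "tools/list")
```

In practice an MCP-aware client such as Claude Code or Cline handles this exchange, so the sketch is only useful for seeing what travels over the wire.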
Features
Context Generation
Generate comprehensive context for AI assistants:
pmat context # Basic analysis
pmat context --format llm-optimized # AI-optimized output
pmat context --include-tests # Include test files
Technical Debt Grading (TDG)
Six orthogonal metrics for accurate quality assessment:
pmat analyze tdg # Project-wide grade
pmat analyze tdg --include-components # Per-component breakdown
pmat tdg baseline create # Create quality baseline
pmat tdg check-regression # Detect quality degradation
Grading Scale:
- A+/A: Excellent quality, minimal debt
- B+/B: Good quality, manageable debt
- C+/C: Needs improvement
- D/F: Significant technical debt
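A minimum-grade gate such as `--min-grade B` implies an ordering over these letters. The following is a minimal sketch of that comparison, assuming the grades rank exactly in the order the scale lists them. The helper `meets_min_grade` is hypothetical and is not PMAT's implementation.

```python
# Grades from best to worst, in the order the grading scale lists them.
GRADES = ["A+", "A", "B+", "B", "C+", "C", "D", "F"]

def meets_min_grade(grade, min_grade):
    """Return True when `grade` is at least as good as `min_grade`."""
    return GRADES.index(grade) <= GRADES.index(min_grade)
```

For example, `meets_min_grade("B+", "B")` holds while `meets_min_grade("C+", "B")` does not.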
Mutation Testing
Validate test suite effectiveness:
pmat mutate --target src/lib.rs # Single file
pmat mutate --target src/ --threshold 85 # Quality gate
pmat mutate --failures-only # CI optimization
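The threshold in these commands refers to the mutation kill rate. As a rough sketch of the arithmetic, assuming the score is simply killed mutants divided by all generated mutants (PMAT may treat timeouts or unviable mutants differently), with hypothetical function names:

```python
def mutation_score(killed, total):
    """Percentage of generated mutants that the test suite detected."""
    if total == 0:
        raise ValueError("no mutants were generated")
    return 100.0 * killed / total

def passes_gate(killed, total, threshold=85.0):
    """True when the kill rate meets or exceeds the threshold."""
    return mutation_score(killed, total) >= threshold
```

With 200 mutants, killing 170 just clears an 85% gate and killing 169 does not.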
Supported Languages: Rust, Python, TypeScript, JavaScript, Go, C++, Lua, Java, Kotlin, Ruby, Swift, C, SQL, Scala, YAML, and Markdown, plus MLOps model formats (GGUF, SafeTensors, APR)
Repository Health Scoring
Evidence-based quality metrics (0-211 scale):
pmat rust-project-score # Fast mode (~3 min)
pmat rust-project-score --full # Comprehensive (~10-15 min)
pmat repo-score . --deep # Full git history
Workflow Prompts
Pre-configured AI prompts enforcing EXTREME TDD:
pmat prompt --list # Available prompts
pmat prompt code-coverage # 85%+ coverage enforcement
pmat prompt debug # Five Whys analysis
pmat prompt quality-enforcement # All quality gates
Git History RAG
Search git history by intent using TF-IDF semantic embeddings:
# Fuse git history into code search
pmat query "fix memory leak" -G
# Search with churn, clones, entropy, faults
pmat query "error handling" --churn --duplicates --entropy --faults
# Run the example
cargo run --example git_history_demo
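The feature list mentions RRF for combining git history with code search. The sketch below shows the general Reciprocal Rank Fusion technique, not PMAT's code: each result scores the sum of `1 / (k + rank)` across the rankings that contain it. The constant `k=60` is the value commonly used in the literature and is an assumption here, as are the file names.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; ties break alphabetically for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

code_hits = ["cache.rs", "pool.rs", "alloc.rs"]        # hypothetical code-search ranking
commit_hits = ["alloc.rs", "cache.rs", "leak_fix.rs"]  # hypothetical git-history ranking
fused = reciprocal_rank_fusion([code_hits, commit_hits])
```

Items that rank well in both lists rise to the top, so `cache.rs` and `alloc.rs` come ahead of results that appear in only one list.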
Git Hooks
Automatic quality enforcement:
pmat hooks install # Install pre-commit hooks
pmat hooks install --tdg-enforcement # With TDG quality gates
pmat hooks status # Check hook status
Examples
Generate Context for AI
# For Claude Code
pmat context --output context.md --format llm-optimized
# With semantic search
pmat embed sync ./src
pmat semantic search "error handling patterns"
CI/CD Integration
# .github/workflows/quality.yml
name: Quality Gates
on: [push, pull_request]
jobs:
  quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install pmat
      - run: pmat analyze tdg --fail-on-violation --min-grade B
      - run: pmat mutate --target src/ --threshold 80
Quality Baseline Workflow
# 1. Create baseline
pmat tdg baseline create --output .pmat/baseline.json
# 2. Check for regressions
pmat tdg check-regression \
  --baseline .pmat/baseline.json \
  --max-score-drop 5.0 \
  --fail-on-regression
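Conceptually, the regression check compares the current score with the stored baseline and fails when the drop exceeds the allowed budget. Here is a minimal sketch of that rule, assuming scores are plain numbers and that a drop exactly equal to the budget still passes. Both points are assumptions, and the function names are hypothetical.

```python
def score_drop(baseline_score, current_score):
    """How far the score fell from the baseline (0.0 if it did not fall)."""
    return max(0.0, baseline_score - current_score)

def is_regression(baseline_score, current_score, max_score_drop=5.0):
    """True when the drop exceeds the allowed budget."""
    return score_drop(baseline_score, current_score) > max_score_drop
```

An improvement never counts as a regression under this rule, because the drop is clamped at zero.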
Architecture
pmat/
├── server/                  CLI and MCP server
│   ├── src/
│   │   ├── cli/             Command handlers
│   │   ├── services/        Analysis engines
│   │   ├── mcp/             MCP protocol
│   │   └── tdg/             Technical Debt Grading
├── crates/
│   └── pmat-dashboard/      Pure WASM dashboard
└── docs/
    └── specifications/      Technical specs
Quality
| Metric | Value |
|---|---|
| Tests | 20,700+ passing |
| Coverage | 99.66% |
| Mutation Score | >80% |
| Languages | 22+ supported (incl. SQL, Scala, YAML, Markdown, MLOps models) |
| MCP Tools | 19 available |
Falsifiable Quality Commitments
Per Popper's demarcation criterion, all claims are measurable and testable:
| Commitment | Threshold | Verification Method |
|---|---|---|
| Context Generation | < 5 seconds for 10K LOC project | time pmat context on test corpus |
| Memory Usage | < 500 MB for 100K LOC analysis | Measured via heaptrack in CI |
| Test Coverage | ≥ 85% line coverage | cargo llvm-cov (CI enforced) |
| Mutation Score | ≥ 80% killed mutants | pmat mutate --threshold 80 |
| Build Time | < 3 minutes incremental | cargo build --timings |
| CI Pipeline | < 15 minutes total | GitHub Actions workflow timing |
| Binary Size | < 50 MB release binary | ls -lh target/release/pmat |
| Language Parsers | All 22+ languages parse without panic | Fuzz testing in CI |
How to Verify:
# Run self-assessment with Popper Falsifiability Score
pmat popper-score --verbose
# Individual commitment verification
cargo llvm-cov --html # Coverage ≥85%
pmat mutate --threshold 80 # Mutation ≥80%
cargo build --timings # Build time <3min
Failure = Regression: Any commitment violation blocks CI merge.
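One way to picture the "Failure = Regression" rule is as a table-driven check of measured values against the thresholds above. This is an illustrative sketch only: the metric key names are hypothetical, only a subset of the commitments is encoded, and how the CI actually collects and evaluates measurements is not described in this README.

```python
import operator

# (name, comparison, threshold) taken from the commitments table; units as listed there.
COMMITMENTS = [
    ("context_seconds_10k_loc", operator.lt, 5.0),
    ("memory_mb_100k_loc", operator.lt, 500.0),
    ("line_coverage_percent", operator.ge, 85.0),
    ("mutation_score_percent", operator.ge, 80.0),
    ("binary_size_mb", operator.lt, 50.0),
]

def violations(measured):
    """Return the names of commitments that the measured values break."""
    return [name for name, compare, limit in COMMITMENTS
            if name in measured and not compare(measured[name], limit)]
```

A CI step could then fail the build whenever `violations(...)` returns a non-empty list.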
Benchmark Results (Statistical Rigor)
All benchmarks use Criterion.rs with proper statistical methodology:
| Operation | Mean | 95% CI | Std Dev | Sample Size |
|---|---|---|---|---|
| Context (1K LOC) | 127ms | [124, 130] | ±12.3ms | n=1000 runs |
| Context (10K LOC) | 1.84s | [1.79, 1.90] | ±156ms | n=500 runs |
| TDG Scoring | 156ms | [148, 164] | ±18.2ms | n=500 runs |
| Complexity Analysis | 23ms | [22, 24] | ±3.1ms | n=1000 runs |
Comparison Baselines (vs. Alternatives):
| Metric | PMAT | ctags | tree-sitter | Effect Size |
|---|---|---|---|---|
| 10K LOC parsing | 1.84s | 0.3s | 0.8s | d=0.72 (medium) |
| Memory (10K LOC) | 287MB | 45MB | 120MB | - |
| Semantic depth | Full | Syntax only | AST only | - |
See docs/BENCHMARKS.md for complete statistical analysis.
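The "Effect Size" column uses the symbol `d`, which suggests Cohen's d, though the README does not name the formula. As a reference sketch under that assumption, the standardized mean difference with a pooled standard deviation, plus Cohen's conventional bands, can be computed as follows:

```python
import math
import statistics

def cohens_d(sample_a, sample_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd

def effect_label(d):
    """Cohen's conventional bands: 0.2 small, 0.5 medium, 0.8 large."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"
```

Under these bands, a value of 0.72 falls in the "medium" range, matching the label in the table.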
ML/AI Reproducibility
PMAT uses ML for semantic search and embeddings. All ML operations are reproducible:
Random Seed Management:
- Embedding generation uses fixed seed (SEED=42) for deterministic outputs
- Clustering operations use fixed seed (SEED=12345)
- Seeds documented in docs/ml/REPRODUCIBILITY.md
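The effect of a fixed seed can be illustrated with a generic pseudo-random step. The sketch below is a stand-in for any stochastic operation and is not PMAT's embedding code; `seeded_vector` is a hypothetical helper.

```python
import random

EMBEDDING_SEED = 42  # seed value quoted in the list above

def seeded_vector(dim, seed=EMBEDDING_SEED):
    """Draw `dim` pseudo-random floats from a generator private to this call."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]
```

Two calls with the same seed return identical vectors, which is the property that makes seeded runs repeatable.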
Model Artifacts:
- Pre-trained models from HuggingFace (all-MiniLM-L6-v2)
- Model versions pinned in Cargo.toml
- Hash verification on download
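Hash verification on download generally means comparing a digest of the received bytes with a pinned value. The README does not say which hash function is used, so this sketch assumes SHA-256, and the helper name `verify_sha256` is hypothetical.

```python
import hashlib
import hmac

def verify_sha256(data, expected_hex):
    """Return True when the SHA-256 digest of `data` matches `expected_hex`."""
    actual_hex = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual_hex, expected_hex.lower())
```

A downloader would call this on the fetched model bytes and refuse to load the file when it returns `False`.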
Dataset Sources
PMAT does not train models but uses these data sources for evaluation:
| Dataset | Source | Purpose | Size |
|---|---|---|---|
| CodeSearchNet | GitHub/Microsoft | Semantic search benchmarks | 2M functions |
| PMAT-bench | Internal | Regression testing | 500 queries |
Data provenance and licensing documented in docs/ml/REPRODUCIBILITY.md.
Sovereign Stack
PMAT is built on the PAIML Sovereign Stack - pure-Rust, SIMD-accelerated libraries:
| Library | Purpose | Version |
|---|---|---|
| aprender | ML library (text similarity, clustering, topic modeling) | 0.25.4 |
| trueno | SIMD compute library for matrix operations | 0.14.5 |
| trueno-graph | GPU-first graph database (PageRank, Louvain, CSR) | 0.1.14 |
| trueno-rag | RAG pipeline with VectorStore | 0.1.12 |
| trueno-db | Embedded analytics database | 0.3.13 |
| trueno-viz | Terminal graph visualization | 0.1.23 |
| trueno-zram-core | SIMD LZ4/ZSTD compression (optional) | 0.3.0 |
| pmat | Code analysis toolkit | 3.0.7 |
Key Benefits:
- Pure Rust (no C dependencies, no FFI)
- SIMD-first (AVX2, AVX-512, NEON auto-detection)
- 2-4x speedup on graph algorithms via aprender adapter
Documentation
- PMAT Book - Complete guide
- API Reference - Rust API docs
- MCP Tools - MCP integration guide
- Specifications - Technical specs
- 🤖 Coursera Hugging Face AI Development Specialization - Build Production AI systems with Hugging Face in Pure Rust
License
MIT License - see LICENSE for details.