Confidence Harness

Small fixtures, seeded failures, real-repo scale.

Veritas uses layered checks so changes can be tested quickly during development and measured against larger real-world repositories when scale matters.

large scan: 758
inventory mutants: 555
source files: 52
languages: 3

Local Test Layers

Unit Tests

Cover parser helpers, report rendering, changed-target selection, cleanup, package scoping, mutation candidates, and artifact rendering.

Tiny Fixtures

Fast CLI smoke projects for Rust, Go, Python, and TypeScript/JavaScript keep plugin contracts honest.

Medium Fixtures

Rust workspaces and Go multimodule projects exercise package-local artifacts, reverse dependencies, fuzz discovery, and symbol graphs.

Seeded Examples

Purpose-built projects with hidden assumptions validate mutation, fuzzing, replay, assertions, and evolution behavior.

Benchmark Suites

veritas bench copies cases, runs verification, and scores expected findings, commands, artifacts, and thresholds.
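
To make the scoring step concrete, here is a minimal sketch of how one benchmark case could be scored against its expectations. It is not the veritas bench implementation: the names `CaseExpectation`, `score_case`, and `min_score`, and the rule "fraction of expected items observed", are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CaseExpectation:
    """Hypothetical description of what one benchmark case should produce."""
    findings: set = field(default_factory=set)
    commands: set = field(default_factory=set)
    artifacts: set = field(default_factory=set)
    min_score: float = 1.0  # assumed threshold: fraction of expected items required


def score_case(expected, observed):
    """Return (score, passed) for one case.

    `observed` maps a category name to the set of items seen in the run.
    The score is the fraction of expected items that were observed.
    """
    wanted = 0
    matched = 0
    for category in ("findings", "commands", "artifacts"):
        want = getattr(expected, category)
        got = observed.get(category, set())
        wanted += len(want)
        matched += len(want & got)
    # A case with no expectations is treated as trivially satisfied.
    score = matched / wanted if wanted else 1.0
    return score, score >= expected.min_score
```

Unexpected extra items are ignored in this sketch; a stricter scorer could penalize them as well.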

External Canaries

Pinned public repos check real-world discovery and verification drift on manual or scheduled workflows.

Required Checks

cargo fmt --all -- --check
cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings
VERITAS_RELEASE_ALLOW_DIRTY=1 ./scripts/publish-crates.sh --dry-run
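
One possible way to run the four checks above locally is a small wrapper that executes them in order and stops at the first failure. The commands and the environment variable come from the list above; the function name `run_required_checks`, the injectable `run` parameter, and the stop-at-first-failure behavior are assumptions, not part of the project.

```python
import os
import shlex
import subprocess

# (extra environment, command) pairs, copied from the required checks above.
REQUIRED_CHECKS = [
    ({}, "cargo fmt --all -- --check"),
    ({}, "cargo test --workspace"),
    ({}, "cargo clippy --workspace --all-targets -- -D warnings"),
    ({"VERITAS_RELEASE_ALLOW_DIRTY": "1"}, "./scripts/publish-crates.sh --dry-run"),
]


def run_required_checks(run=subprocess.run, base_env=None):
    """Run each check in order; return the first failing command, or None."""
    base_env = dict(os.environ if base_env is None else base_env)
    for extra_env, command in REQUIRED_CHECKS:
        # Merge the per-command variables over the inherited environment.
        result = run(shlex.split(command), env={**base_env, **extra_env})
        if result.returncode != 0:
            return command
    return None
```

Calling `run_required_checks()` from the repository root returns `None` when every check passes. The `run` parameter exists only so the wrapper can be exercised without invoking cargo.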

Large-Repo Benchmarks

Use this lane when the question is scale rather than fixture correctness.

./scripts/run-large-repo-benchmarks.py --manifest benchmarks/large-repos.toml --mode scan
./scripts/run-large-repo-benchmarks.py --manifest benchmarks/large-repos.toml --mode mutation-list
./scripts/run-large-repo-benchmarks.py --manifest benchmarks/large-repos.toml --mode mutation-inventory
./scripts/run-large-repo-benchmarks.py --manifest benchmarks/large-repos.toml --mode changed-only
./scripts/run-large-repo-benchmarks.py --manifest benchmarks/large-repos.toml --mode all

Scan

Measures Tree-sitter and project discovery, plus target counts, across pinned Rust, Go, and Python repositories; TypeScript/JavaScript repositories are planned.

Mutation List

A fast, capped sample of mutation candidates for the current target. This is the cheap smoke path.

Mutation Inventory

Walks discovered source files, previews bounded candidates per file, dedupes mutant IDs, and reports cap-hit paths plus domain/operator distribution.
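
The inventory pass described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Veritas code: `Candidate`, `Inventory`, `build_inventory`, and `per_file_cap` are hypothetical names, and a path is assumed to count as a cap hit when it has more candidates than the per-file cap allows.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Candidate:
    """Hypothetical mutation candidate previewed for a source file."""
    mutant_id: str
    operator: str
    domain: str


@dataclass
class Inventory:
    mutants: dict = field(default_factory=dict)        # mutant_id -> Candidate
    cap_hit_paths: list = field(default_factory=list)  # files truncated by the cap
    operators: Counter = field(default_factory=Counter)
    domains: Counter = field(default_factory=Counter)


def build_inventory(files, per_file_cap):
    """Walk files, keep at most per_file_cap candidates each, dedupe by ID.

    `files` maps a source path to its list of previewed candidates.
    """
    inventory = Inventory()
    for path, candidates in files.items():
        if len(candidates) > per_file_cap:
            inventory.cap_hit_paths.append(path)
        for candidate in candidates[:per_file_cap]:
            if candidate.mutant_id in inventory.mutants:
                continue  # the same mutant ID was already counted
            inventory.mutants[candidate.mutant_id] = candidate
            inventory.operators[candidate.operator] += 1
            inventory.domains[candidate.domain] += 1
    return inventory
```

Counting the distributions only after deduplication keeps a mutant that is reachable from two files from inflating the operator and domain totals.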

Score And Anti-Gaming

veritas score rewards mutation score, property/fuzz/replay signal, assertion candidates, and budget health. It penalizes active findings, surviving correctness mutants, skipped commands, and timeouts.
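
As a rough illustration of how rewards and penalties of this kind can combine, the sketch below computes a score clamped to a 0 to 100 range. The signal names mirror the paragraph above, but every weight, the cap on assertion candidates, the 0 to 100 scale, and the names `Signals` and `confidence_score` are assumptions; the actual veritas score formula is not given here.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    mutation_score: float              # fraction of mutants killed, 0.0 to 1.0
    dynamic_signal: float              # property/fuzz/replay signal, 0.0 to 1.0
    assertion_candidates: int
    budget_health: float               # 0.0 to 1.0
    active_findings: int
    surviving_correctness_mutants: int
    skipped_commands: int
    timeouts: int


def confidence_score(s):
    """Combine rewards and penalties with hypothetical weights."""
    rewards = (
        50.0 * s.mutation_score
        + 20.0 * s.dynamic_signal
        + 1.0 * min(s.assertion_candidates, 10)  # capped so it cannot dominate
        + 20.0 * s.budget_health
    )
    penalties = (
        5.0 * s.active_findings
        + 3.0 * s.surviving_correctness_mutants
        + 2.0 * s.skipped_commands
        + 2.0 * s.timeouts
    )
    # Clamp so heavy penalties bottom out at zero instead of going negative.
    return max(0.0, min(100.0, rewards - penalties))
```

Because skipped commands and timeouts subtract from the total, skipping a slow check can only lower the result in this sketch, which is the anti-gaming property the section title refers to.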

veritas score --mode all includes strict and verified views so CI can reject confidence gained through ignored findings, skipped commands, or unpromoted generated proof.
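
A CI job could gate on these views along the following lines. The sketch assumes the per-view scores are already available as a plain mapping such as `{"strict": 71.0, "verified": 88.0}`; how veritas score reports them is not specified here, and `gate_views` and `minimum` are hypothetical names.

```python
REQUIRED_VIEWS = ("strict", "verified")


def gate_views(scores, minimum):
    """Return the required views that are missing or score below minimum."""
    failures = []
    for view in REQUIRED_VIEWS:
        # A missing view is treated as a failure rather than silently passing.
        if view not in scores or scores[view] < minimum:
            failures.append(view)
    return failures
```

A pipeline would fail the build when the returned list is non-empty, so a high default score cannot mask a low strict or verified one.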