> verification log
Transparency log. Every price, benchmark, and verdict change on this site -- when we verified it and what changed. We do not remove unfavourable results; we add context.
Correction policy
If a vendor believes we have published incorrect pricing or feature data, we publish their response verbatim alongside our original data. We do not remove unfavourable results -- we add context. Benchmarks are re-run quarterly; pricing is re-verified monthly; comparison verdicts are reviewed quarterly.
> recheck schedule
Monthly: Pricing data on all tools (12 tools x 6 fields)
Quarterly: Benchmark re-runs, comparison verdicts, feature matrix
Bi-annually: Category overview, tool taxonomy
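The schedule above can be sketched as a small due-date calculation. This is an illustrative sketch only, not the site's actual tooling: the names `RECHECK_MONTHS`, `add_months`, and `next_recheck` are hypothetical, "bi-annually" is assumed to mean every six months, and the day of month is clamped to 28 as a simplification so every result is a valid date.

```python
from datetime import date

# Cadences from the recheck schedule, in months.
# Assumption: "bi-annually" is read as every six months.
RECHECK_MONTHS = {
    "pricing": 1,              # Monthly
    "benchmarks": 3,           # Quarterly
    "comparison verdicts": 3,  # Quarterly
    "feature matrix": 3,       # Quarterly
    "category overview": 6,    # Bi-annually (assumed: twice a year)
    "tool taxonomy": 6,        # Bi-annually (assumed: twice a year)
}


def add_months(start: date, months: int) -> date:
    """Shift `start` forward by `months`, clamping the day to 28
    so the result exists in every month (a deliberate simplification)."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    return date(year, month_zero_based + 1, min(start.day, 28))


def next_recheck(last_verified: date, item: str) -> date:
    """Return the date on which `item` is next due for re-verification."""
    return add_months(last_verified, RECHECK_MONTHS[item])


if __name__ == "__main__":
    launch = date(2026, 4, 21)  # wave 1 publish date from this log
    for item in RECHECK_MONTHS:
        print(f"{item}: next recheck due {next_recheck(launch, item)}")
```

Under these assumptions, data verified on 2026-04-21 would be due again on 2026-05-21 (pricing), 2026-07-21 (quarterly items), and 2026-10-21 (bi-annual items).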
> wave 1 shipped (2026-04-21)
20 pages
+ /
+ /category-overview
+ /tool-comparison
+ /pricing-comparison
+ /benchmarks
+ /llm-test-automation
+ /playwright-ai
+ /unit-test-generation
+ /self-healing-tests
+ /ai-qa
+ /test-case-generation
+ /compare/testrigor-vs-mabl
+ /compare/testrigor-vs-functionize
+ /compare/mabl-vs-testim
+ /compare/meticulous-vs-momentic
+ /compare/diffblue-alternatives
+ /compare/diffblue-vs-copilot
+ /compare/qa-wolf-vs-mabl
+ /faq
+ /log
Deferred to wave 2:
-- /cypress-ai
-- /selenium-ai
-- /e2e-test-ai
-- /ci-integration
-- /glossary
-- /pricing-comparison (expanded deep-dives)
-- Additional compare pages if warranted
> change log
2026-04-21 /: Initial publish. Homepage with seven-tool benchmark strip (preliminary data), tool-by-job matrix (36 cells), decision flow, and FAQ.
2026-04-21 /tool-comparison: Initial publish. 12-tool feature matrix, lock-in scorecard, per-tool verdict cards. Data sourced from public documentation and vendor trial access.
2026-04-21 /pricing-comparison: Initial publish. Pricing data sourced from public pricing pages, G2 reports, and customer discussions. Mabl and Momentic estimated from industry sources (no public pricing). All data flagged as April 2026.
2026-04-21 /benchmarks: Initial publish. Methodology complete. Early Diffblue Cover results on spring-petclinic-rest (91% mutation score) and Qodo results on django-oscar (76% mutation score). QA Wolf, testRigor, Momentic evaluations in progress. Full results expected late April 2026.
2026-04-21 /category-overview: Initial publish. Four-quadrant taxonomy, historical timeline (2022-2026).
2026-04-21 /llm-test-automation: Initial publish. Five-level capability ladder (original framing). Tool-to-level mapping table. Playwright MCP deep-dive.
2026-04-21 /playwright-ai: Initial publish. Stack anatomy (Copilot + MCP + Healer + TestDino). Setup walkthrough. Failure modes.
2026-04-21 /unit-test-generation: Initial publish. Three-paradigm comparison (RL vs LLM vs hybrid). Evaluation methodology. Academic ceiling references (MuTAP 93.57%, MutGen 89.5%).
2026-04-21 /self-healing-tests: Initial publish. Three-identifier model. Six-tool verdict cards.
2026-04-21 /ai-qa: Initial publish. Five AI-touched QA workflows. What AI cannot do in QA.
2026-04-21 /test-case-generation: Initial publish. Eight-tool rundown. Input type matrix. Common failure modes.
2026-04-21 /compare/testrigor-vs-mabl: Initial publish. Eight-row feature delta table. Verdict.
2026-04-21 /compare/testrigor-vs-functionize: Initial publish. Verdict: testRigor wins for new evaluations.
2026-04-21 /compare/mabl-vs-testim: Initial publish. Verdict: Mabl for standalone, Testim if already in Tricentis ecosystem.
2026-04-21 /compare/meticulous-vs-momentic: Initial publish. Clarification that these tools solve different problems.
2026-04-21 /compare/diffblue-alternatives: Initial publish. Five-tool alternative matrix with mutation scores.
2026-04-21 /compare/diffblue-vs-copilot: Initial publish. Benchmark data: Diffblue 91% vs Copilot 74% mutation score on JVM.
2026-04-21 /compare/qa-wolf-vs-mabl: Initial publish. Lock-in scorecard comparison (QA Wolf 5/5, Mabl 3/5). Verdict.
2026-04-21 /faq: Initial publish. 27 questions across 5 sections.
2026-04-21 /log: This page. Wave 1 launch: 20 pages shipped (homepage + 4 core pillars + 6 topic pages + 7 compare pages + faq + log).