Question 1

What are the four quadrants of the AI testing category?

Accepted Answer

The four quadrants are agentic E2E test authoring (QA Wolf, Momentic, testRigor), LLM unit-test generation (Diffblue Cover, Qodo, GitHub Copilot), self-healing locator maintenance (Mabl, Testim, Functionize, Rainforest QA), and visual-trace capture (Meticulous, Applitools Autonomous). As of 2026 the boundaries are blurring as mature tools add capabilities from adjacent quadrants.

Question 2

How is the 2026 agentic wave different from 2024 self-healing?

Accepted Answer

Self-healing in 2024 was reactive: when a test broke, the tool found an alternative locator and patched it. The 2026 agentic wave is proactive and compositional: tools like QA Wolf plan a test strategy from a goal, write the Playwright code, run it, observe failures, and repair them without human input. The test artifact changes from a static script (maintained by humans) to a live plan (maintained by the agent).
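The contrast between the two approaches can be sketched in a few lines of Python. This is an illustrative toy, not any vendor's implementation: the page is modelled as a plain dictionary rather than a real browser, and `StoredStep`, `self_heal`, `agentic_run`, and the `plan`/`write`/`run`/`repair` callables are hypothetical names standing in for whatever a real tool does internally.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class StoredStep:
    """One recorded test step: a primary locator plus fallback candidates."""
    name: str
    locator: str
    fallbacks: List[str] = field(default_factory=list)


def find(page: Dict[str, str], locator: str) -> Optional[str]:
    """Stand-in for a browser lookup: the page is just a locator -> text map."""
    return page.get(locator)


def self_heal(page: Dict[str, str], step: StoredStep) -> bool:
    """Reactive repair: does nothing unless the stored locator has broken."""
    if find(page, step.locator) is not None:
        return False  # nothing broke, so nothing to heal
    for candidate in step.fallbacks:
        if find(page, candidate) is not None:
            step.locator = candidate  # patch the static script in place
            return True
    raise LookupError(f"no working locator for step {step.name!r}")


def agentic_run(
    goal: str,
    plan: Callable[[str], List[str]],
    write: Callable[[List[str]], str],
    run: Callable[[str], Tuple[bool, str]],
    repair: Callable[[str, str], str],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Proactive loop: plan from a goal, write a test, run it, repair on failure."""
    test = write(plan(goal))
    for attempt in range(1, max_attempts + 1):
        passed, observation = run(test)
        if passed:
            return test, attempt
        test = repair(test, observation)  # the agent, not a human, edits the test
    raise RuntimeError(f"goal {goal!r} not met after {max_attempts} attempts")
```

In the self-healing function the script already exists and only its locator is patched after a failure. In the agentic loop the starting point is a goal, and the test itself is regenerated until it passes or the attempt budget runs out.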

Question 3

Which AI testing category is best for a startup?

Accepted Answer

For most startups with a Playwright-first stack, the fastest path is Copilot+MCP for authoring assistance plus Momentic for agentic E2E. Avoid enterprise-only tools (Mabl, Functionize) until your suite is mature enough to need auto-healing at scale. For JVM shops, Diffblue Cover's free IntelliJ plugin is a zero-cost starting point.

Question 4

Do I need all four categories or just one?

Accepted Answer

You need at most two, and often one. Unit-test generation and E2E test authoring are the two high-ROI investments. Self-healing is a maintenance layer that pays off at scale (50+ test files). Visual regression, the visual-trace quadrant, is a niche layer for UI-heavy products. Most teams start with either unit-test generation or agentic E2E, not both.

Question 5

What is the Capgemini 63% figure?

Accepted Answer

Capgemini's 2025 World Quality Report surveyed enterprise engineering organisations and found 63% had adopted some form of AI-assisted QA tooling. Adoption was heaviest in unit-test generation (primarily Copilot) and lightest in fully agentic E2E. The figure is a broad 'any AI QA usage' measure, not a 'fully deployed at scale' measure.

Question 6

Is LLM-based test generation the same as RL-based?

Accepted Answer

No. LLM-based generation (Qodo, Copilot) uses a language model to produce test code from a prompt about the source. RL-based generation (Diffblue Cover) uses reinforcement learning to explore the code's execution paths, seed mutations, and evolve tests that kill the most mutants. Because the RL-based approach works from actual execution, it is more accurate but slower; LLM-based generation is faster but more likely to hallucinate assertions. Each suits different use cases.
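To make "tests that kill the most mutants" concrete, here is a minimal mutation-score sketch. It is not Diffblue Cover's algorithm and involves no reinforcement learning; it only shows the scoring idea such a search could optimise. The function under test (`clamp_add`), the hand-written mutants, and the two checks are all invented for illustration.

```python
from typing import Callable, List

Impl = Callable[[int, int], int]
Check = Callable[[Impl], bool]


def clamp_add(a: int, b: int) -> int:
    """Function under test: add two numbers, capped at 100."""
    return min(a + b, 100)


# Hand-written mutants: each one is a small, deliberate bug in clamp_add.
MUTANTS: List[Impl] = [
    lambda a, b: min(a - b, 100),  # arithmetic operator flipped
    lambda a, b: max(a + b, 100),  # min swapped for max
    lambda a, b: min(a + b, 101),  # cap constant off by one
]


def check_small_sum(impl: Impl) -> bool:
    """Passes when a sum well below the cap is returned unchanged."""
    return impl(2, 3) == 5


def check_cap(impl: Impl) -> bool:
    """Passes when a sum above the cap is clamped to exactly 100."""
    return impl(90, 20) == 100


def mutation_score(checks: List[Check], original: Impl, mutants: List[Impl]) -> float:
    """Fraction of mutants that make at least one check fail (are 'killed')."""
    if not all(check(original) for check in checks):
        raise ValueError("checks must pass on the original implementation")
    killed = sum(1 for m in mutants if not all(check(m) for check in checks))
    return killed / len(mutants)


weak = mutation_score([check_small_sum], clamp_add, MUTANTS)
strong = mutation_score([check_small_sum, check_cap], clamp_add, MUTANTS)
print(f"one check: {weak:.2f}, two checks: {strong:.2f}")
```

With only `check_small_sum`, the off-by-one cap mutant survives because no input reaches the cap. Adding `check_cap` kills it. A hallucinated assertion, by contrast, would fail on the original implementation and be rejected by the guard at the top of `mutation_score`.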

> what ai testers actually do

Agentic E2E Test Authors

LLM Unit-Test Generators

Self-Healing Locator Tools

Visual Trace and Vision AI