Mutation Testing
For writing good tests (factories, behavior-driven patterns), load the testing skill. This skill focuses on verifying test effectiveness.
Mutation testing answers the question: “Are my tests actually catching bugs?”
Code coverage tells you what code your tests execute. Mutation testing tells you if your tests would detect changes to that code. A test suite with 100% coverage can still miss a large share of potential bugs, for example when tests execute a line without asserting on its result.
Default posture: use an automated mutation harness first. For JavaScript and TypeScript projects, recommend Stryker as the starting point if it is not already set up. Use manual/mental mutations only as a fallback, a teaching aid, or a focused follow-up for subtle survivors.
Deep-dive resources are in the resources/ directory. Load them on demand:
| Resource | Load when… |
|---|---|
| mutator-rules.md | Planning tests, scanning changed code for likely gaps, manually applying mutations, or interpreting surviving/equivalent mutants |
Core Concept
The Mutation Testing Process:
- Generate mutants: Introduce small bugs (mutations) into production code
- Run tests: Execute your test suite against each mutant
- Evaluate results: If tests fail, the mutant is “killed” (good). If tests pass, the mutant “survived” (bad - your tests missed the bug)
The Insight: A surviving mutant represents a bug your tests wouldn’t catch.
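A minimal sketch of the idea, using a hypothetical isAdult function and a hand-written mutant (a real harness generates mutants automatically). Both tests execute every line of the function, so coverage is identical, but only the boundary test kills the mutant.

```javascript
// Original production code (hypothetical example).
function isAdult(age) {
  return age >= 18;
}

// Hand-written mutant: the boundary operator >= is changed to >.
function isAdultMutant(age) {
  return age > 18;
}

// Each "test" returns true when it passes against the given implementation.
const weakTest = (impl) => impl(30) === true; // covers the line, ignores the boundary
const boundaryTest = (impl) => impl(18) === true; // pins the behavior at exactly 18

// A mutant is killed when a test that passes on the original fails on the mutant.
function mutantStatus(test, original, mutant) {
  if (!test(original)) throw new Error("test must pass on the original code");
  return test(mutant) ? "survived" : "killed";
}

console.log(mutantStatus(weakTest, isAdult, isAdultMutant)); // survived
console.log(mutantStatus(boundaryTest, isAdult, isAdultMutant)); // killed
```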
When to Use This Skill
Use mutation testing analysis when:
- Reviewing code changes on a branch
- Verifying test effectiveness after TDD
- Identifying weak tests that appear to have coverage
- Finding missing edge case tests
- Validating that refactoring didn’t weaken the test suite
Integration with planning and TDD:
FOR EACH STEP:
├─► CONFIRM: Human approves observable acceptance criteria
├─► RED: Write failing test, using mutator rules to spot likely gaps
├─► GREEN: Make it pass
├─► Run mutation testing
├─► KILL MUTANTS: Strengthen tests for worthwhile survivors
├─► REFACTOR: If valuable
└─► STOP: Present work, mutation report, and wait for commit approval
PRE-PR QUALITY GATE:
└─► Re-run mutation testing for the branch/repo scope
Mutation testing is a step within RED-GREEN-MUTATE-KILL MUTANTS-REFACTOR, not a replacement for it. It verifies that the tests created during those increments are strong enough to catch real behavioral regressions before refactoring and before PR.
Harness-First Mutation Workflow
When analyzing code on a branch, prove test effectiveness with Stryker whenever practical. Do not stop at reasoning about whether a test would catch a mutation; run the harness, then use the report to drive focused test improvements.
Step 1: Inspect Setup and Scope
rg --files | rg '(^|/)(package\.json|stryker\.config\.(mjs|cjs|js|json)|stryker\.conf\.(js|json))$'
git diff main...HEAD --name-only
- Identify the package manager, test runner, affected package(s), and existing Stryker config.
- If the repo uses a base branch other than main, substitute that branch in all diff commands.
- In monorepos, start in the smallest affected package, then widen to the repo-level command when the targeted run is healthy.
- If no Stryker setup exists in a JS/TS project, recommend adding it before doing manual mutation analysis.
Step 2: Set Up Stryker When Missing
Use the official initializer as the starting point:
npm init stryker@latest
Then inspect and adapt the generated stryker.config.*:
- Prefer the project test runner plugin when available (vitest, jest, mocha, etc.). Use the generic command runner only when no tighter integration is practical.
- Mutate first-party production source only. Exclude tests, fixtures, snapshots, generated files, declaration files, build outputs, migrations, and low-signal barrels.
- For TypeScript, consider @stryker-mutator/typescript-checker so type-invalid mutants are reported as compile errors instead of wasting test time.
- Keep setup changes reviewable: add dependencies, config, scripts, and .gitignore entries for Stryker temp/report output only when the project needs them.
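The adaptations above might end up looking like the following sketch. testRunner, checkers, tsconfigFile, and mutate are Stryker options; the vitest runner, the tsconfig.json path, and every glob pattern are assumptions about a hypothetical project layout and need adjusting to the real one.

```javascript
// Sketch of options for stryker.config.mjs; runner, paths, and globs are assumptions.
const config = {
  testRunner: "vitest", // assumes @stryker-mutator/vitest-runner is installed
  checkers: ["typescript"], // assumes @stryker-mutator/typescript-checker is installed
  tsconfigFile: "tsconfig.json",
  mutate: [
    "src/**/*.{ts,tsx}",
    "!src/**/*.{test,spec}.{ts,tsx}", // tests
    "!src/**/*.d.ts", // declaration files
    "!src/**/__fixtures__/**", // fixtures (hypothetical directory name)
    "!src/**/generated/**", // generated code (hypothetical directory name)
    "!src/**/index.ts", // low-signal barrels, only if index.ts files are pure re-exports
  ],
};

// In stryker.config.mjs this object would be the default export:
// export default config;
```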
Step 3: Recommend Useful Commands
Suggest project scripts for full-project, cached, and branch-diff mutation runs:
{
  "scripts": {
    "mutation": "stryker run",
    "mutation:incremental": "stryker run --incremental",
    "mutation:diff": "node scripts/stryker-diff.mjs main"
  }
}
The mutation:diff helper should:
- Read the base branch argument, defaulting to main.
- Collect changed files with git diff --name-only --diff-filter=ACMRTUXB <base>...HEAD.
- Keep changed production files matching the project’s source extensions.
- Exclude test/spec files, fixtures, snapshots, generated files, declaration files, and build output.
- Run stryker run --incremental --force --mutate <comma-separated-files>.
- Exit clearly when there are no changed production files to mutate.
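One possible shape for scripts/stryker-diff.mjs, as a sketch rather than a finished tool. The source extensions, the excluded directory names, and the helper names selectMutateTargets and buildStrykerArgs are assumptions; the git and Stryker flags are the ones listed above.

```javascript
// scripts/stryker-diff.mjs (sketch). Extensions and exclusion patterns are assumptions.
const SOURCE_FILE = /\.(ts|tsx|js|jsx)$/;
const EXCLUDED =
  /(^|\/)(__tests__|tests?|fixtures|__snapshots__|generated|dist|build)\/|\.(test|spec|d)\.[^/]+$/;

// Keep changed production files; drop tests, fixtures, declarations, and build output.
function selectMutateTargets(changedFiles) {
  return changedFiles
    .map((file) => file.trim())
    .filter((file) => file && SOURCE_FILE.test(file) && !EXCLUDED.test(file));
}

// Build the argument list handed to npx.
function buildStrykerArgs(targets) {
  return ["stryker", "run", "--incremental", "--force", "--mutate", targets.join(",")];
}

async function main(base = "main") {
  // Loaded lazily so the pure helpers above stay free of side effects.
  const { execFileSync, spawnSync } = await import("node:child_process");
  const diff = execFileSync(
    "git",
    ["diff", "--name-only", "--diff-filter=ACMRTUXB", `${base}...HEAD`],
    { encoding: "utf8" }
  );
  const targets = selectMutateTargets(diff.split("\n"));
  if (targets.length === 0) {
    console.log(`No changed production files to mutate against ${base}.`);
    return 0;
  }
  const result = spawnSync("npx", buildStrykerArgs(targets), {
    stdio: "inherit",
    shell: process.platform === "win32", // npx is a .cmd shim on Windows
  });
  return result.status ?? 1;
}

// Run only when invoked as this script, e.g. `node scripts/stryker-diff.mjs main`.
if (process.argv[1] && process.argv[1].endsWith("stryker-diff.mjs")) {
  main(process.argv[2]).then((code) => process.exit(code));
}
```

Keeping the filter as a pure function makes the exclusion rules easy to unit test without touching git. File names containing commas would break the comma-separated --mutate value, which this sketch does not handle.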
Prefer a small Node helper over dense shell inside package.json; quoting *, !, and command substitution is fragile across shells. For quick local use, this POSIX one-liner is acceptable:
CHANGED=$(git diff --name-only --diff-filter=ACMRTUXB main...HEAD -- '*.ts' '*.tsx' '*.js' '*.jsx' | grep -Ev '(^|/)(__tests__|test|tests|fixtures|generated)/|\.(test|spec|d)\.' | paste -sd, -)
test -n "$CHANGED" && npx stryker run --incremental --force --mutate "$CHANGED"
Use exact line ranges for tiny follow-up checks when the report points to a specific survivor:
npx stryker run --incremental --force --mutate src/example.ts:42-57
Step 4: Run and Triage
Start with mutation:diff for branch feedback. Run mutation across the full project when introducing Stryker, changing shared test infrastructure, preparing CI gates, or validating a broad test-strengthening pass.
Categorize Stryker findings:
| Category | Description | Action Required |
|---|---|---|
| Killed | Test failed when mutant was applied | None - tests are effective |
| Survived | Tests passed with mutant active | Add/strengthen test, unless equivalent |
| No Coverage | No test exercises this code | Add behavior test |
| Equivalent | Mutant produces same behavior | None - not a real bug |
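When the json reporter is enabled, a first triage pass can be scripted. The sketch below assumes the report follows the mutation testing report format (a files map whose entries hold a mutants array with status, mutatorName, and location); the sample data is hand-written, not output from a real run. Equivalent is not a status the tool assigns: it is a judgment made while reviewing survivors.

```javascript
// Sketch: tally mutants per status and list the ones that need a decision.
function summarizeReport(report) {
  const counts = {};
  const needsAttention = [];
  for (const [file, { mutants }] of Object.entries(report.files)) {
    for (const mutant of mutants) {
      counts[mutant.status] = (counts[mutant.status] ?? 0) + 1;
      if (mutant.status === "Survived" || mutant.status === "NoCoverage") {
        needsAttention.push(`${file}:${mutant.location.start.line} ${mutant.mutatorName}`);
      }
    }
  }
  return { counts, needsAttention };
}

// Helper for the sample data only: a one-line location starting at the given line.
const at = (line) => ({ start: { line, column: 1 }, end: { line, column: 20 } });

// Hand-written sample report (hypothetical file and mutants).
const sampleReport = {
  files: {
    "src/price.ts": {
      mutants: [
        { id: "1", mutatorName: "ArithmeticOperator", status: "Killed", location: at(3) },
        { id: "2", mutatorName: "EqualityOperator", status: "Survived", location: at(7) },
        { id: "3", mutatorName: "BlockStatement", status: "NoCoverage", location: at(12) },
      ],
    },
  },
};

console.log(summarizeReport(sampleReport));
```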
Fix obvious issues immediately:
- Missing boundary tests
- Weak or absent assertions
- One-sided branch coverage
- Missing side-effect verification
- High-value business rules such as money, permissions, eligibility, safety, or data loss
For subtle survivors that require human judgment, use the agent harness’s ask-question facility (the tool for putting a question to the human, not the mutation harness). Ask one concise question with concrete choices, explain the mutation, and describe the tradeoff. Use this when behavior is intentionally unspecified, the correct domain rule is unclear, the test would be expensive or brittle, or the mutant may be equivalent but you are not certain.
Step 5: Kill Survivors With TDD
For each survivor worth killing:
- Keep or recreate the mutant.
- Write the smallest behavior test that fails against the mutant for the right reason.
- Restore the original production code.
- Verify the new test passes.
- Re-run Stryker scoped to the mutated file or line range, then re-run the diff command.
Avoid overfitting tests to implementation details. Strong mutation tests assert observable behavior: return values, persisted state, emitted events, permissions, messages, or meaningful collaborator calls.
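As an illustration of asserting observable behavior, the sketch below uses a hypothetical placeOrder function, a hand-written removed-statement mutant, and hand-rolled fakes in place of a real test framework. The return-value test lets the mutant survive; the test that also checks persisted state and the notification kills it.

```javascript
// Hypothetical production code: persists an order, then notifies a collaborator.
function placeOrder(order, repo, notifier) {
  repo.save(order);
  notifier.orderPlaced(order.id);
  return { status: "placed", id: order.id };
}

// Hand-written mutant: the notifier call has been removed.
function placeOrderMutant(order, repo, notifier) {
  repo.save(order);
  return { status: "placed", id: order.id };
}

// Minimal fakes that record observable effects.
function makeFakes() {
  const saved = [];
  const notified = [];
  return {
    saved,
    notified,
    repo: { save: (order) => saved.push(order) },
    notifier: { orderPlaced: (id) => notified.push(id) },
  };
}

// Weak test: checks only the return value.
function returnValueTest(impl) {
  const { repo, notifier } = makeFakes();
  return impl({ id: "o-1" }, repo, notifier).status === "placed";
}

// Stronger test: also asserts the persisted state and the notification.
function behaviorTest(impl) {
  const fakes = makeFakes();
  const result = impl({ id: "o-1" }, fakes.repo, fakes.notifier);
  return (
    result.status === "placed" &&
    fakes.saved.length === 1 &&
    fakes.notified.length === 1 &&
    fakes.notified[0] === "o-1"
  );
}

console.log(returnValueTest(placeOrderMutant)); // true: the mutant survives
console.log(behaviorTest(placeOrderMutant)); // false: the mutant is killed
```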
Stryker Configuration Guidance
Stryker should be the normal entry point for JS/TS mutation testing.
Starting Configuration
Prefer stryker.config.mjs or the format generated by the initializer. A typical starting point:
export default {
  testRunner: "vitest",
  coverageAnalysis: "perTest",
  reporters: ["html", "clear-text", "progress"],
  mutate: [
    "src/**/*.{ts,tsx,js,jsx}",
    "!src/**/*.test.{ts,tsx,js,jsx}",
    "!src/**/*.spec.{ts,tsx,js,jsx}",
    "!src/**/*.d.ts"
  ]
}
Adapt testRunner, mutate, testFiles, build commands, and checker plugins to match the project. Do not cargo-cult this exact config into a repo with a different layout.
CI and Quality Gates
- Start with report-only or diff-only mutation checks if the existing suite has many survivors.
- Add failing thresholds only after establishing a realistic baseline.
- Persist HTML and clear-text reports as CI artifacts.
- Use incremental mode for fast local feedback, but periodically force a full run to avoid stale assumptions.
- Treat mutation score as a signal, not a vanity metric. Prioritize surviving/no-coverage mutants in changed and high-risk code.
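A sketch of how the gate might evolve. reporters, incremental, and thresholds (high, low, break) are Stryker options; the numbers, the baseline score, and the withBreakThreshold helper are placeholders invented for illustration, not recommendations.

```javascript
// Sketch of CI-oriented options for stryker.config.mjs; all numbers are placeholders.
const ciConfig = {
  reporters: ["html", "clear-text", "json"],
  incremental: true, // reuse earlier results for fast feedback
  thresholds: {
    high: 80,
    low: 60,
    break: null, // report-only: the run never fails on score
  },
};

// Once a realistic baseline is known, introduce a failing gate slightly below it.
function withBreakThreshold(config, baselineScore, margin = 5) {
  const breakAt = Math.max(0, Math.floor(baselineScore) - margin);
  return { ...config, thresholds: { ...config.thresholds, break: breakAt } };
}

// Hypothetical baseline of 62.4 measured on a full run.
const gated = withBreakThreshold(ciConfig, 62.4);
console.log(gated.thresholds.break); // 57
```

A periodic full run can bypass cached results by adding --force to the command line.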
Manual Mutation Fallback
If Stryker is unavailable or cannot target the code under review, load resources/mutator-rules.md and manually apply the relevant operators. Always revert each mutation before the next one. Manual mutation should still follow the same loop: mutate, run tests, classify, fix obvious gaps, ask about judgment calls, and report the result.
Summary: Mutation Testing Mindset
The key question for every line of code:
“If I introduced a bug here, would my tests catch it?”
For each test, verify it would catch:
- Arithmetic operator changes
- Boundary condition shifts
- Boolean logic inversions
- Removed statements
- Changed return values
Remember:
- Coverage measures execution, mutation testing measures detection
- A test that doesn’t make assertions can’t kill mutants
- Boundary values, mixed boolean cases, non-identity values, and observable side effects kill many common mutants
- For the full mutator checklist and examples, load resources/mutator-rules.md
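The point about non-identity values can be sketched with a hypothetical lineTotal function and a hand-written arithmetic mutant. A quantity of 1 gives the same result for multiplication and division, so a test using it cannot tell the two apart.

```javascript
// Hypothetical production code and an arithmetic-operator mutant (* becomes /).
const lineTotal = (unitPrice, quantity) => unitPrice * quantity;
const lineTotalMutant = (unitPrice, quantity) => unitPrice / quantity;

// quantity = 1 is an identity value for both * and /, so the mutant passes this test.
const identityTest = (impl) => impl(25, 1) === 25;

// A non-identity quantity produces different results for * and /.
const nonIdentityTest = (impl) => impl(25, 4) === 100;

console.log(identityTest(lineTotalMutant)); // true: the mutant survives
console.log(nonIdentityTest(lineTotalMutant)); // false: the mutant is killed
```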