Skip to content

fix: make local coverage runs reliable#2236

Merged
maxisbey merged 1 commit intomainfrom
fix/coverage-local-reliability
Mar 6, 2026
Merged

fix: make local coverage runs reliable#2236
maxisbey merged 1 commit intomainfrom
fix/coverage-local-reliability

Conversation

@maxisbey
Copy link
Contributor

@maxisbey maxisbey commented Mar 6, 2026

Running ./scripts/test locally was unreliable — stale state, a flaky test, and incorrect docs combined to make it fail often enough that local coverage checks were untrustworthy.

Motivation and Context

Three distinct issues were compounding:

1. scripts/test didn't erase before running. concurrency=['multiprocessing'] in pyproject.toml implicitly enables parallel=True, so each run produces ~70 .coverage.* files. If a previous run aborted before coverage combine (flaky test + set -e), those stale files contaminated the next run. CI never saw this because each job starts in a fresh container.

2. CLAUDE.md said to verify coverage with pytest -x. That command doesn't collect coverage — pytest-cov isn't installed and coverage isn't wired into addopts. The instruction silently did nothing, so coverage issues only surfaced on CI.

3. test_desktop flaked under -n auto. It monkeypatched Path.iterdir at the class level. When the test landed on an xdist worker that hadn't yet imported jsonschema, the lazy import in _validate_tool_result triggered jsonschema_specifications' schema discovery — which walks a directory with iterdir() and got the fake paths back. Flaky because it depended on whether another test in the same worker had already triggered the import.

How Has This Been Tested?

  • ./scripts/test run 3× back-to-back: 100% each time (previously flaked ~25% on re-runs due to the test_desktop + stale-file interaction)
  • test_desktop verified to pass with a cold jsonschema import
  • Targeted-check recipe from CLAUDE.md verified against a single module

Breaking Changes

None.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling — n/a
  • I have added or updated documentation as needed

Additional context

Also added coverage erase to .github/workflows/shared.yml — a no-op on CI (fresh container), but keeps the command sequence identical to scripts/test so it's safe to copy-paste locally.

The test_desktop fix also drops the Windows/POSIX path-separator special-casing — with real files under tmp_path, str(f) uses the platform's native separator naturally, so a substring match on the bare filename works everywhere.

AI Disclaimer

Three fixes for local coverage reliability:

1. scripts/test: add `coverage erase` before `coverage run`. The config
   implicitly enables parallel=True (via concurrency=['multiprocessing']),
   so each run produces ~70 .coverage.* files. If a prior run aborted
   before `coverage combine` (e.g. flaky test + set -e), those stale
   files get picked up. Erase guarantees a clean slate.

2. CLAUDE.md: the previous instruction (`pytest -x` to verify coverage)
   never worked — pytest-cov isn't installed and coverage isn't wired
   into pytest addopts. Replace with the actual commands, plus a
   targeted-check recipe for iterating on a single module. Notes that
   partial runs can't hit 100% because source includes tests/.

3. test_desktop: was flaky under -n auto when it landed on an xdist
   worker that hadn't yet imported jsonschema. The class-level
   Path.iterdir monkeypatch broke jsonschema_specifications' import-time
   schema discovery (which walks a directory). Use a real tmp_path
   Desktop directory instead of patching iterdir.

Also adds `coverage erase` to the CI workflow so the command sequence
there stays copy-pasteable into a local shell.
@maxisbey maxisbey marked this pull request as ready for review March 6, 2026 15:40
@maxisbey maxisbey merged commit 7ba41dc into main Mar 6, 2026
31 checks passed
@maxisbey maxisbey deleted the fix/coverage-local-reliability branch March 6, 2026 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants