Merged
Conversation
Three fixes for local coverage reliability: 1. scripts/test: add `coverage erase` before `coverage run`. The config implicitly enables parallel=True (via concurrency=['multiprocessing']), so each run produces ~70 .coverage.* files. If a prior run aborted before `coverage combine` (e.g. flaky test + set -e), those stale files get picked up. Erase guarantees a clean slate. 2. CLAUDE.md: the previous instruction (`pytest -x` to verify coverage) never worked — pytest-cov isn't installed and coverage isn't wired into pytest addopts. Replace with the actual commands, plus a targeted-check recipe for iterating on a single module. Notes that partial runs can't hit 100% because source includes tests/. 3. test_desktop: was flaky under -n auto when it landed on an xdist worker that hadn't yet imported jsonschema. The class-level Path.iterdir monkeypatch broke jsonschema_specifications' import-time schema discovery (which walks a directory). Use a real tmp_path Desktop directory instead of patching iterdir. Also adds `coverage erase` to the CI workflow so the command sequence there stays copy-pasteable into a local shell.
felixweinberger
approved these changes
Mar 6, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Running
./scripts/testlocally was unreliable — stale state, a flaky test, and incorrect docs combined to make it fail often enough that local coverage checks were untrustworthy.Motivation and Context
Three distinct issues were compounding:
1.
scripts/testdidn't erase before running.concurrency=['multiprocessing']inpyproject.tomlimplicitly enablesparallel=True, so each run produces ~70.coverage.*files. If a previous run aborted beforecoverage combine(flaky test +set -e), those stale files contaminated the next run. CI never saw this because each job starts in a fresh container.2.
CLAUDE.mdsaid to verify coverage withpytest -x. That command doesn't collect coverage —pytest-covisn't installed and coverage isn't wired intoaddopts. The instruction silently did nothing, so coverage issues only surfaced on CI.3.
test_desktopflaked under-n auto. It monkeypatchedPath.iterdirat the class level. When the test landed on an xdist worker that hadn't yet importedjsonschema, the lazy import in_validate_tool_resulttriggeredjsonschema_specifications' schema discovery — which walks a directory withiterdir()and got the fake paths back. Flaky because it depended on whether another test in the same worker had already triggered the import.How Has This Been Tested?
./scripts/testrun 3× back-to-back: 100% each time (previously flaked ~25% on re-runs due to thetest_desktop+ stale-file interaction)test_desktopverified to pass with a coldjsonschemaimportBreaking Changes
None.
Types of changes
Checklist
Additional context
Also added
coverage eraseto.github/workflows/shared.yml— a no-op on CI (fresh container), but keeps the command sequence identical toscripts/testso it's safe to copy-paste locally.The
test_desktopfix also drops the Windows/POSIX path-separator special-casing — with real files undertmp_path,str(f)uses the platform's native separator naturally, so a substring match on the bare filename works everywhere.AI Disclaimer