Labels: `enhancement` (new feature or request), `infrastructure` (CI/CD, build system, testing framework, tooling)
## Overview
Integrate the benchmarking infrastructure (#542) into the CI/CD pipeline to automatically detect performance regressions and include benchmark reports as release artifacts.
## Parent Issue
This is a sub-issue of #542 (Benchmarking Infrastructure).
## Proposal

### 1. Benchmark Stage on Main Branch
Add a GitHub Actions workflow that runs on every push/merge to main:
```yaml
name: Performance Regression Check

on:
  push:
    branches: [main, master]
  pull_request:
    branches: [main, master]

jobs:
  benchmark:
    runs-on: ubuntu-latest  # or windows-latest for consistency
    steps:
      - uses: actions/checkout@v4
      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '10.0.x'
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install NumPy
        run: pip install numpy tabulate
      - name: Run Benchmarks
        run: |
          cd benchmark
          pwsh -File run-benchmarks.ps1 -Quick
      - name: Upload Benchmark Report
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-report
          path: benchmark/benchmark-report.md
      - name: Compare with Baseline
        run: |
          # Compare current results with the stored baseline
          # Alert if regression > 10%
```

### 2. Release Artifacts
Include benchmark reports in GitHub Releases:
```yaml
- name: Attach Benchmark Report to Release
  uses: softprops/action-gh-release@v1
  with:
    files: |
      benchmark/benchmark-report.md
      benchmark/NumSharp.Benchmark.GraphEngine/BenchmarkDotNet.Artifacts/results/*.html
```

### 3. Regression Detection
Implement baseline comparison:
- Store baseline results in `benchmark/baseline.json`
- Compare each run against the baseline
- Fail CI if regression > 10% on key metrics
- Auto-update the baseline on release tags
```powershell
# Add to run-benchmarks.ps1
param(
    [switch]$CompareBaseline,
    [string]$BaselinePath = 'baseline.json',
    [int]$RegressionThreshold = 10  # percent
)

if ($CompareBaseline) {
    # $current is the hashtable of metric -> mean time produced earlier in the
    # script by the benchmark run. -AsHashtable (PowerShell 6+) is needed so
    # the baseline can be indexed with $baseline[$metric].
    $baseline = Get-Content $BaselinePath | ConvertFrom-Json -AsHashtable
    $regressions = @()
    foreach ($metric in $current.Keys) {
        $change = (($current[$metric] - $baseline[$metric]) / $baseline[$metric]) * 100
        if ($change -gt $RegressionThreshold) {
            $regressions += "$metric regressed by $([math]::Round($change, 1))%"
        }
    }
    if ($regressions.Count -gt 0) {
        Write-Error "Performance regressions detected:`n$($regressions -join "`n")"
        exit 1
    }
}
```

### 4. PR Comments
Post benchmark results as PR comments:
```yaml
- name: Comment PR with Benchmark Results
  uses: actions/github-script@v7
  if: github.event_name == 'pull_request'
  with:
    script: |
      const fs = require('fs');
      const report = fs.readFileSync('benchmark/benchmark-report.md', 'utf8');
      // Extract the summary section; fall back to the full report if missing
      const match = report.match(/## Key Comparisons[\s\S]*?(?=##|$)/);
      const summary = match ? match[0] : report;
      github.rest.issues.createComment({
        issue_number: context.issue.number,
        owner: context.repo.owner,
        repo: context.repo.repo,
        body: `## 📊 Benchmark Results\n\n${summary}\n\n<details><summary>Full Report</summary>\n\n${report}\n</details>`
      });
```

## Implementation Checklist
- [ ] Create `.github/workflows/benchmark.yml`
- [ ] Add a benchmark step to the release workflow
- [ ] Create `benchmark/baseline.json` with current results
- [ ] Add a `--compare-baseline` flag to `run-benchmarks.ps1`
- [ ] Add regression threshold configuration
- [ ] Add PR comment integration
- [ ] Document the CI workflow in the benchmark README
## Key Metrics to Track
| Metric | Baseline | Threshold |
|---|---|---|
| `np.add` (int32, N=10M) | TBD | ±10% |
| `a*a + 2*b` (float64, N=10M) | TBD | ±10% |
| `np.var` (float64, N=10M) | TBD | ±10% |
| Memory allocation per op | TBD | ±20% |
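To make the per-metric thresholds concrete, here is a hedged Python sketch of the comparison; the metric names match the table, but all numbers are illustrative stand-ins for real baseline and run data:

```python
# Per-metric regression check against the thresholds above. Values are
# illustrative; real data would come from the benchmark run and baseline.json.
thresholds = {                      # allowed increase, in percent
    "np.add (int32, N=10M)": 10,
    "Memory allocation per op": 20,
}
baseline = {"np.add (int32, N=10M)": 10.0, "Memory allocation per op": 100.0}
current = {"np.add (int32, N=10M)": 10.5, "Memory allocation per op": 125.0}

def find_regressions(baseline, current, thresholds):
    """Return a message for each metric whose increase exceeds its threshold."""
    regressions = []
    for metric, base in baseline.items():
        change = (current[metric] - base) / base * 100
        if change > thresholds[metric]:
            regressions.append(f"{metric} regressed by {change:.1f}%")
    return regressions

print(find_regressions(baseline, current, thresholds))
# +5% on np.add is within ±10%; +25% on allocations exceeds its ±20% threshold
```

Note this flags only increases; improvements beyond the threshold pass silently, which is the usual choice for a fail-the-build gate.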
## Benefits
- **Automated regression detection** - catch performance issues before release
- **Historical tracking** - see performance trends over time
- **Release documentation** - users can see benchmark results per version
- **PR feedback** - contributors see the impact of their changes
- **Data-driven decisions** - objective metrics for optimization PRs
## Related
- #542 - Benchmarking Infrastructure: industry-standard performance testing framework (parent issue)
- #541 - [Core] GraphEngine: replace ~636K lines of generated code with DynamicMethod IL emission