feat: complete token usage tracking across agent target, LLM judges, and code judge proxy by christso · Pull Request #390 · EntityProcess/agentv

christso · 2026-02-26T08:31:52Z

Summary

Map AI SDK usage (inputTokens/outputTokens) to ProviderTokenUsage in mapResponse — fixes Azure, Anthropic, and Gemini providers
Add optional tokenUsage field to EvaluatorResult and EvaluationScore types
Capture token usage from LLM judge generateText() and provider.invoke() calls
Accumulate token usage across target proxy invocations with per-call reporting
Surface proxy tokenUsage in code evaluator results
Extend TargetInvokeResponse with per-call tokenUsage for code judge scripts
Pass tokenUsage through orchestrator EvaluationScore → EvaluatorResult mapping
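
The mapping and accumulation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SdkUsage`, `toTokenUsage`, and `UsageAccumulator` are hypothetical names. Only the `inputTokens`/`outputTokens` fields and the `{ input: number; output: number }` shape are taken from the PR text.

```typescript
// Hypothetical sketch: apart from inputTokens/outputTokens and the
// { input, output } shape, every name here is an assumption, not agentv's API.

// Shape the follow-up commit describes as the shared TokenUsage type.
interface TokenUsage {
  input: number;
  output: number;
}

// Assumed subset of the usage object reported by an SDK call.
interface SdkUsage {
  inputTokens?: number;
  outputTokens?: number;
}

// Map SDK usage to TokenUsage; return undefined when nothing was reported.
function toTokenUsage(usage?: SdkUsage): TokenUsage | undefined {
  if (!usage) return undefined;
  if (usage.inputTokens === undefined && usage.outputTokens === undefined) {
    return undefined;
  }
  return { input: usage.inputTokens ?? 0, output: usage.outputTokens ?? 0 };
}

// Accumulate usage across several invocations while keeping per-call values.
class UsageAccumulator {
  private readonly calls: TokenUsage[] = [];

  // Record one call; calls without usage data are skipped.
  record(usage?: TokenUsage): void {
    if (usage) this.calls.push({ ...usage });
  }

  // Copies of the per-call usage entries, in call order.
  perCall(): TokenUsage[] {
    return this.calls.map((call) => ({ ...call }));
  }

  // Sum over all recorded calls, or undefined if no call reported usage.
  total(): TokenUsage | undefined {
    if (this.calls.length === 0) return undefined;
    return this.calls.reduce<TokenUsage>(
      (acc, call) => ({
        input: acc.input + call.input,
        output: acc.output + call.output,
      }),
      { input: 0, output: 0 },
    );
  }
}
```

Returning `undefined` instead of a zeroed object keeps "no usage reported" distinguishable from "zero tokens used", which matches the field being optional on the result types.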

Test plan

Unit tests pass (985 tests, 0 failures)
TypeScript typecheck passes
e2e: run `bun agentv eval examples/features/basic/evals/dataset.eval.yaml --test-id feature-proposal-brainstorm` and verify trace.token_usage in JSONL
e2e: Verify scores[].token_usage on LLM judge entries in JSONL
e2e: Run a code judge example with target proxy and verify token_usage on scores entry
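
The JSONL checks in the test plan could be scripted along these lines. This is a hedged sketch: the field names `trace.token_usage` and `scores[].token_usage` come from the test plan, while the rest of the record shape and the helper name `checkTokenUsage` are assumptions.

```typescript
// Assumed minimal record shape; only the token_usage field names come from
// the test plan, everything else is hypothetical.
interface ResultRecord {
  trace?: { token_usage?: unknown };
  scores?: Array<{ token_usage?: unknown }>;
}

interface UsageReport {
  line: number; // 1-based line number in the JSONL text
  hasTraceUsage: boolean; // trace.token_usage is present
  scoresWithUsage: number; // scores entries carrying token_usage
  scoresTotal: number; // all scores entries on the record
}

// Parse JSONL text and report where token usage is present.
// Blank lines are skipped; malformed JSON throws from JSON.parse.
function checkTokenUsage(jsonl: string): UsageReport[] {
  const reports: UsageReport[] = [];
  jsonl.split("\n").forEach((raw, index) => {
    const text = raw.trim();
    if (text === "") return;
    const record = JSON.parse(text) as ResultRecord;
    const scores = record.scores ?? [];
    reports.push({
      line: index + 1,
      hasTraceUsage: record.trace?.token_usage != null,
      scoresWithUsage: scores.filter((s) => s.token_usage != null).length,
      scoresTotal: scores.length,
    });
  });
  return reports;
}
```

The helper reports counts rather than failing outright, since only some score entries (for example LLM judge entries) are expected to carry `token_usage`.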

Closes #387

🤖 Generated with Claude Code

…and code judge proxy

- Map AI SDK usage (inputTokens/outputTokens) to ProviderTokenUsage in mapResponse
- Add optional tokenUsage field to EvaluatorResult and EvaluationScore types
- Capture token usage from LLM judge generateText() and provider.invoke() calls
- Accumulate token usage across target proxy invocations
- Surface proxy tokenUsage in code evaluator results
- Extend TargetInvokeResponse with per-call tokenUsage for code judge scripts
- Pass tokenUsage through orchestrator EvaluationScore → EvaluatorResult mapping

Closes #387

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-02-26T08:32:44Z

Deploying agentv with Cloudflare Pages

Latest commit:	`eafcedc`
Status:	✅ Deploy successful!
Preview URL:	https://27741650.agentv.pages.dev
Branch Preview URL:	https://feat-387-token-usage-trackin.agentv.pages.dev


…sage, add unit tests

- Replace 6 inline `{ input: number; output: number }` types with shared TokenUsage import
- Add tokenUsage to ChildEvaluatorResult for accurate total cost tracking
- Add 7 unit tests covering type contracts and proxy accumulation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

christso and others added 2 commits February 26, 2026 11:41

style: fix biome formatting in token-usage test

eafcedc

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

christso merged commit 833a4e6 into main Feb 26, 2026
1 check passed

christso deleted the feat/387-token-usage-tracking branch February 26, 2026 12:04

christso merged 3 commits into main from feat/387-token-usage-tracking
