Extended thinking/reasoning events not firing for OpenAI and Anthropic models via Copilot SDK #922

@KennyTurtle

Description

Summary

The Copilot SDK v0.2.0 accepts reasoning_effort at session creation and documents assistant.reasoning / assistant.reasoning_delta events for extended thinking. In our testing, however, extended thinking is never requested for Anthropic models through the BYOM provider path, and assistant.reasoning events never fire for either Anthropic or OpenAI models.

What we tested

We tested with the Copilot CLI v1.0.10 (bundled with SDK v0.2.0) using the Anthropic BYOM provider.

Test 1: Our model alias

session = await client.create_session(
    model="Sonnet46",
    provider={"type": "anthropic", "base_url": "...", "api_key": "..."},
    reasoning_effort="high",
)

Result: No thinking parameter in the proxied HTTP request body. No assistant.reasoning events. No thinking content in the API response.
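For reference, the check we ran against the captured request was essentially this. last_proxied_body is a helper from our own test harness (not an SDK API), stubbed here with the fields we actually observed:

```python
# 'last_proxied_body' is a hypothetical helper from our test harness that
# returns the JSON body of the last request the CLI proxied to the
# Anthropic Messages API; stubbed with the fields we observed.
def last_proxied_body():
    return {
        "model": "...",
        "max_tokens": 4096,
        "temperature": 1.0,
        "system": "...",
        "messages": [{"role": "user", "content": "..."}],
        "tools": [],
    }

body = last_proxied_body()
assert "thinking" not in body  # the parameter is never sent
```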

Test 2: SDK's own model name format

session = await client.create_session(
    model="claude-sonnet-4-6",  # Same format as SDK's test/scenarios/prompts/reasoning-effort/python/main.py
    provider={"type": "anthropic", "base_url": "...", "api_key": "..."},
    reasoning_effort="high",
)

Result: Same — no thinking parameter in the request body. The proxied request contains max_tokens, temperature, system, messages, tools but no thinking block.

Test 3: OpenAI models (control)

session = await client.create_session(
    model="GPT5Reasoning",
    provider={"type": "openai", "base_url": "...", "api_key": "..."},
    reasoning_effort="high",
)

Result: Reasoning tokens are used (192 reasoning tokens in the response usage). However, still no assistant.reasoning events — the reasoning content stays opaque/encrypted.
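Our event capture for all three tests looked roughly like this. The event names come from docs/features/streaming-events.md, but the on_event hook below is a stand-in from our harness, not the SDK's exact subscription API:

```python
# Collect every event type the session emits; the feed below is simulated
# with the events we actually observed for the GPT5Reasoning run.
events_seen = []

def on_event(event_type, payload):
    events_seen.append(event_type)

for ev in ["assistant.message", "assistant.usage"]:
    on_event(ev, {"reasoning_tokens": 192} if ev == "assistant.usage" else {})

# Never fired, even though usage reported 192 reasoning tokens:
assert "assistant.reasoning" not in events_seen
assert "assistant.usage" in events_seen
```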

Expected behavior

Per docs/features/streaming-events.md:

assistant.reasoning — Complete extended thinking block from the model. Emitted after reasoning is finished.
assistant.message and assistant.reasoning (final events) are always sent regardless of streaming setting.

We expected:

  1. The CLI to translate reasoning_effort into Anthropic's thinking: { type: "enabled", budget_tokens: N } API parameter for Anthropic BYOM sessions
  2. assistant.reasoning events to fire with the thinking content
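Concretely, expectation 1 means that for reasoning_effort="high" the proxied body should contain Anthropic's thinking block. The budget value below is illustrative only (the SDK documents no effort-to-budget mapping); whatever mapping the CLI chose, Anthropic requires max_tokens to exceed budget_tokens:

```python
# Illustrative request body we expected the CLI to produce;
# budget_tokens is our guess, not a documented SDK value.
expected_body = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 10000},
    "messages": [{"role": "user", "content": "..."}],
}

# Anthropic's constraint: the thinking budget must fit under max_tokens.
assert expected_body["max_tokens"] > expected_body["thinking"]["budget_tokens"]
```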

Actual behavior

                             OpenAI (GPT5Reasoning)   Anthropic (claude-sonnet-4-6)
reasoning_effort accepted    Yes (no error)           Yes (no error)
Reasoning tokens used        Yes (192 tokens)         No
thinking param in request    N/A (OpenAI format)      Not sent
assistant.reasoning event    Not fired                Not fired
assistant.usage trace        Yes                      Yes

SDK test scenario reference

The SDK's own test at test/scenarios/prompts/reasoning-effort/python/main.py uses claude-opus-4.6 with reasoning_effort: "low", but verify.sh notes:

Note: reasoning effort is configuration-only and can't be verified from output alone.

This suggests the parameter may not actually produce observable thinking output for Anthropic models today.
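If the CLI did implement the translation, even a trivial mapping would make the behavior observable in the request body. The numbers below are entirely our assumption, purely to illustrate the shape of the missing step:

```python
# Hypothetical effort-to-budget mapping; nothing in the SDK documents one.
EFFORT_TO_BUDGET = {"low": 2048, "medium": 8192, "high": 16384}

def anthropic_thinking_param(reasoning_effort: str) -> dict:
    """Translate a Copilot SDK reasoning_effort value into the Anthropic
    Messages API 'thinking' parameter."""
    return {"type": "enabled", "budget_tokens": EFFORT_TO_BUDGET[reasoning_effort]}

assert anthropic_thinking_param("high") == {"type": "enabled", "budget_tokens": 16384}
```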

Questions for SDK team

  1. Does the Copilot CLI support Anthropic extended thinking via the BYOM provider path?
  2. If so, what model name / configuration is required to trigger it?
  3. When should we expect assistant.reasoning events to fire? Only with streaming enabled, or also in non-streaming mode?

Priority

P1

Metadata

Labels

mcs: Microsoft Copilot Studio
runtime: Requires a change in the copilot-agent-runtime repo