Python: Normalize OpenAI function-call arguments at parse time to prevent uni… by 0x7c13 · Pull Request #4831 · microsoft/agent-framework

0x7c13 · 2026-03-22T08:32:46Z

Python: Normalize OpenAI function-call arguments at parse time to prevent unicode escape corruption

Problem

When an LLM-powered agent edits source files containing Python/JavaScript unicode escape sequences like \u2192, the OpenAI code path corrupts these sequences due to double JSON parsing.

Root cause

The Anthropic and OpenAI backends handle function-call arguments differently:

Anthropic: Returns content_block.input as a parsed dict. Stored directly — parse_arguments() returns it as-is. 1 JSON parse total.
OpenAI: Returns tool.function.arguments as a raw JSON string. Stored as a string, then parse_arguments() calls json.loads() again. 2 JSON parses total.

The second json.loads() re-interprets \uXXXX sequences as JSON unicode escapes, corrupting the original intent:

# A source file contains the Python escape: \u2192
# The model correctly generates \\u2192 in its JSON arguments

# Anthropic path (1 parse):
content_block.input = {"old_string": "\\u2192"}  # SDK parsed → \u2192 ✓

# OpenAI path (2 parses):
tool.function.arguments = '{"old_string": "\\u2192"}'  # stored as string
json.loads(arguments)    → {"old_string": "→"}          # \u2192 interpreted as unicode escape ✗

The same model output that works correctly on Anthropic produces a corrupted value on OpenAI. The \u2192 (literal 6-char Python escape) becomes → (a single Unicode character), causing edit_file to either fail to match or write incorrect content.
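The effect of an extra decode can be reproduced with the standard library alone. The sketch below is not the framework's code path. It only assumes that the wire payload carries `\\u2192`, and shows what one decode produces versus what happens if the decoded value is treated as JSON text a second time.

```python
import json

# JSON text as it appears on the wire: the model escaped the backslash,
# so the payload contains the characters \\u2192 (two backslashes).
wire = r'{"old_string": "\\u2192"}'

# One decode yields the 6-character literal that the source file contains.
once = json.loads(wire)["old_string"]

# If that value is spliced back into JSON text without re-escaping and
# decoded again, \u2192 is read as a JSON unicode escape.
twice = json.loads('"' + once + '"')

print(repr(once), len(once))    # '\\u2192' 6
print(repr(twice), len(twice))  # '→' 1
```

The first result is what an edit tool needs in order to match the file. The second is the single arrow character described above.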

Impact

This affects any tool that reads/writes source code containing \uXXXX escape sequences (Python, JavaScript, Java, C#, JSON). In practice, agents enter retry loops (10+ failed edit_file attempts observed) trying different escaping levels, wasting tokens and often ultimately writing corrupted code.
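To illustrate why a corrupted argument leads to failed edits, here is a minimal sketch of an exact-match replacement. `replace_once` is a hypothetical stand-in, not the framework's `edit_file` implementation; it only assumes that the tool requires `old_string` to appear verbatim in the file.

```python
def replace_once(text: str, old_string: str, new_string: str) -> str:
    """Hypothetical stand-in for an edit tool: replace exactly one match."""
    if text.count(old_string) != 1:
        raise ValueError("old_string must match exactly once")
    return text.replace(old_string, new_string)


source = 'ARROW = "\\u2192"\n'   # file text holds the 6-char escape \u2192

intended = '"\\u2192"'            # what the model meant to search for
corrupted = '"\u2192"'            # after an extra decode: the character →

print(replace_once(source, intended, '"\\u2190"'))  # edit applies

try:
    replace_once(source, corrupted, '"\\u2190"')
except ValueError as exc:
    print("edit rejected:", exc)
```

With the intended argument the edit applies; with the corrupted one the search string no longer occurs in the file, which is the kind of failure that drives the retry loops.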

What changed

Added normalize_function_call_arguments() helper in _types.py that eagerly parses JSON-string arguments into dicts at the provider-parsing layer
Applied normalization in OpenAIChatClient._parse_tool_calls_from_openai() and three non-streaming parse sites in OpenAIResponsesClient
Updated _prepare_content_for_openai() in the responses client to re-serialize dict arguments back to JSON strings when sending to the API (the chat client already handled this at line 704)
Updated 2 test assertions that expected raw string arguments to expect parsed dicts
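For readers who want a feel for the shape of the change, the following is a rough sketch of parse-time normalization and the matching re-serialization step. The function names, signatures, and fallback behavior here are assumptions made for illustration; the actual `normalize_function_call_arguments()` in `_types.py` may differ.

```python
import json
from collections.abc import Mapping
from typing import Any


def normalize_arguments_sketch(arguments: Any) -> Any:
    """Parse a JSON-object string into a dict once; pass other values through."""
    if isinstance(arguments, Mapping):
        return arguments  # already parsed (e.g. by a provider SDK)
    if isinstance(arguments, str):
        try:
            parsed = json.loads(arguments)
        except json.JSONDecodeError:
            return arguments  # partial or invalid JSON: keep the raw string
        if isinstance(parsed, dict):
            return parsed
    return arguments


def serialize_arguments_sketch(arguments: Any) -> str:
    """Turn stored arguments back into the JSON string an API would expect."""
    if isinstance(arguments, str):
        return arguments
    return json.dumps(arguments)


stored = normalize_arguments_sketch(r'{"old_string": "\\u2192"}')
outgoing = serialize_arguments_sketch(stored)

print(len(stored["old_string"]))       # 6
print(json.loads(outgoing) == stored)  # True
```

The round trip at the end is the property that matters: a dict stored after one parse can be serialized for the API and decoded again without changing the value.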

Streaming deltas (response.function_call_arguments.delta) are intentionally not normalized since they contain partial JSON fragments.
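A small sketch of why deltas are left alone: individual fragments are not valid JSON, so eager parsing would fail on each one. The fragment boundaries below are made up for illustration and do not come from a real response.

```python
import json

# Hypothetical delta fragments of a single arguments payload.
deltas = [r'{"old_str', r'ing": "\\u21', r'92"}']

for fragment in deltas:
    try:
        json.loads(fragment)
    except json.JSONDecodeError:
        print("not parseable on its own:", fragment)

# Parsing is only meaningful once all fragments have been joined.
complete = json.loads("".join(deltas))
print(len(complete["old_string"]))  # 6
```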

Validation

uv run python -m pytest packages/core/tests/openai/test_openai_chat_client.py \
  packages/core/tests/openai/test_openai_responses_client.py \
  -m "not integration" -q

All 183 tests pass.

Before / After comparison

from agent_framework._types import normalize_function_call_arguments

# Model generates \\u2192 in its JSON output — the correct escaping for literal \u2192
args = '{"old_string": "\\\\u2192"}'

# BEFORE: stored as string, then double-parsed
import json
json.loads(args)["old_string"]   # → '\\u2192' (repr; the value itself has one backslash)

# AFTER: normalized once at parse time, parse_arguments() returns dict directly
normalize_function_call_arguments(args)["old_string"]  # → '\\u2192' (same parse)
# Then parse_arguments() sees a Mapping and returns it — no second json.loads

The fix makes the OpenAI path behave identically to the Anthropic path: arguments are parsed once and stored as a dict. parse_arguments() returns the dict directly without a second json.loads() call.

Related

Python: Normalize provider tool-call argument envelopes across chat backends #4740 / Python: Normalize OpenAI tool-call argument envelopes on parse #4741: same problem space (OpenAI argument envelope normalization), but focused on the string-vs-dict type inconsistency rather than on unicode escape corruption specifically.


0x7c13 · 2026-03-22T08:37:55Z

@microsoft-github-policy-service agree

Commit c29da11: Normalize OpenAI function-call arguments at parse time to prevent unicode escape corruption

markwallace-microsoft added the python label Mar 22, 2026

github-actions bot changed the title ~~Normalize OpenAI function-call arguments at parse time to prevent uni…~~ Python: Normalize OpenAI function-call arguments at parse time to prevent uni… Mar 22, 2026

0x7c13 mentioned this pull request Mar 22, 2026

Python: [Bug]: Normalize OpenAI function-call arguments at parse time to prevent unicode escape corruption #4832

Open

0x7c13 wants to merge 1 commit into microsoft:main from 0x7c13:dev/0x7c13/unicode_corruption_fix
