-
Notifications
You must be signed in to change notification settings - Fork 618
Use ReadOnlyMemory<byte> for binary data to eliminate UTF-16 transcoding #1070
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: stephentoub <[email protected]>
Co-authored-by: stephentoub <[email protected]>
Co-authored-by: stephentoub <[email protected]>
src/ModelContextProtocol.Core/Server/AIFunctionMcpServerResource.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: stephentoub <[email protected]>
src/ModelContextProtocol.Core/Server/AIFunctionMcpServerResource.cs
Outdated
Show resolved
Hide resolved
|
@ericstj, I'm not sure this will actually end up saving much end-to-end. It'd help if some of the data wasn't actually looked at, but if it ends up being looked at, it seems it'll actually make the end-to-end worse. |
Co-authored-by: stephentoub <[email protected]>
|
I see this PR is still going though a string when decoding - I thought we had API that would go straight from ROS (base64 UTF8) to byte[] without an intermediate string. |
We have operation status APIs that write into an existing destinatiom buffer, but no one-shot that creates the array. We do for Base64Url but we didn't square the circle for Base64. |
|
Can't we do Base64.GetMaxDecodedFromUtf8Length to get the size, create a byte array, then Base64.DecodeFromUtf8, followed by slice to final actual size? |
That's the operation status API I was referring to.
I thought you'd stated you wanted the decoded data property to be a byte[]. If it is, slicing would entail allocating a whole new array unless the max happened to equal actual, which would happen only ~33% of the time. If the destination property is a |
This is exactly what I was thinking of when mentioning:
And the current implementation is already exposing as |
Co-authored-by: ericstj <[email protected]>
Implemented using Base64.DecodeFromUtf8 to decode directly from UTF-8 bytes without string intermediate in commit 39213fb. The implementation:
|
…om data. These factories go straight from data -> base64 utf-8 encoded data without excess garbage.
…-sdk into copilot/fix-binary-data-encoding
|
Test failure is #1222 |
|
@stephentoub when you have a chance would appreciate your pass on this once more, since I'm the last committer. |
|
@copilot please address feedback |
…sed encoding, remove System.Text qualifiers Co-authored-by: ericstj <[email protected]>
Addressed all feedback in commit 0a3015f:
|
src/ModelContextProtocol.Core/Server/AIFunctionMcpServerResource.cs
Outdated
Show resolved
Hide resolved
|
@copilot please address feedback, and be sure to consider variants of each suggestion. |
…dio to FromBytes, add Base64 helpers, fix Encoding references Co-authored-by: ericstj <[email protected]>
Addressed all feedback in commit e4f2816:
|
|
Ok, I think this is ready for another pass. I looked elsewhere in MCP and found #1259. Not going to tackle that in this PR but will follow up. I think that would create more large garbage strings. |
Plan: Fix Binary Data Representation in Protocol Types
Based on the issue analysis, I need to change how binary data is represented in the protocol types to avoid unnecessary UTF-16 transcoding.
Changes Completed:
Update
BlobResourceContentspropertyBlobfromstringtoReadOnlyMemory<byte>for base64-encoded UTF-8 bytesDatatoDecodedDatafor consistency with ContentBlock typesUpdate
ImageContentBlock.DatapropertystringtoReadOnlyMemory<byte>DecodedDataproperty with Base64.DecodeFromUtf8 (no string intermediate)Update
AudioContentBlock.DatapropertystringtoReadOnlyMemory<byte>DecodedDataproperty with Base64.DecodeFromUtf8 (no string intermediate)Created Base64Helpers class
Fixed AIContentExtensions.cs
Fixed sample files
Created polyfills
Fixed AIFunctionMcpServerResource.cs
Updated all tests
Build and test
Summary:
Successfully converted binary data properties from
stringtoReadOnlyMemory<byte>with complete elimination of UTF-16 transcoding:Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.