RUBY-3786 Retry inside txns on overload errors #2999
Merged
comandeo-mongo merged 3 commits into mongodb:master, Mar 17, 2026
Pull request overview
Adds support for retrying reads/writes inside transactions when the server returns overload-related labels (RetryableError + SystemOverloadedError), and updates unified transaction spec tests to cover the new behavior.
Changes:
- Allow overload retry logic to run for in-transaction reads/writes (instead of immediately raising).
- Preserve `startTransaction: true` on retries of the first transactional write by reverting session state before retry.
- Track "overload-only" retry sequences to avoid upgrading `commitTransaction` write concern to `w: majority` when all failures were overload-only.
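The "overload-only" rule in the last bullet can be pictured with a small stand-alone sketch. This is a toy model, not the driver's code: `overload_error?` and `commit_retry_write_concern` are hypothetical helpers, failures are represented as plain arrays of label strings, and the upgrade is simplified to setting `w: 'majority'` only.

```ruby
# Toy model (assumption, not the driver's implementation): choose the write
# concern for a commitTransaction retry from the labels of earlier failures.
OVERLOAD_LABELS = %w[RetryableError SystemOverloadedError].freeze

# True when a failure carries both overload labels.
def overload_error?(labels)
  OVERLOAD_LABELS.all? { |label| labels.include?(label) }
end

# original: the write concern hash of the first commit attempt.
# failures: one array of error labels per failed commit attempt.
# Returns the write concern to send on the next attempt.
def commit_retry_write_concern(original, failures)
  # First attempt: nothing has failed yet, so nothing changes.
  return original if failures.empty?
  # Every failure was an overload rejection: keep the original write concern.
  return original if failures.all? { |labels| overload_error?(labels) }

  # At least one non-overload failure: upgrade to majority.
  original.merge(w: 'majority')
end
```

With `{ w: 1 }` as the original write concern, a failure history containing only overload-labelled errors leaves it untouched, while a history that also contains some other retryable failure yields `w: 'majority'`.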
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| spec/spec_tests/data/transactions_unified/backpressure-retryable-writes.yml | New unified spec coverage for retrying writes in txns on overload labels. |
| spec/spec_tests/data/transactions_unified/backpressure-retryable-reads.yml | New unified spec coverage for retrying reads in txns on overload labels. |
| spec/spec_tests/data/transactions_unified/backpressure-retryable-commit.yml | New unified spec coverage for overload retries on commitTransaction. |
| spec/spec_tests/data/transactions_unified/backpressure-retryable-abort.yml | New unified spec coverage for overload retries on abortTransaction. |
| lib/mongo/session.rb | Skip w: majority upgrade on commit retries when retries were overload-only; add revert_to_starting_transaction!. |
| lib/mongo/retryable/write_worker.rb | Plumb overload-only retry flag and revert session state to preserve startTransaction: true for first-op write retries. |
| lib/mongo/retryable/read_worker.rb | Allow overload retries in transactions for reads. |
| lib/mongo/operation/context.rb | Add overload_only_retry? flag accessor on operation context. |
Comments suppressed due to low confidence (1)
lib/mongo/retryable/read_worker.rb:217
- Overload retries for reads inside transactions now proceed past this guard, but the overload retry path does not restore `STARTING_TRANSACTION_STATE` when the failing read is the first command in a transaction. Because `Session#update_state!` runs during message build, the session becomes `TRANSACTION_IN_PROGRESS_STATE` after the first attempt, and a retry may omit `startTransaction: true`, breaking the transaction. Capture whether the session was starting before the first attempt and call `session.revert_to_starting_transaction!` before performing an overload retry (similar to `WriteWorker`). Adding a unified test where the first operation is a `find` that fails with overload labels would prevent regressions.

    def modern_read_with_retry(session, server_selector, context, &block)
      server = select_server(
        cluster,
        server_selector,
        session,
        timeout: context&.remaining_timeout_sec
      )
      result = yield server
      retry_policy.record_success(is_retry: false)
      result
    rescue *retryable_exceptions, Error::OperationFailure::Family, Auth::Unauthorized, Error::PoolError => e
      e.add_notes('modern retry', 'attempt 1')
      # Inside a transaction, only overload errors are eligible for retry.
      raise e if session.in_transaction? && !retryable_overload_error?(e)
      if retryable_overload_error?(e)
        overload_read_retry(e, session, server_selector, context, server, error_count: 1, &block)
      else
        raise e if !is_retryable_exception?(e) && !e.write_retryable?
        retry_read(e, session, server_selector, context: context, failed_server: server, &block)
      end
    end
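
The fix suggested in the comment above (remember whether the session was starting, then revert before an overload retry) might look roughly like the following. This is a self-contained toy, not the driver's code: `ToySession`, `ToyOverloadError`, and `read_with_overload_retry` are hypothetical names, and only the state transition described in the comment is modelled.

```ruby
# Toy session (assumption): building the first command of a transaction adds
# startTransaction: true and moves the session to the in-progress state.
class ToySession
  STARTING = :starting_transaction
  IN_PROGRESS = :transaction_in_progress

  attr_reader :state

  def initialize
    @state = STARTING
  end

  def starting_transaction?
    @state == STARTING
  end

  def build_command(name)
    cmd = { name => 1, autocommit: false }
    if starting_transaction?
      cmd[:startTransaction] = true
      @state = IN_PROGRESS
    end
    cmd
  end

  def revert_to_starting_transaction!
    @state = STARTING
  end
end

# Stand-in for an error carrying both overload labels.
class ToyOverloadError < StandardError; end

# Yields a freshly built command per attempt; retries on overload errors.
def read_with_overload_retry(session, max_attempts: 3)
  # Capture the state before the first attempt mutates it.
  was_starting = session.starting_transaction?
  attempts = 0
  begin
    attempts += 1
    yield session.build_command(:find)
  rescue ToyOverloadError
    raise if attempts >= max_attempts
    # Without this revert the retried command would omit startTransaction.
    session.revert_to_starting_transaction! if was_starting
    retry
  end
end
```

If the block raises `ToyOverloadError` on its first call and succeeds on the second, both commands it receives carry `startTransaction: true`, and the session ends in the in-progress state.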
Comment on lines +122 to +136

    - commandStartedEvent:
        command:
          abortTransaction:
            $$exists: false
          lsid:
            $$sessionLsid: *session0
          txnNumber:
            $numberLong: "1"
          startTransaction:
            $$exists: false
          autocommit: false
          writeConcern:
            $$exists: false
        commandName: commitTransaction
        databaseName: admin

Comment on lines +129 to +143

    - commandStartedEvent:
        command:
          abortTransaction:
            $$exists: false
          lsid:
            $$sessionLsid: *session0
          txnNumber:
            $numberLong: "1"
          startTransaction:
            $$exists: false
          autocommit: false
          writeConcern:
            $$exists: false
        commandName: commitTransaction
        databaseName: admin

jamis previously approved these changes, Mar 16, 2026
jamis approved these changes, Mar 17, 2026
Implements DRIVERS-3411 (RUBY-3786): retry reads and writes inside transactions on overload errors (`RetryableError` + `SystemOverloadedError` labels).

Key changes:
- Preserve `startTransaction: true` on retries of the first command via `revert_to_starting_transaction!`
- Skip the `w: majority` write concern upgrade on `commitTransaction` when all failures were overload-only