
[Query]: Adds ability to choose global vs local/focused statistics for FullTextScore #48431

Open
aayush3011 wants to merge 6 commits into Azure:main from aayush3011:users/akataria/fullTextImprovements


@aayush3011 (Member) commented Mar 16, 2026

Description

Why?

Cosmos DB's implementation of FullTextScore computes BM25 statistics (term frequency, inverse document frequency, and document length) across all documents in the container, including all physical and logical partitions.

While this provides a valid and comprehensive representation of statistics for the entire dataset, it introduces challenges for several common use cases:

  • Multi-tenant scenarios: Tenants often operate in very different domains, which can significantly change the distribution and importance of keywords. Using global statistics leads to distorted relevance rankings for individual tenants.
  • Large containers with many partitions: Computing statistics across hundreds or thousands of physical partitions can be time-consuming and expensive. Customers may prefer statistics derived from only a subset of partitions to improve performance and reduce RU consumption.
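To make the multi-tenant concern concrete, here is a minimal, self-contained sketch of how the inverse document frequency part of BM25 shifts when the counted corpus changes from the whole container to one tenant's slice. It assumes the widely used IDF variant ln(1 + (N - n + 0.5) / (n + 0.5)); the exact formula Cosmos DB applies is not stated in this PR, and the `Doc` type, tenant names, and sample texts are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Bm25ScopeSketch {
    // Minimal document: the tenant (partition key) it belongs to and its text.
    static final class Doc {
        final String tenant;
        final String text;
        Doc(String tenant, String text) { this.tenant = tenant; this.text = text; }
    }

    // Assumed IDF variant: ln(1 + (N - n + 0.5) / (n + 0.5)).
    static double idf(long totalDocs, long docsWithTerm) {
        return Math.log(1.0 + (totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
    }

    // IDF of a term over whichever corpus the caller chooses to count.
    static double idfOver(List<Doc> corpus, String term) {
        long docsWithTerm = 0;
        for (Doc d : corpus) {
            if (Arrays.asList(d.text.split("\\s+")).contains(term)) {
                docsWithTerm++;
            }
        }
        return idf(corpus.size(), docsWithTerm);
    }

    // Keeps only the documents of one tenant, mimicking a partition key filter.
    static List<Doc> forTenant(List<Doc> corpus, String tenant) {
        List<Doc> slice = new ArrayList<>();
        for (Doc d : corpus) {
            if (d.tenant.equals(tenant)) {
                slice.add(d);
            }
        }
        return slice;
    }

    // Hypothetical data: every "legal" document mentions "contract", no "retail" one does.
    static List<Doc> sampleCorpus() {
        return Arrays.asList(
            new Doc("legal", "contract breach claim"),
            new Doc("legal", "contract renewal terms"),
            new Doc("legal", "lease contract dispute"),
            new Doc("retail", "summer shoe sale"),
            new Doc("retail", "store opening hours"),
            new Doc("retail", "gift card refund"));
    }

    public static void main(String[] args) {
        List<Doc> all = sampleCorpus();
        System.out.println("global idf = " + idfOver(all, "contract"));
        System.out.println("local idf  = " + idfOver(forTenant(all, "legal"), "contract"));
    }
}
```

With this sample data, "contract" looks moderately rare across the whole container but appears in every document of the legal tenant, so its locally computed IDF is much lower. A global weight therefore overstates how discriminating the term is for that tenant.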

What?

This PR extends the flexibility of BM25 scoring so that developers can choose between:

  • Global (default): FullTextScore computes BM25 statistics across all documents in the container, regardless of any partition key filters. This is the existing behavior.
  • Local: When a query includes a partition key filter, BM25 statistics are computed only over the subset of documents within the specified partition key values. Scores and ranking reflect relevance within that partition-specific slice of data.

How?

A new CosmosFullTextScoreScope enum and setFullTextScoreScope() method are added to CosmosQueryRequestOptions:

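// Compute BM25 statistics only over the partitions this query targets.
CosmosQueryRequestOptions options = new CosmosQueryRequestOptions();
options.setFullTextScoreScope(CosmosFullTextScoreScope.LOCAL);
options.setPartitionKey(new PartitionKey(tenantId));

// Bind @tenantId explicitly; passing the query as a plain string would leave it unbound.
SqlQuerySpec querySpec = new SqlQuerySpec(
   "SELECT TOP 10 * FROM c WHERE c.tenantId = @tenantId ORDER BY RANK FullTextScore(c.text, 'keywords')",
   Arrays.asList(new SqlParameter("@tenantId", tenantId))
);

container.queryItems(querySpec, options, Document.class);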

When CosmosFullTextScoreScope.LOCAL is set, the hybrid search aggregator uses only the query's target partition ranges (instead of all ranges) when executing the global statistics query. This is a client-side only change — no new HTTP headers are sent to the backend.
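The range selection described above can be sketched in isolation. This is an illustration under stated assumptions, not the PR's actual code: the `Scope` enum stands in for CosmosFullTextScoreScope, partition ranges are reduced to string IDs, and `statisticsRanges` is a hypothetical helper name.

```java
import java.util.Arrays;
import java.util.List;

class StatisticsRangeSketch {
    // Hypothetical stand-in for CosmosFullTextScoreScope.
    enum Scope { GLOBAL, LOCAL }

    // Chooses which partition ranges the statistics query fans out to.
    // allRanges:    every range in the container
    // targetRanges: the ranges the query itself resolves to (e.g. via a partition key)
    static List<String> statisticsRanges(Scope scope, List<String> allRanges, List<String> targetRanges) {
        if (scope == Scope.LOCAL) {
            return targetRanges;
        }
        // GLOBAL, or an unset (null) scope, keeps the existing behavior.
        return allRanges;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("0", "1", "2");
        List<String> target = Arrays.asList("1");
        System.out.println(statisticsRanges(Scope.GLOBAL, all, target));
        System.out.println(statisticsRanges(Scope.LOCAL, all, target));
    }
}
```

Under this sketch, a LOCAL query that carries no partition key filter resolves to all ranges as its target, so it produces the same statistics as GLOBAL.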

Bug Fixes (discovered during development)

  1. NullPointerException in DocumentQueryExecutionContextFactory.tryCacheQueryPlan:
    When executing hybrid search queries with a partition key filter, getQueryInfo() returned null (hybrid search queries use hybridSearchQueryInfo instead), causing an NPE in query plan caching. Added a null guard.
  2. Race condition in HybridSearchDocumentQueryExecutionContext.getComponentQueryResults:
    The documentProducers field in ParallelDocumentQueryExecutionContextBase is shared mutable state that was reused across multiple logical operations (global statistics query, component queries). When component queries ran via flatMap (concurrent), or when the global statistics Mono was re-subscribed after component queries had reassigned the field, it caused ConcurrentModificationException and IllegalArgumentException: retries must not be negative. Fixed by introducing a createProducers() helper that wraps super.initialize(), captures the produced list into a local variable, and clears the shared field, ensuring each logical operation (global stats, each component query) gets its own isolated producer list. Additionally changed flatMap to concatMap for component queries to serialize initialization of the parent class's shared metrics state.
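The isolation pattern behind the second fix can be illustrated without the SDK. The sketch below is a simplified model, not the PR's code: `BaseContext` stands in for ParallelDocumentQueryExecutionContextBase, producers are plain strings, and the `initialize` signature is invented for the example. It shows why capturing the list and then resetting the shared field gives each logical operation its own producers.

```java
import java.util.ArrayList;
import java.util.List;

class ProducerIsolationSketch {
    // Stand-in for the parent context: initialize() appends to a shared mutable field.
    static class BaseContext {
        protected List<String> producers = new ArrayList<>();

        protected void initialize(String operation, int rangeCount) {
            for (int i = 0; i < rangeCount; i++) {
                producers.add(operation + "-producer-" + i);
            }
        }
    }

    static class HybridContext extends BaseContext {
        // Runs the parent's initialization, then hands the result to the caller
        // and resets the shared field so the next operation starts empty.
        List<String> createProducers(String operation, int rangeCount) {
            super.initialize(operation, rangeCount);
            List<String> own = this.producers;
            // Swap in a fresh list instead of calling clear(), which would
            // also empty the list just captured.
            this.producers = new ArrayList<>();
            return own;
        }
    }

    public static void main(String[] args) {
        HybridContext ctx = new HybridContext();
        List<String> stats = ctx.createProducers("stats", 3);
        List<String> component = ctx.createProducers("component", 2);
        System.out.println(stats);
        System.out.println(component);
    }
}
```

This sketch covers only the list isolation. The flatMap to concatMap change addresses a separate concern, the serialized initialization of shared metrics state, and is not modeled here.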

Testing

  • Tests validate:
    - GLOBAL scope (default) cross-partition returns all matching results
    - Explicit GLOBAL matches default behavior
    - LOCAL scope + pk="2" returns only pk="2" results
    - LOCAL scope + pk="1" returns only pk="1" results
    - RRF queries work with both LOCAL and GLOBAL scopes

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@aayush3011 marked this pull request as ready for review March 16, 2026 21:10
@aayush3011 requested review from a team and kirankumarkolli as code owners March 16, 2026 21:10
Copilot AI review requested due to automatic review settings March 16, 2026 21:10
@aayush3011 (Member, Author) commented

/azp run java - cosmos - tests

@azure-pipelines commented

Azure Pipelines successfully started running 1 pipeline(s).

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds a CosmosFullTextScoreScope option to CosmosQueryRequestOptions that lets developers choose between GLOBAL (default, all partitions) and LOCAL (scoped to target partitions) BM25 statistics computation for hybrid search queries. It also fixes two bugs: an NPE in query plan caching for hybrid queries and a ConcurrentModificationException race condition in component query execution.

Changes:

  • New CosmosFullTextScoreScope enum with GLOBAL/LOCAL values, wired through CosmosQueryRequestOptions
  • Bug fix: null guard for queryInfo in tryCacheQueryPlan, and synchronized block in getComponentQueryResults to prevent concurrent modification
  • Tests updated with new partition key structure (/pk) and new test methods for LOCAL/GLOBAL scope validation

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

File | Description
CosmosFullTextScoreScope.java | New enum defining GLOBAL and LOCAL scopes
CosmosQueryRequestOptions.java | Public getter/setter for fullTextScoreScope
CosmosQueryRequestOptionsImpl.java | Implementation field, copy constructor, getter/setter
HybridSearchDocumentQueryExecutionContext.java | Uses scope to select statistics target ranges; synchronized fix for race condition
DocumentQueryExecutionContextFactory.java | Null guard for queryInfo in tryCacheQueryPlan
CHANGELOG.md | Documents new feature and bug fixes
HybridSearchQueryTest.java | Updated partition key, new tests for LOCAL/GLOBAL scope, updated expected results

@aayush3011 (Member, Author) commented

/azp run java - cosmos - tests

@azure-pipelines commented

Azure Pipelines successfully started running 1 pipeline(s).

@xinlian12 (Member) left a comment

LGTM, thanks

@xinlian12 (Member) commented

Deep Review Summary

PR Intent: Adds CosmosFullTextScoreScope enum (GLOBAL/LOCAL) for controlling BM25 statistics scope in hybrid search queries. Also fixes NPE in query plan caching and race condition (flatMap to concatMap) in hybrid search execution context.

Overall Assessment: The feature is well-designed and matches the .NET SDK's equivalent implementation. The bug fixes are correct. Main concerns are around the null-default inconsistency between public/Impl API layers, missing validation for the LOCAL-without-partition-key edge case, and test coverage gaps.

Existing Comments: Only a Copilot summary review (0 inline comments). No overlap with findings below.

Severity | Count
🟡 Recommendation | 6
🟢 Suggestion | 3
💬 Observation | 2

Top findings:

  1. Inconsistent null-default between public (GLOBAL) and Impl (null) getters creates a latent correctness trap
  2. LOCAL scope without partition key silently degenerates to GLOBAL with no warning
  3. @Ignore annotation commented out instead of properly removed
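As an illustration of the first two findings, the sketch below shows one possible shape for addressing them. It is a hypothetical example, not code from this PR: `OptionsImpl`, `Scope`, and `isLocalWithoutPartitionKey` are invented names, and whether the SDK should warn, throw, or stay silent in the second case is a design decision the review leaves open.

```java
class ScopeDefaultSketch {
    // Hypothetical stand-in for CosmosFullTextScoreScope.
    enum Scope { GLOBAL, LOCAL }

    // Stand-in for an options implementation class whose field starts out null.
    static class OptionsImpl {
        private Scope scope;

        // Normalizes an unset scope to GLOBAL so every layer sees the same default.
        Scope getScope() {
            return scope == null ? Scope.GLOBAL : scope;
        }

        void setScope(Scope scope) {
            this.scope = scope;
        }
    }

    // Detects the case where LOCAL cannot narrow anything because no
    // partition key was supplied; a caller could log a warning here.
    static boolean isLocalWithoutPartitionKey(Scope scope, Object partitionKey) {
        return scope == Scope.LOCAL && partitionKey == null;
    }

    public static void main(String[] args) {
        OptionsImpl options = new OptionsImpl();
        System.out.println(options.getScope());
        options.setScope(Scope.LOCAL);
        System.out.println(isLocalWithoutPartitionKey(options.getScope(), null));
    }
}
```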

⚠️ AI-generated review — may be incorrect. Agree? → resolve the conversation. Disagree? → reply with your reasoning.
