Add rechunk_no_shuffle utility #1068

Merged
brendancol merged 4 commits into master from issue-1067
Mar 24, 2026
@brendancol
Contributor

Summary

  • Adds rechunk_no_shuffle(agg, target_mb=128) to xrspatial.utils. It computes an integer multiplier per dimension so that new chunks are exact multiples of the source chunks, which lets dask merge whole blocks in place instead of splitting and shuffling partial blocks
  • Available on the .xrs DataArray accessor and as a top-level import (from xrspatial import rechunk_no_shuffle)
  • Non-dask arrays pass through unchanged
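
For readers who want to see the exact-multiple idea in isolation, here is a minimal sketch of how such a chunk size could be derived. It is not the merged implementation: the helper name `no_shuffle_chunks`, its parameters, and the choice of one shared factor for all dimensions are assumptions made for illustration, and it assumes uniform source chunks.

```python
import math


def no_shuffle_chunks(shape, base_chunks, itemsize, target_mb=128):
    """Hypothetical helper: scale each source chunk length by an integer factor.

    Assumes uniform source chunks; `base_chunks` holds one chunk length per dimension.
    """
    if target_mb <= 0:
        raise ValueError("target_mb must be positive")

    target_bytes = int(target_mb * 1024 * 1024)
    current_bytes = math.prod(base_chunks) * itemsize
    if current_bytes >= target_bytes:
        # Source chunks already meet the target; leave them alone.
        return tuple(base_chunks)

    ndim = len(base_chunks)
    # Start from a float estimate, then correct with exact integer arithmetic
    # so that factor is the largest integer with factor**ndim * current <= target.
    factor = max(1, round((target_bytes / current_bytes) ** (1.0 / ndim)))
    while factor > 1 and factor ** ndim * current_bytes > target_bytes:
        factor -= 1
    while (factor + 1) ** ndim * current_bytes <= target_bytes:
        factor += 1

    new_chunks = []
    for dim_len, base in zip(shape, base_chunks):
        # Never merge more source blocks than exist along this dimension.
        blocks_along_dim = math.ceil(dim_len / base)
        new_chunks.append(base * min(factor, blocks_along_dim))
    return tuple(new_chunks)


# 256x256 float64 source chunks (0.5 MiB each) grow to 4096x4096 (128 MiB).
print(no_shuffle_chunks((8192, 8192), (256, 256), itemsize=8))  # (4096, 4096)
```

Because every new chunk length is `base * k` for an integer `k`, each new block is a union of whole source blocks. With a dask-backed array, the base lengths could be read as `tuple(c[0] for c in arr.chunks)` and the result passed to `arr.rechunk(...)`.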

Test plan

  • 10 dedicated tests covering exact-multiple guarantee, growth, 3D input, value preservation, coord/attr preservation, numpy passthrough, input validation, and accessor integration
  • Accessor method list test updated and passing
  • All 35 tests pass

Closes #1067

Computes integer multiplier per dimension so new chunks are exact
multiples of source chunks, avoiding the shuffle dask triggers when
it has to split and recombine partial blocks.

Available as xrspatial.rechunk_no_shuffle() and on the .xrs accessor.
@github-actions bot added the performance label (PR touches performance-sensitive code) on Mar 24, 2026
@brendancol brendancol merged commit d8b4a0d into master Mar 24, 2026
10 of 11 checks passed