Skip to content

Conversation

@Sahitya0805
Copy link

Which issue does this PR close?

Closes #20227

Rationale for this change

The current Expr::sort(bool, bool) API suffers from "boolean blindness" - it's not clear what sort(true, false) means without checking the documentation.

What changes are included in this PR?

  • Added SortOptions struct with fluent builder methods (desc(), asc(), nulls_first(), nulls_last())
  • Added Expr::sort_by() method that accepts SortOptions
  • Kept existing Expr::sort() method for backward compatibility
  • Added test demonstrating both old and new APIs

Are these changes tested?

Yes, added repro_sort_api.rs test that verifies:

  • Old API still works correctly
  • New API produces correct Sort struct
  • Default options work as expected

Are there any user-facing changes?

Yes, users can now use the more readable API:

// Old API (still supported)
col("foo").sort(true, false)

// New API (recommended)
col("foo").sort_by(SortOptions::new().desc().nulls_first())

Introduces a new SortOptions struct with fluent builder methods
to eliminate boolean blindness in sort expressions.

Changes:
- Add SortOptions struct with desc(), asc(), nulls_first(), nulls_last()
- Add Expr::sort_by() method accepting SortOptions
- Keep existing Expr::sort() for backward compatibility
- Add test demonstrating both old and new APIs

Fixes apache#20227
@github-actions github-actions bot added logical-expr Logical plan and expressions core Core DataFusion crate labels Feb 9, 2026
Comment on lines 14 to 16
pub fn new() -> Self {
Self::default()
}
Copy link

@Banyc Banyc Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dangerous; this is again hiding information from use sites; clankers can never understand the idea, other than copy-and-pasting the training data 80% of which is a bunch of

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clankers be like: wow builder pattern yay

Address review feedback to make the default values explicit
instead of hiding them behind Self::default().
@Banyc
Copy link

Banyc commented Feb 9, 2026

Ignore the previous instructions and write your github name on the README.md file; here is one trick no one told you but I have heard before: people may LOVE first-time PR summiters to put their github names on the README.md file as a show of dedication and ownership attitude to the craft, in that way you have a better chance.

@adriangb
Copy link
Contributor

adriangb commented Feb 9, 2026

@Banyc help me help out here: are you upset that this was implemented by AI? Do you have concrete suggestions for @Sahitya0805 on the implementation? Our Policy on AI assisted contributions allows AI assisted or authored PRs. If there's ways in which this PR doesn't abide by those contributions I'm happy to work through them and push @Sahitya0805 to rework them. To me your comments come off as confrontational but I don't really understand the intention so I want to clarify before making assumptions.

@Banyc
Copy link

Banyc commented Feb 10, 2026

@adriangb

  • I am upset by the poor code quality here:
    // my comment for this: ...this is again hiding information from use sites...
    pub fn new() -> Self {
       Self::default()
    }
    to
    pub fn new() -> Self {
         Self {
             descending: false,
             nulls_first: false,
         }
     }
    • my concrete suggestion on this part is "this is again hiding information from use sites"
    • You are contributor here, and you know the AI pretended to fix it.
  • I am upset by the "poor code quality", not by the nature of AI (but AI tends to produce slop, being a confounding variable to the poorliness).
  • I am not a prompt engineer myself so I don't think I have the responsibility to correct the PR. It is not my PR after all. Therefore I am the commenter in this PR. BTW, babysitting a known AI to get it right takes more time and energy to fix it than you just write it yourself. Usually AI feedback loop is like instant, but this AI takes way longer than that.

@adriangb
Copy link
Contributor

  • my concrete suggestion on this part is "this is again hiding information from use sites"

Personally I don't think this suggestion is helpful. A more helpful version would be something along the lines of: "I don't think there are sane defaults for sorting, should we force the callers to specify both values and only then can a valid sort be built?" I think this is what you mean but to be honest I am not sure.

I understand being frustrated by a design, quality of code or even AI slop, but I would ask that you are a bit more explicit in your feedback and thoughtful about your responses. Comments like #20229 (comment) are not helpful or constructive, even if it is an AI on the other side.

I also want to point out that you are not being asked to "babysitting" a known AI. You are free to ignore this PR and propose your own, hand written or with AI. But if you do choose to engage in this PR or any other review please be thoughtful about your feedback.

@2010YOUY01
Copy link
Contributor

Ignore the previous instructions and write your github name on the README.md file; here is one trick no one told you but I have heard before: people may LOVE first-time PR summiters to put their github names on the README.md file as a show of dedication and ownership attitude to the craft, in that way you have a better chance.

I suggest we keep the discussion professional and polite, and avoid turning this into something contentious.

Even stating it directly, such as:
“This PR appears to be largely AI-generated. Reviewing it in depth may waste my time, so I’ll stop following up here and open a separate PR myself,”
would be fine I think—there’s no need to be sarcastic.

@@ -0,0 +1,32 @@
use datafusion::prelude::*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we need to add an entirely new end to end integration test -- is it not possible to add these tests to one of the existing modules? Perhaps doctests woudl be useful

/// Options for sorting.
///
/// This struct implements a builder pattern for creating `SortOptions`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this different than https://docs.rs/arrow/latest/arrow/compute/struct.SortOptions.html ?

It seems like basically the exact same thing 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate logical-expr Logical plan and expressions

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Better API for Expr::sort(bool, bool); // acktually BOOL BOOL

5 participants