Skip to content

Conversation

@neilconway
Copy link
Contributor

When to_timestamp() was invoked with a scalar Float64 and an array of strings, the previous coding neglected to broadcast the scalar to the array properly when producing the return value. That is, a query like SELECT to_timestamp(123.5, t.x) FROM t would result in:

Internal error: UDF to_timestamp returned a different number of rows
than expected

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Yes, added SLT.

Are there any user-facing changes?

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Feb 9, 2026
When to_timestamp() was invoked with a scalar Float64 and an array of
strings, the previous coding neglected to broadcast the scalar to the
array properly when producing the return value.  That is, a query like
`SELECT to_timestamp(123.5, t.x) FROM t` would result in:

    Internal error: UDF to_timestamp returned a different number of rows
    than expected
@neilconway neilconway force-pushed the neilc/udf-scalar-broadcast branch from d6b728a to 143a333 Compare February 9, 2026 00:55
(2, 'bar');

query P
SELECT to_timestamp(123.5, name) FROM test_to_timestamp_scalar ORDER BY id;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How foo and bar are interpreted as formats here ?!
According to https://github.com/apache/datafusion/pull/20224/changes#diff-853711994ef45ba29e048e6db0c6ad4e015f8f30299d1f39f6c7bca8b4f4fd36R282-R283:

If none of the formats successfully parse the expression an error will be returned

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current implementation, the second and subsequent arguments are completely ignored, unless the first argument is a string. Passing format arguments when the first argument isn't a strict doesn't actually make much sense, and in fact Postgres only implements to variants:

to_timestamp(double) -> timestamptz
to_timestamp(text, text) -> timestamptz

On that basis, I can revise this PR to reject attempts to pass two or more arguments when the first argument is not a string. Does that sound reasonable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining!

I can revise this PR to reject attempts to pass two or more arguments when the first argument is not a string

Maybe is a follow-up. No need to do mix two improvements in the same PR.

@alamb alamb added this pull request to the merge queue Feb 12, 2026
@alamb
Copy link
Contributor

alamb commented Feb 12, 2026

Thanks @Jefffrey @neilconway and @martin-g

Merged via the queue into apache:main with commit 73bce15 Feb 12, 2026
28 checks passed
@neilconway neilconway deleted the neilc/udf-scalar-broadcast branch February 12, 2026 21:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

to_timestamp() fails when passed mix of scalar and array args

4 participants