@AdamGS AdamGS commented Sep 24, 2025

Which issue does this PR close?

Rationale for this change

Fills out support for more of the new smaller decimal types.

What changes are included in this PR?

  1. Fills out coercion between various decimal types for binary operations (comparison and arithmetic)
  2. Support for the abs function
  3. Support for sort-merge join (including Decimal256, which I'm not sure was omitted before by accident or explicitly unsupported).
  4. Support for decimal32/64 in the slt engine

Are these changes tested?

Yes - new dedicated Rust tests and additional dedicated SLT tests, on top of the existing coverage.

Are there any user-facing changes?

Yes - there are a few major user-facing changes:

  1. Tables or values cast to decimal types with a small enough precision will now use a smaller decimal type.
  2. In some cases, a failure to coerce a type might result in a significant behavior change. For example, the Spark slt tests include the following test:
SELECT MOD(2::int, 1.8::decimal(2,1));

Before this PR, that statement returned 0.2 with a datatype of Decimal128(2, 1). With this PR, the exact same statement returns 0 with a datatype of Int32, because an Int32 actually can't be coerced to a Decimal32(2, 1).
Should it fall back to floating point types? Some wide-enough predetermined decimal variant? I haven't fully debugged this path yet, so the answer potentially hides somewhere there.
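To illustrate the first change, the narrowest decimal variant for a given precision follows from the arrow-rs maximum precisions (9 digits for Decimal32, 18 for Decimal64, 38 for Decimal128, 76 for Decimal256). The enum and function below are a hypothetical sketch of that selection rule, not DataFusion's actual code:

```rust
// Illustrative sketch only: pick the narrowest decimal variant that can
// hold a given precision. Max precisions follow arrow-rs (9/18/38/76);
// the enum and function names here are hypothetical, not DataFusion's API.
#[derive(Debug, PartialEq)]
enum DecimalType {
    Decimal32,
    Decimal64,
    Decimal128,
    Decimal256,
}

fn smallest_decimal_for(precision: u8) -> Option<DecimalType> {
    match precision {
        1..=9 => Some(DecimalType::Decimal32),
        10..=18 => Some(DecimalType::Decimal64),
        19..=38 => Some(DecimalType::Decimal128),
        39..=76 => Some(DecimalType::Decimal256),
        _ => None, // 0 or more than 76 digits is not a valid precision
    }
}

fn main() {
    // decimal(2,1), as in the MOD example above, now fits in Decimal32,
    // which is exactly why the Int32 operand can no longer be coerced.
    assert_eq!(smallest_decimal_for(2), Some(DecimalType::Decimal32));
    assert_eq!(smallest_decimal_for(20), Some(DecimalType::Decimal128));
}
```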

Either way, this second breaking change is the biggest open issue I have in this PR; once there's consensus on the desired behavior here, fixing the code and tests appropriately is the easy part.

@github-actions github-actions bot added the sql, logical-expr, sqllogictest, common, functions, physical-plan and spark labels Sep 24, 2025
AdamGS commented Sep 25, 2025

After sleeping on it, my current thinking is:

  1. Most SQL-oriented systems have one decimal type that either hides the underlying size in some way or is just fixed-size. I know there's work/thinking going into a more logical type system in DataFusion, but for now, introducing new types is a major change.
  2. Therefore, Decimal32/64 parsing from SQL should be behind some config option (I'm not 100% sure whether there's a precedent for that or how well configuration is wired into the parser, but that's a change I can imagine being useful for other things in the future).
  3. The coercion logic should stay as it is here: trying to fit integers/floats into decimals when applicable, or picking the wider decimal when operating on two decimal types.
  4. There are other changes I ran into in this PR that could be split into a separate PR as a natural followup to Support Decimal32/64 types #17501 - spark::width_bucket, abs, SMJ support, and creating zero scalars.
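The widening rule in point 3 can be sketched as follows: when two decimals meet in a binary operation, the common type keeps the larger scale and enough integer digits for either side. This is an illustrative sketch under that assumption, with hypothetical names, not DataFusion's actual coercion code:

```rust
// Illustrative sketch of decimal widening for a binary operation:
// given two (precision, scale) pairs, pick a common type that can
// represent both operands. Assumes p >= s for both inputs, as is
// required of any valid decimal type.
fn common_decimal(p1: u8, s1: u8, p2: u8, s2: u8) -> (u8, u8) {
    // Keep the larger scale (fractional digits)...
    let s = s1.max(s2);
    // ...and enough integer digits for the larger integer part.
    let int_digits = (p1 - s1).max(p2 - s2);
    (int_digits + s, s)
}

fn main() {
    // decimal(2,1) combined with decimal(10,2) needs 8 integer digits
    // and scale 2, i.e. decimal(10,2).
    assert_eq!(common_decimal(2, 1, 10, 2), (10, 2));
    assert_eq!(common_decimal(5, 2, 5, 4), (7, 4));
}
```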

@AdamGS AdamGS marked this pull request as draft September 25, 2025 23:20
AdamGS commented Sep 25, 2025

Changed this PR back to a draft; I'm going to split some of the easier stuff out to make it easier to review.

Development

Successfully merging this pull request may close these issues.

Support smaller decimal types through SQL interface