Closed
Description
Describe the bug
When a literal interval value is created from a pyarrow scalar, the month, day, and nanosecond fields are not assigned correctly in the resulting literal. The minimal example below reproduces the problem. This appears to be limited to datafusion-python and does not affect the Rust implementation.
To Reproduce
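import pyarrow as pa
from datafusion import SessionContext, lit

# Assumed setup: the original report does not show how df was created.
# Any DataFrame with at least one row is enough to display the literal.
ctx = SessionContext()
df = ctx.from_pydict({"a": [1]})

print("Setting 1 month interval:")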
pa_interval = pa.scalar((1, 0, 0), type=pa.month_day_nano_interval())
print("pa_interval:", pa_interval)
lit_interval = lit(pa_interval)
print("lit_interval:", lit_interval)
df.select(lit_interval).limit(1).show()
print("Setting 1 day interval:")
pa_interval = pa.scalar((0, 1, 0), type=pa.month_day_nano_interval())
print("pa_interval:", pa_interval)
lit_interval = lit(pa_interval)
print("lit_interval:", lit_interval)
df.select(lit_interval).limit(1).show()
print("Setting 1 nanosecond interval:")
pa_interval = pa.scalar((0, 0, 1), type=pa.month_day_nano_interval())
print("pa_interval:", pa_interval)
lit_interval = lit(pa_interval)
print("lit_interval:", lit_interval)
df.select(lit_interval).limit(1).show()
Produces the following result:
Setting 1 month interval:
pa_interval: MonthDayNano(months=1, days=0, nanoseconds=0)
lit_interval: Expr(IntervalMonthDayNano("1"))
DataFrame()
+-------------------------------------------------------+
| IntervalMonthDayNano("1") |
+-------------------------------------------------------+
| 0 years 0 mons 0 days 0 hours 0 mins 0.000000001 secs |
+-------------------------------------------------------+
Setting 1 day interval:
pa_interval: MonthDayNano(months=0, days=1, nanoseconds=0)
lit_interval: Expr(IntervalMonthDayNano("4294967296"))
DataFrame()
+-------------------------------------------------------+
| IntervalMonthDayNano("4294967296") |
+-------------------------------------------------------+
| 0 years 0 mons 0 days 0 hours 0 mins 4.294967296 secs |
+-------------------------------------------------------+
Setting 1 nanosecond interval:
pa_interval: MonthDayNano(months=0, days=0, nanoseconds=1)
lit_interval: Expr(IntervalMonthDayNano("18446744073709551616"))
DataFrame()
+-------------------------------------------------------+
| IntervalMonthDayNano("18446744073709551616") |
+-------------------------------------------------------+
| 0 years 0 mons 1 days 0 hours 0 mins 0.000000000 secs |
+-------------------------------------------------------+
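The three literal values printed above (1, 4294967296, and 18446744073709551616) are exactly 2^0, 2^32, and 2^64. One reading consistent with that output, offered as an assumption rather than a confirmed diagnosis, is that the fields are packed into a 128-bit integer with months in the lowest bits, and then decoded with nanoseconds in the lowest bits. The sketch below uses hypothetical helpers (`pack_low_first`, `unpack_high_first`) that are not part of pyarrow or datafusion; both bit layouts are assumptions chosen to match the observed numbers.

```python
# Hypothetical helpers for illustration only; not pyarrow or datafusion APIs.
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def pack_low_first(months, days, nanos):
    """Assumed layout A: months in bits 0-31, days in bits 32-63,
    nanoseconds in bits 64-127."""
    return (months & MASK32) | ((days & MASK32) << 32) | ((nanos & MASK64) << 64)


def unpack_high_first(value):
    """Assumed layout B: months in bits 96-127, days in bits 64-95,
    nanoseconds in bits 0-63."""
    months = to_signed((value >> 96) & MASK32, 32)
    days = to_signed((value >> 64) & MASK32, 32)
    nanos = to_signed(value & MASK64, 64)
    return months, days, nanos


# Pack each (months, days, nanoseconds) tuple from the report with layout A,
# then decode the resulting integer with layout B.
for fields in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]:
    raw = pack_low_first(*fields)
    print(fields, "->", raw, "->", unpack_high_first(raw))
```

Under these assumptions, 1 month decodes as 1 nanosecond, 1 day as 4294967296 nanoseconds (4.294967296 seconds), and 1 nanosecond as 1 day, which matches the three tables in the output.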
Expected behavior
An interval of 1 month created in pyarrow should appear as 1 month in the datafusion DataFrame, and likewise 1 day should appear as 1 day and 1 nanosecond as 1 nanosecond.