-
Notifications
You must be signed in to change notification settings - Fork 1k
[Variant]: Implement DataType::Union
support for cast_to_variant
kernel
#8196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
} | ||
|
||
/// Convert dictionary encoded arrays | ||
fn convert_dictionary_encoded( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change just moves the code from the Dictionary
branch into this helper function. Since all the impls use slightly different coding styles, we could do a follow-up PR to make them consistent once all the variant cast implementations are complete.
f34bcd2
to
6dda6a2
Compare
|
||
// Convert each child array to variant arrays | ||
let mut child_variant_arrays = HashMap::new(); | ||
for (type_id, _) in fields.iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to merge the two passes into one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what you are suggesting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean using one loop instead of two? That way we will compute the child array of each type_id on demand.
But IIUC, we will use all the child arrays anyway if it's a valid union, and I think precomute all the child arrays may benefit the lookup in the second loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, we'll use the same child_variant_array many times. The current would have a better performance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @liamzwbao -- this looks great. The only thing I think it is missing is a test for nulls in the UnionArray.
Thank you @klion26 for the review
let value = child_variant_array.value(value_offset); | ||
builder.append_variant(value); | ||
} else { | ||
// This should not happen in a valid union, but handle gracefully |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
// Convert each child array to variant arrays | ||
let mut child_variant_arrays = HashMap::new(); | ||
for (type_id, _) in fields.iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what you are suggesting
} | ||
|
||
#[test] | ||
fn test_cast_to_variant_union_sparse() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we please add a test for a UnionArray where the child element is null? So that the output VariantArray has a null as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @liamzwbao
I also merged up from main to resolve a merge conflict with this PR
🚀 |
Which issue does this PR close?
DataType::Union
support forcast_to_variant
kernel #8195.Rationale for this change
What changes are included in this PR?
Implement
DataType::Union
forcast_to_variant
Are these changes tested?
Yes
Are there any user-facing changes?
New cast type supported