Replace parquet metadata thrift version with in memory version. #1004
Comments
@liurenjie1024 I think the current problem with this is that
I think we should always return the in-memory representation rather than the thrift one. Is there any case where returning the thrift one is more useful than the in-memory one?
`FileMetadata` in parquet writer with in memory representation.
Probably not, so should we change the AsyncFileWriter to return the in-memory representation?
Yes, but it seems there is no built-in approach to do that. We may need to ask the arrow community for help.
Yes, I'll look to submit an issue.
Hi, @jonathanc-n I found this method in
Thanks for that! I'll look into it later today.
## Which issue does this PR close?

- Closes #1033 and #1004.

## What changes are included in this PR?

Add conversion from `FileMetadata` to parquet metadata using thrift `decode_metadata`.

## Are these changes tested?
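For readers unfamiliar with the approach, the conversion described in this PR amounts to a round trip: serialize the thrift-style struct to bytes, then decode those bytes into the in-memory representation. The sketch below is a toy model of that idea using only the Rust standard library. Every type and function in it (`ThriftFileMetaData`, `InMemoryMetadata`, `encode`, `decode`, `thrift_to_in_memory`) is a hypothetical stand-in, the byte layout is a made-up fixed-width format rather than the thrift compact protocol, and none of it is the parquet crate's actual API.

```rust
// Toy model only: hypothetical stand-ins, NOT the parquet crate's types or API.

/// Stand-in for a thrift-generated struct: flat fields that mirror the wire format.
#[derive(Debug, Clone, PartialEq)]
struct ThriftFileMetaData {
    version: i32,
    num_rows: i64,
    row_group_row_counts: Vec<i64>,
}

/// Stand-in for the in-memory representation: same data, plus convenience helpers.
#[derive(Debug, Clone, PartialEq)]
struct InMemoryMetadata {
    version: i32,
    num_rows: i64,
    row_group_row_counts: Vec<i64>,
}

impl InMemoryMetadata {
    /// Derived helper that the flat wire struct does not offer.
    fn num_row_groups(&self) -> usize {
        self.row_group_row_counts.len()
    }
}

/// Serialize into a made-up little-endian layout (assumption: real thrift
/// metadata would use the compact protocol instead).
fn encode(meta: &ThriftFileMetaData) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&meta.version.to_le_bytes());
    buf.extend_from_slice(&meta.num_rows.to_le_bytes());
    buf.extend_from_slice(&(meta.row_group_row_counts.len() as u64).to_le_bytes());
    for count in &meta.row_group_row_counts {
        buf.extend_from_slice(&count.to_le_bytes());
    }
    buf
}

/// Read exactly N bytes starting at `pos`, advancing `pos` on success.
fn take<const N: usize>(buf: &[u8], pos: &mut usize) -> Result<[u8; N], String> {
    let end = pos
        .checked_add(N)
        .ok_or_else(|| "offset overflow".to_string())?;
    let slice = buf
        .get(*pos..end)
        .ok_or_else(|| "buffer too short".to_string())?;
    let mut out = [0u8; N];
    out.copy_from_slice(slice);
    *pos = end;
    Ok(out)
}

/// Decode the byte layout written by `encode` into the in-memory struct.
fn decode(buf: &[u8]) -> Result<InMemoryMetadata, String> {
    let mut pos = 0;
    let version = i32::from_le_bytes(take::<4>(buf, &mut pos)?);
    let num_rows = i64::from_le_bytes(take::<8>(buf, &mut pos)?);
    let len = u64::from_le_bytes(take::<8>(buf, &mut pos)?);
    let mut row_group_row_counts = Vec::new();
    for _ in 0..len {
        row_group_row_counts.push(i64::from_le_bytes(take::<8>(buf, &mut pos)?));
    }
    Ok(InMemoryMetadata { version, num_rows, row_group_row_counts })
}

/// The round trip: wire struct -> bytes -> in-memory struct.
fn thrift_to_in_memory(meta: &ThriftFileMetaData) -> Result<InMemoryMetadata, String> {
    decode(&encode(meta))
}
```

Calling `thrift_to_in_memory` on a populated `ThriftFileMetaData` yields an `InMemoryMetadata` with the same field values, after which helpers such as `num_row_groups` become available. The round trip costs an extra serialization pass, which is the trade-off of reusing an existing decoder instead of writing a field-by-field conversion.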
I think this issue is already complete.
In the parquet crate, there are two kinds of data structures for metadata: an in-memory version and a version auto-generated from parquet's thrift definition. For example, there are two versions of `FileMetadata`: the in-memory one and the thrift definition. We should use the in-memory one as it provides more features, while the thrift version is only used for ser/de in parquet.
There are several places in our crate that use the thrift version: