Add support for Categoricals as a data type in the writer #450

Closed
mkleinbort-ic opened this issue Feb 20, 2024 · 0 comments · Fixed by #693

Feature Request / Improvement

This code does not work:

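import polars as pl
from pyiceberg.catalog import load_catalog

# Assumption: a catalog named "default" is configured for PyIceberg
catalog = load_catalog("default")

# A Polars Categorical column becomes an Arrow dictionary type
df = pl.DataFrame({'x': ['Hi']}, schema={'x': pl.Categorical}).to_arrow()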

# pyarrow.Table
# x: dictionary<values=large_string, indices=uint32, ordered=0>
# ----
# x: [  -- dictionary:
# ["Hi"]  -- indices:
# [0]]

table = catalog.create_table(
    "default.test_table_01",
    schema=df.schema,
)

It fails with the following error:

TypeError: Unsupported type: dictionary<values=large_string, indices=uint32, ordered=0>

It'd be good to support Categorical data types.

@Fokko Fokko added this to the PyIceberg 0.7.0 release milestone Feb 22, 2024
@sungwy sungwy self-assigned this Apr 30, 2024
@Fokko Fokko closed this as completed in #693 May 7, 2024