No support for precision reduction when reducing dataset size for pandas dataframe or series.

We currently have two methods for dataset size reduction, `precision` and `subsample`, introduced more clearly in PR #1250. However, we have not implemented `precision` reduction for pandas dataframes, as this is a bit more involved: an `ndarray` has a single uniform dtype, while a dataframe has a dtype per column.
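To make the per-column point concrete, here is a rough sketch of what `precision` reduction could look like for a dataframe. The function name `reduce_float_precision` and the choice to only downcast `float64` columns to `float32` are assumptions for illustration, not what the existing `ndarray` code path does.

```python
import numpy as np
import pandas as pd


def reduce_float_precision(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `df` with every float64 column cast to float32.

    Hypothetical helper: non-float columns (ints, strings, categoricals)
    are left untouched because each column carries its own dtype.
    """
    # Pick out only the columns whose dtype is float64.
    float_cols = df.select_dtypes(include=["float64"]).columns
    if len(float_cols) == 0:
        return df.copy()
    # astype accepts a {column: dtype} mapping, so other columns keep their dtype.
    return df.astype({col: np.float32 for col in float_cols})


df = pd.DataFrame({"a": [1.0, 2.0], "b": [1, 2], "c": ["x", "y"]})
reduced = reduce_float_precision(df)
print(reduced.dtypes)
```

This assumes column names are unique; a real implementation would probably also need to decide what to do with nullable and extension float dtypes.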

We also cannot use `reduce_dataset_size_if_too_large` with dataframes yet, as we have not implemented a method to calculate a dataframe's size, which we need in order to know how much to `subsample`.
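One possible way to get the size is pandas' own `memory_usage`. The sketch below is only an assumption about how this could be wired up: `pandas_nbytes` and `subsample_fraction` are hypothetical names, and the memory limit passed in is an arbitrary example value.

```python
import pandas as pd


def pandas_nbytes(data) -> int:
    """Approximate memory footprint in bytes of a DataFrame or Series.

    deep=True also counts the Python objects held in object columns,
    which is slower but closer to the real footprint.
    """
    usage = data.memory_usage(index=True, deep=True)
    # DataFrame.memory_usage returns a per-column Series, Series.memory_usage an int.
    if isinstance(data, pd.DataFrame):
        return int(usage.sum())
    return int(usage)


def subsample_fraction(nbytes: int, limit_bytes: int) -> float:
    """Hypothetical helper: fraction of rows to keep to fit under the limit."""
    if nbytes <= limit_bytes:
        return 1.0
    return limit_bytes / nbytes


df = pd.DataFrame({"x": [float(i) for i in range(1000)]})
size = pandas_nbytes(df)
print(size, subsample_fraction(size, limit_bytes=4000))
```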

This shouldn't be too hard to implement, but it will require updating the tests as well.

Edit:
Just adding an extra point: include a more nuanced size calculation for sparse matrices.
`arr.data.nbytes + arr.indices.nbytes + arr.indptr.nbytes`
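As a sketch of the sparse calculation: the expression above applies to formats that store `indices` and `indptr` (CSR, CSC, BSR). The helper name `sparse_nbytes`, the COO branch, and the fallback of converting other formats to CSR are assumptions for illustration.

```python
import numpy as np
import scipy.sparse as sp


def sparse_nbytes(arr) -> int:
    """Bytes used by the underlying arrays of a scipy sparse matrix."""
    if arr.format in ("csr", "csc", "bsr"):
        # Compressed formats: values plus the two index arrays.
        return arr.data.nbytes + arr.indices.nbytes + arr.indptr.nbytes
    if arr.format == "coo":
        # Coordinate format: values plus explicit row and column indices.
        return arr.data.nbytes + arr.row.nbytes + arr.col.nbytes
    # Assumption: for any other format, measure the CSR equivalent instead.
    return sparse_nbytes(arr.tocsr())


dense = np.eye(100)
csr = sp.csr_matrix(dense)
print(sparse_nbytes(csr), dense.nbytes)
```

Converting to CSR in the fallback allocates a temporary copy, so a real implementation might prefer format-specific branches.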

No support for precision reduction when reducing dataset size for pandas dataframe or series. #1278

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

No support for precision reduction when reducing dataset size for pandas dataframe or series. #1278

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions