Collaborator

@phofl phofl commented Feb 14, 2024

Not adding this to the token makes us think this is the same operation, so we silently read the wrong data.

The example might be somewhat contrived, but if we start caching more, this can happen in other scenarios too.
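To make the failure mode concrete, here is a minimal sketch. It is not the project's actual tokenization: `make_token` is a hypothetical helper that simply hashes the read parameters, and the directory names are made up for illustration.

```python
import hashlib
import os


def make_token(path, cwd=None):
    """Hypothetical token: a hash of the read parameters (not the real API)."""
    payload = repr(("read_csv", path, cwd)).encode()
    return hashlib.sha256(payload).hexdigest()


# The same relative path, resolved from two different working directories
# (both directory names are illustrative assumptions).
dir_a = os.path.join(os.sep, "data", "a")
dir_b = os.path.join(os.sep, "data", "b")

# Without the working directory, both reads collapse to a single token,
# so a cache keyed on the token would hand back the first directory's data.
same_without_cwd = make_token("data.csv") == make_token("data.csv")

# Including the working directory keeps the two reads distinct.
same_with_cwd = make_token("data.csv", cwd=dir_a) == make_token("data.csv", cwd=dir_b)
```

In this sketch `same_without_cwd` is true and `same_with_cwd` is false, which is the distinction the extra token field is meant to provide.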

"_partitions": None,
"_series": False,
"_dataset_info_cache": None,
"_cwd": None,
Member

I'm pretty sure the parquet reader does not need this, since we're actually computing a checksum for all the files. If that doesn't work, we should make sure that the checksum is reliable, and we may want the same or a similar mechanism for CSV instead of relying on the CWD. The CWD feels odd when working with remote storage.
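A rough sketch of what a checksum-based token could look like, as an alternative to the CWD. This is an assumption for illustration only: `file_checksum` and `checksum_token` are hypothetical helpers that hash file contents on local disk, whereas a real reader would more likely rely on checksums or metadata reported by the filesystem layer.

```python
import hashlib
import os
import tempfile


def file_checksum(path, chunk_size=1 << 20):
    """Hypothetical helper: SHA-256 of a file's contents, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_token(paths):
    """Hypothetical token that depends on file contents, not on the CWD."""
    digest = hashlib.sha256()
    for path in paths:
        digest.update(file_checksum(path).encode())
    return digest.hexdigest()


# Two directories that each hold a "data.csv" with different contents.
with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
    path_a = os.path.join(dir_a, "data.csv")
    path_b = os.path.join(dir_b, "data.csv")
    with open(path_a, "w") as f:
        f.write("x\n1\n")
    with open(path_b, "w") as f:
        f.write("x\n2\n")

    token_a = checksum_token([path_a])
    token_b = checksum_token([path_b])
    token_a_again = checksum_token([path_a])
```

Here the token follows the data itself: files with different contents get different tokens regardless of where the process was started, and re-reading an unchanged file reproduces the same token. Hashing full contents is expensive for large or remote files, which is one reason the reliability of a cheaper checksum matters.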
