Skip to content

Add create_test_data to public testing API #2690

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Add create_test_data to public testing API #2690

wants to merge 4 commits into from

Conversation

TomNicholas
Copy link
Member

@TomNicholas TomNicholas commented Jan 18, 2019

Copy link
Collaborator

@max-sixty max-sixty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @TomNicholas 🙏

@shoyer
Copy link
Member

shoyer commented Jan 18, 2019

I am slightly reluctant about this because it would mean effectively freezing the output of this helper function. I guess that would probably be OK, but the design was never quite fully considered!

@max-sixty
Copy link
Collaborator

it would mean effectively freezing the output of this helper function

We could add a param or version arg so we can make changes in the future, if this is a concern?

@TomNicholas
Copy link
Member Author

it would mean effectively freezing the output of this helper function

Or we could have a public create_example_data, and an internal create_test_data? That way to begin with create_example_data is just an alias to the internal create_test_data, but if we end up needing to change the internal one (to test for some new type of common indexing edge case for example) then we can just separate them?

@shoyer
Copy link
Member

shoyer commented Jan 20, 2019

Why not just encourage users to copy/paste the helper function into their own codebases? This seems like a pretty good use case for copy/paste.

Otherwise -- maybe we should think a little more carefully about what to include on the result. Or require explicitly providing a name for the desired dataset structure? E.g., xarray.testing.dummy_dataset('some_name')

@TomNicholas
Copy link
Member Author

Why not just encourage users to copy/paste the helper function into their own codebases?

Just because they aren't going to know it exists unless we put it in a public place I suppose?

Happy to do any of those things, I just think that having some kind of public function/copyable code example etc is a simple thing for us to do that will make it just a bit easier for users who want to build upon xarray.

@shoyer
Copy link
Member

shoyer commented Jan 21, 2019 via email

@max-sixty
Copy link
Collaborator

The reason I had liked this was that I'm often creating a dataset manually, either in a repl or as a fixture in a test, and that added an overhead in both time and reduced consistency.
I recently started using the tutorial data, but that requires a network connection (and I kept forgetting the name...)

But I can empathize re compatibility issues; I defer to @shoyer on the balance, or whether xarray.testing.dummy_dataset('some_name') would be feasible

@shoyer
Copy link
Member

shoyer commented Jan 21, 2019

Maybe having a simple example available under just xarray.testing.dummy_dataset() is a good idea after all. If we do ever want to change it for xarray's own testing purposes, that would be quite easy.

That said... if we're going to emesh this in our API lets at least think a little bit to make sure we cover most of the special cases we care about. I think we cover most of the dtypes we care about, but let's also add at least:

  • Some global attributes
  • A scalar data variable (it's easy to forget about these)

Anything else?

We might also consider tweaking the names -- dim1, dim2 and dim3 is not very descriptive. Even x, y and z would be better.

@max-sixty
Copy link
Collaborator

max-sixty commented Jan 21, 2019

The only other features I could imagine adding would be a coord over multiple dimensions, and a multi-indexed dimension (but potentially we leave the latter out until we're completely confident about the conventions)

Agree on the names!

@TomNicholas if you want to split this up at all lmk

@stale
Copy link

stale bot commented Jun 9, 2021

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically

@stale stale bot added the stale label Jun 9, 2021
@keewis keewis closed this Jun 23, 2021
@keewis keewis deleted the branch pydata:master June 23, 2021 16:14
@keewis
Copy link
Collaborator

keewis commented Jun 24, 2021

this has been unintentionally closed while renaming main, probably because the feature branch was removed. Feel free to reopen if you want to work on this in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Is create_test_data() public API?
4 participants