Allow for plotting dummy netCDF4.datetime objects. #1261

Merged: 7 commits into pydata:master on Mar 9, 2017

Conversation

willirath
Contributor

Currently, xarray/plot.py raises a TypeError if the data to be plotted are not among [np.floating, np.integer, np.timedelta64, np.datetime64].

This PR adds netCDF4.datetime objects to the list of allowed data types. These occur because xarray/conventions.py passes netCDF4.datetime objects through undecoded if decoding to numpy.datetime64 fails.
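To make the starting point concrete, here is a minimal sketch of the kind of dtype check described above. The function name `ensure_plottable_strict` is hypothetical and the body is an approximation for illustration, not the actual xarray code.

```python
import datetime

import numpy as np

# Allowed NumPy scalar types, as listed in the description above.
PLOT_TYPES = (np.floating, np.integer, np.timedelta64, np.datetime64)


def ensure_plottable_strict(*args):
    """Raise TypeError unless every argument has an allowed NumPy dtype.

    Hypothetical stand-in for the check in xarray's plotting code.
    """
    for x in args:
        dtype = np.asarray(x).dtype
        if not any(np.issubdtype(dtype, t) for t in PLOT_TYPES):
            raise TypeError("cannot plot data with dtype {}".format(dtype))


# Integer and float arrays pass silently.
ensure_plottable_strict(np.arange(3), np.linspace(0.0, 1.0, 3))

# An object array of datetime.datetime values (what a failed decode can
# leave behind) is rejected by this check.
times = np.array([datetime.datetime(2300, 1, 1)], dtype=object)
try:
    ensure_plottable_strict(times)
    rejected = False
except TypeError:
    rejected = True
```

With only NumPy types in the list, date-like Python objects never pass, which is the limitation this PR sets out to lift.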

@shoyer
Member

shoyer commented Feb 11, 2017

Does matplotlib actually plot netcdftime datetimes correctly?

It would be nice to have a unit test to verify this works. See xarray/tests/test_plot.py for examples.

@willirath
Contributor Author

Failing tests:

xarray/tests/test_plot.py::TestPlot1D::test_nonnumeric_index_raises_typeerror FAILED
xarray/tests/test_plot.py::TestContourf::test_nonnumeric_index_raises_typeerror FAILED
xarray/tests/test_plot.py::TestContour::test_nonnumeric_index_raises_typeerror FAILED
xarray/tests/test_plot.py::TestPcolormesh::test_nonnumeric_index_raises_typeerror FAILED
xarray/tests/test_plot.py::TestImshow::test_nonnumeric_index_raises_typeerror FAILED

All of these check strings as input to the plot functions and expect a TypeError.

xarray.plot._right_dtype, however, uses numpy.issubdtype to check whether the dtype of the given data is a subtype of any of the allowed types. The newly included netCDF4.datetime is interpreted as numpy.generic, and since numpy.generic is the root of NumPy's scalar type hierarchy, xarray.plot._right_dtype now always returns True.

A possible way out would be to check for numeric types and datetime-like types separately.
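The hierarchy problem can be reproduced without netCDF4. The sketch below puts `np.generic` into the list of allowed types directly, because how `np.issubdtype` coerces a non-NumPy class may depend on the NumPy version; the variable names are illustrative only.

```python
import numpy as np

# np.generic is the base class of all NumPy scalar types, strings included.
string_dtype = np.array(["a", "b"]).dtype

# A string dtype is not a float dtype ...
is_float_like = np.issubdtype(string_dtype, np.floating)

# ... but it is a subtype of np.generic, like every other NumPy dtype.
is_generic = np.issubdtype(string_dtype, np.generic)

# Once np.generic is among the allowed types, the check filters nothing.
allowed = [np.floating, np.integer, np.generic]
passes = any(np.issubdtype(string_dtype, t) for t in allowed)
```

This is why the string-input tests stop raising TypeError: the check accepts every dtype as soon as one list entry collapses to `np.generic`.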

@@ -32,7 +32,9 @@ def _ensure_plottable(*args):
     Raise exception if there is anything in args that can't be plotted on
     an axis.
     """
     plottypes = [np.floating, np.integer, np.timedelta64, np.datetime64]
     from netCDF4 import datetime as nc4datetime
Member

This needs to be guarded behind a try/except clause, because netCDF4 is not a required dependency of xarray. Use from netcdftime import datetime instead of netCDF4 (@jhamman is moving netcdftime to its own package). Also, put import statements at the top of the file.
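A guarded import along the lines requested here might look like the following sketch. The names `nc_datetime` and `OTHER_TYPES` are hypothetical, and the snippet only illustrates the try/except pattern for an optional dependency; it is not the code that was eventually merged.

```python
import datetime

# netcdftime is an optional dependency, so the import must not be allowed
# to break the module when the package is missing.
try:
    from netcdftime import datetime as nc_datetime
except ImportError:
    nc_datetime = None

# Date-like types accepted for object arrays. The netcdftime class is only
# added when the import above succeeded.
OTHER_TYPES = (datetime.datetime,)
if nc_datetime is not None:
    OTHER_TYPES += (nc_datetime,)
```

Placing this block at the top of the module keeps the import out of the function body, as the review asks.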

Contributor Author

What should I do to prevent netcdftime.datetime from being interpreted as numpy.generic? Rewrite xarray.plot._ensure_plottable to handle NumPy types and netcdftime.datetime separately?

Member

Yes, looks like we need a separate check for object dtype arrays (it's okay to loop over the elements).

@willirath
Contributor Author

> Does matplotlib actually plot netcdftime datetimes correctly?

Yes it does. They are just treated like datetime.datetime objects.

@willirath
Contributor Author

willirath commented Feb 12, 2017

  • Add tests showing that netcdftime.datetime objects can be plotted (line and 2d plots).
  • Rewrite xarray.plot._ensure_plottable to only let numeric and date-like types pass.

@willirath
Contributor Author

> Does matplotlib actually plot netcdftime datetimes correctly?

> Yes it does. They are just treated like datetime.datetime objects.

I was wrong here: netCDF4.datetime is datetime.datetime, while netcdftime.datetime is DatetimeProlepticGregorian.

And matplotlib fails on the latter: https://gist.github.com/willirath/48c04db0481e986fac0e44549e164ab0

@willirath
Contributor Author

willirath commented Feb 13, 2017

a85b5c6 and e31aaab add a test plotting against a coordinate with netCDF4.datetime objects.

6f5e30f adds datetime.datetime objects to the list of plottable dtypes and splits the check into numpy dtypes (using numpy's type hierarchy) and all other types.
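The split described in this comment can be sketched roughly as follows. The helper names echo `_valid_numpy_subdtype` and `_valid_other_type` from the diff, but the bodies here are assumptions written for illustration, and `is_plottable` is a hypothetical wrapper.

```python
import datetime

import numpy as np

NUMPY_TYPES = (np.floating, np.integer, np.timedelta64, np.datetime64)
OTHER_TYPES = (datetime.datetime,)


def valid_numpy_subdtype(arr, numpy_types):
    """True if the array's dtype falls under one of the NumPy types."""
    return any(np.issubdtype(arr.dtype, t) for t in numpy_types)


def valid_other_type(arr, other_types):
    """True if every element is an instance of one of the other types."""
    return all(isinstance(el, other_types) for el in np.ravel(arr))


def is_plottable(x):
    """Combine both checks: NumPy hierarchy first, then per-element."""
    arr = np.asarray(x)
    return (valid_numpy_subdtype(arr, NUMPY_TYPES)
            or valid_other_type(arr, OTHER_TYPES))
```

Keeping the two lists apart means non-NumPy classes are never passed to `np.issubdtype`, so they cannot be mistaken for `np.generic`.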

@willirath
Contributor Author

netcdftime.datetime vs. datetime.datetime is discussed in Unidata/cftime#8. For now, I only added datetime.datetime to the list of plottable other types (see https://github.com/willirath/xarray/blob/6f5e30f3a2f8ca907f0fd91bcd571e7d98027896/xarray/plot/plot.py#L52).

Member

shoyer left a comment

This looks good to me, though note that it's difficult to get datetime objects into xarray objects -- normally they are converted into datetime64.

    # Warn then.
    for t in numpy_types:
        if np.issubdtype(np.generic, t):
            warnings.warn('{} is treated as numpy.generic.'.format(t),
Member

This is an internal, private function. Rather than warning let's issue a ValueError to ensure it doesn't happen by accident. It would also be safe to just use an assert here, since again it should never happen in user code.

Contributor Author

Thanks for the review! I've changed it to use an assert.
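One way to express such an internal sanity check with an assert is sketched below. The helper `_check_numpy_types` is hypothetical, and it tests class membership with `issubclass` rather than the `np.issubdtype(np.generic, t)` expression from the diff, so it is a variation on the idea and not the merged code.

```python
import datetime

import numpy as np


def _check_numpy_types(numpy_types):
    """Internal guard: each entry must be a specific NumPy scalar type."""
    assert all(
        isinstance(t, type)
        and issubclass(t, np.generic)
        and t is not np.generic
        for t in numpy_types
    ), "numpy_types must only contain specific NumPy scalar types"


# The intended list passes the guard.
_check_numpy_types([np.floating, np.integer, np.timedelta64, np.datetime64])

# A non-NumPy class slipped into the list trips the assertion.
try:
    _check_numpy_types([np.floating, datetime.datetime])
    caught = False
except AssertionError:
    caught = True
```

Because the function is private, a failed assertion points at a programming error inside the library and not at bad user input.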

@@ -1208,3 +1213,25 @@ def test_default_labels(self):
        # Top row should be labeled
        for label, ax in zip(self.darray.coords['col'].values, g.axes[0, :]):
            self.assertTrue(substring_in_axes(label, ax))


@requires_netCDF4
Member

This test shouldn't be netCDF4 specific anymore.

Contributor Author

That's right. I've changed it to use datetime and to not require netCDF4.


        darray = DataArray(data, dims=['time'])
        darray.coords['time'] = \
            np.array([nc4datetime(2017, m, 1) for m in month])
Member

Again, just use a normal datetime instance now.

Contributor Author

Done.

        data = np.sin(2 * np.pi * month / 12.0)

        darray = DataArray(data, dims=['time'])
        darray.coords['time'] = \
Member

style: avoid \ in favor of implicit line continuation with parentheses: https://www.python.org/dev/peps/pep-0008/#maximum-line-length

Contributor Author

Done.
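For reference, the style point from this thread can be illustrated with a small standalone sketch. The `coords` dict is a hypothetical stand-in for `darray.coords`, and plain `datetime` instances are used as suggested earlier in the review.

```python
from datetime import datetime

months = range(1, 13)
coords = {}  # hypothetical stand-in for darray.coords

# Backslash continuation, as in the diff under review:
coords['time'] = \
    [datetime(2017, m, 1) for m in months]

# Implicit continuation inside parentheses, as PEP 8 recommends:
coords['time'] = list(
    datetime(2017, m, 1) for m in months
)
```

Both assignments produce the same twelve values; only the second avoids the trailing backslash.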

- Use datetime instead of netCDF4.datetime.
- Do not warn but assert in private function.
@willirath
Contributor Author

Yes, I know that xarray usually doesn't use datetime. For climate data past 2262 with a standard calendar, however, the dates no longer fit into nanosecond-precision datetime64, so the dtype of the time axis falls back to datetime / netCDF4.datetime.
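The year 2262 limit mentioned here follows from how nanosecond-precision datetime64 values are stored. The short sketch below computes the latest representable timestamp with NumPy alone; the variable names are illustrative.

```python
import datetime

import numpy as np

# datetime64[ns] counts nanoseconds since 1970-01-01 in a signed 64-bit
# integer, so the largest int64 value marks the latest timestamp.
latest = np.datetime64(np.iinfo(np.int64).max, "ns")
print(latest)

# A plain datetime.datetime has no such limit for dates like this one,
# which is why it can serve as the fallback.
far_future = datetime.datetime(2300, 1, 1)
```

Any time axis that extends past that bound cannot be decoded to datetime64[ns] and ends up as an object array of date-like Python objects.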

My last commit reflects what I'd consider ready to be pulled. Let me know what you think.

Member

shoyer left a comment

Oops -- meant to post this yesterday but didn't hit submit on the review. Two minor suggestions, otherwise looks good to me.

    """
    plottypes = [np.floating, np.integer, np.timedelta64, np.datetime64]
    return all(any(isinstance(el, t) for t in types)
        for el in np.ravel(x))
Member

style: the line continuation here should start after the opening parenthesis of all(

Contributor Author

Done. (Line was 79 chars long anyway.)


    for x in args:
        if not (_valid_numpy_subdtype(np.array(x), numpy_types)
                or _valid_other_type(np.array(x), other_types)):
Member

Probably would be a good idea to ensure that x has object dtype (either here or in _valid_other_type) to avoid potentially expensive loops, e.g., if dtype=complex but x is very large.

Contributor Author

willirath Mar 6, 2017

I've added a check to _valid_other_type which ensures np.issubdtype(np.generic, x.dtype) before looping over x. (See feb6afb)
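The effect of such a guard can be sketched as below. This version tests `arr.dtype.kind` for object dtype, which is a substitution for the `np.issubdtype(np.generic, x.dtype)` expression mentioned in the comment, and `valid_object_array` is a hypothetical name.

```python
import datetime

import numpy as np


def valid_object_array(x, types):
    """Check elements one by one, but only for object-dtype arrays."""
    arr = np.asarray(x)
    if arr.dtype.kind != "O":
        # Native dtypes (complex, str, ...) are rejected without looping.
        return False
    return all(isinstance(el, tuple(types)) for el in arr.ravel())


# A large complex array is turned away before any per-element work.
big_complex = np.zeros(1000000, dtype=complex)
complex_ok = valid_object_array(big_complex, [datetime.datetime])

# An object array of datetime values is inspected element by element.
dates = np.array([datetime.datetime(2300, 1, 1)], dtype=object)
dates_ok = valid_object_array(dates, [datetime.datetime])
```

The early return keeps the Python-level loop confined to arrays that can actually hold arbitrary objects.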

shoyer merged commit 83d2c2d into pydata:master on Mar 9, 2017
@shoyer
Member

shoyer commented Mar 9, 2017

Thanks!
