ValueError: conflicting sizes for dimension with single item / asset #33
Whoops, I just tried on
In [9]: stackstac.stack([sitem.to_dict()], assets=["SR_B2"])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-9-d4cc13b0c565> in <module>
----> 1 stackstac.stack([sitem.to_dict()], assets=["SR_B2"])
~/src/gjoseph92/stackstac/stackstac/stack.py in stack(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds, resampling, chunksize, dtype, fill_value, rescale, sortby_date, xy_coords, properties, band_coords, gdal_env, reader)
295 return xr.DataArray(
296 arr,
--> 297 *to_coords(
298 plain_items,
299 asset_ids,
~/src/gjoseph92/stackstac/stackstac/prepare.py in to_coords(items, asset_ids, spec, xy_coords, properties, band_coords)
347 dims = ["time", "band", "y", "x"]
348 coords = {
--> 349 "time": pd.to_datetime(
350 [item["properties"]["datetime"] for item in items],
351 infer_datetime_format=True,
~/miniconda3/envs/stackstac/lib/python3.8/site-packages/pandas/core/indexes/extension.py in method(self, *args, **kwargs)
76
77 def method(self, *args, **kwargs):
---> 78 result = attr(self._data, *args, **kwargs)
79 if wrap:
80 if isinstance(result, type(self._data)):
~/miniconda3/envs/stackstac/lib/python3.8/site-packages/pandas/core/arrays/datetimes.py in tz_convert(self, tz)
800 if self.tz is None:
801 # tz naive, use tz_localize
--> 802 raise TypeError(
803 "Cannot convert tz-naive timestamps, use tz_localize to localize"
804 )
TypeError: Cannot convert tz-naive timestamps, use tz_localize to localize
ipdb> pp pd.to_datetime(items[0]['properties']['datetime'])
Timestamp('2020-09-08 18:55:51.575595+0000', tz='UTC')
Seems to be the issue with
With #27, the `tz_convert` would fail when we tripped over pandas-dev/pandas#41047, since the DatetimeIndex would be tz-naive. Now, we assume tz-naive datetimes are already in UTC. Addresses #33 (comment)
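The behavior described above (treat tz-naive datetimes as already being in UTC, convert tz-aware ones) can be sketched roughly as follows. This is a minimal illustration, not the actual stackstac change; `to_utc_index` is a hypothetical helper name.

```python
import pandas as pd


def to_utc_index(datetimes):
    """Parse datetime strings into a UTC DatetimeIndex (hypothetical helper)."""
    index = pd.to_datetime(datetimes)
    if index.tz is None:
        # Assumption: tz-naive values are already in UTC, so only attach the label.
        # Calling tz_convert here would raise the TypeError shown in the traceback.
        return index.tz_localize("UTC")
    # tz-aware values are converted to UTC.
    return index.tz_convert("UTC")


aware = to_utc_index(["2020-09-08T18:55:51.575595Z"])
naive = to_utc_index(["2020-09-08 18:55:51"])
```

Both results carry a UTC timezone, and the wall-clock time of the tz-naive input is left unchanged.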
I don't know why I didn't use `linspace` in the first place. With `arange` there were floating-point errors that could cause `ceil` to over/under-shoot by 1. Closes #33
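As a rough sketch of why `linspace` is the safer choice here: with `arange`, the number of elements is derived from a floating-point division and a `ceil`, so rounding error can change the length by one, whereas `linspace` takes the element count explicitly. The helper below is hypothetical and is not stackstac's code; it assumes the bounds are an exact multiple of the resolution and uses the GeoTIFF bounds quoted later in this thread.

```python
import numpy as np


def pixel_centers(start, stop, resolution):
    """Coordinates of pixel centers between two edges (hypothetical helper)."""
    # Decide the pixel count once, as an integer, instead of letting
    # arange derive it from a floating-point step.
    n = int(round((stop - start) / resolution))
    half = resolution / 2
    # linspace returns exactly n points and includes both endpoints.
    return np.linspace(start + half, stop - half, n)


xs = pixel_centers(472485.0, 705615.0, 30.0)
```

Because the length is fixed up front, the coordinate array cannot drift out of sync with the data array's shape.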
Thanks Gabe, confirmed that this fixed it!
@TomAugspurger the conflicting sizes issue should be fixed by #35. I thought I'd fixed it in #25, but that still had some issues with floating-point error.
BTW, I think the off-by-one between stackstac's shape and the shape of the underlying dataset as reported by rasterio is actually partially due to an issue with the STAC metadata:
import planetary_computer as pc
import stackstac
import requests
import pystac
import rasterio
import numpy as np
from affine import Affine
r = requests.get(
"https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-8-c2-l2/items/LC08_L2SP_046027_20200908_20200919_02_T1"
)
item = pystac.Item.from_dict(r.json())
sitem = pc.sign_assets(item)
asset = sitem.assets["SR_B2"]
ds = rasterio.open(asset.href)
stac_transform = Affine(*asset.properties["proj:transform"])
>>> ds.transform
Affine(30.0, 0.0, 472485.0,
0.0, -30.0, 5373615.0)
>>> stac_transform
Affine(29.996139492986746, 0.0, 472500.0,
0.0, -29.99619820048156, 5373600.0)
>>> ds.transform.xoff - stac_transform.xoff
-15.0
>>> ds.transform.yoff - stac_transform.yoff
15.0
It looks like in the STAC metadata, the asset is shifted half a pixel to the southeast. Additionally, the resolution reported by
The bounds reported in STAC metadata also don't match the bounds of the GeoTIFF:
>>> list(ds.bounds)
[472485.0, 5136885.0, 705615.0, 5373615.0]
>>> item.ext.projection.bbox
[472500.0, 5136900.0, 705600.0, 5373600.0]
>>> np.array(ds.bounds) - np.array(item.ext.projection.bbox)
array([-15., -15., 15., 15.])
It seems there's the half-pixel shift, plus in the STAC bounds, the asset is 30m (1px) smaller than in the GeoTIFF (which is probably why the resolution in the geotrans is slightly off):
>>> ds_shape_m = (ds.bounds[2] - ds.bounds[0], ds.bounds[3] - ds.bounds[1])
>>> stac_bounds = item.ext.projection.bbox
>>> stac_shape_m = (stac_bounds[2] - stac_bounds[0], stac_bounds[3] - stac_bounds[1])
>>> np.array(ds_shape_m) - np.array(stac_shape_m)
array([30., 30.])
So I think stackstac is actually coming up with the right shape given the STAC metadata, but that seems slightly offset from the underlying data.
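One possible reading of these numbers, offered as an assumption rather than a confirmed diagnosis, is that the STAC bbox describes pixel centers while the GeoTIFF bounds describe pixel edges. The arithmetic below uses only the values printed above. The pixel counts are derived from the GeoTIFF bounds and the 30 m pixel size, not read from the file.

```python
# Values copied from the output above.
ds_bounds = [472485.0, 5136885.0, 705615.0, 5373615.0]   # GeoTIFF (rasterio)
stac_bbox = [472500.0, 5136900.0, 705600.0, 5373600.0]   # STAC proj:bbox
pixel = 30.0                                             # GeoTIFF pixel size

# Pixel counts implied by the GeoTIFF bounds (derived, not read from the file).
width_px = round((ds_bounds[2] - ds_bounds[0]) / pixel)
height_px = round((ds_bounds[3] - ds_bounds[1]) / pixel)

# Shrinking the GeoTIFF bounds by half a pixel on every side gives the
# extent spanned by the pixel centers.
center_bbox = [
    ds_bounds[0] + pixel / 2,
    ds_bounds[1] + pixel / 2,
    ds_bounds[2] - pixel / 2,
    ds_bounds[3] - pixel / 2,
]

# Dividing the (1px smaller) STAC extent by the full pixel count yields a
# resolution slightly under 30, close to the values in stac_transform.
x_res = (stac_bbox[2] - stac_bbox[0]) / width_px
y_res = (stac_bbox[3] - stac_bbox[1]) / height_px
```

Under that assumption, `center_bbox` equals the STAC bbox, and `x_res` and `y_res` come out near the 29.9961... values in the STAC transform, which is consistent with the half-pixel shift and the 1px size difference noted above.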
Thanks for investigating that. @lossyrob, do you have any thoughts on #33 (comment)? I don't actually remember if these STAC items were produced by us / stactools, or if it came from USGS with the COGs.
I haven't looked too closely at this, but I'm wondering if there's some issue with the resampling / resolution handling? In this example I have a single STAC item and I select a single asset from it. stackstac (actually xarray) complains that the sizes of the data and coordinates don't match. (This example requires pip install planetary-computer and pystac, but doesn't need an API token.)
The correct shape, according to rasterio, is
This could easily be user error :)