This repository was archived by the owner on Sep 11, 2023. It is now read-only.

BUG: InvalidIndexError #42

Closed
JackKelly opened this issue Jul 5, 2021 · 11 comments

Comments

@JackKelly
Member

DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
ERROR:nowcasting_dataset:Exception! start_hourly=2019-11-07 15:00:00, t0_hourly=2019-11-07 16:00:00, end_hourly=2019-11-07 16:00:00, target_times_hourly=DatetimeIndex(['2019-11-07 15:00:00', '2019-11-07 16:00:00'], dtype='datetime64[ns]', freq='H'), Reindexing only valid with uniquely valued Index objects, is_increasing=True, is_unique=True
Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-11-07 15:55:00')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/nwp_data_source.py", line 102, in _get_time_slice
    init_times = self.data.sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataarray.py", line 1271, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataset.py", line 2365, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/coordinates.py", line 421, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 274, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 200, in convert_label_indexer
    indexer = get_indexer_nd(index, label, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 101, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, method=method, tolerance=tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3442, in get_indexer
    raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
ERROR:nowcasting_dataset:Exception!  t0_dt=2019-11-07 15:55:00, x_meters_center=40000, y_meters_center=20000, Reindexing only valid with uniquely valued Index objects
Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-11-07 15:55:00')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 122, in _get_example
    example_from_source = data_source.get_example(
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 148, in get_example
    selected_data = self._get_cached_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 66, in _get_cached_time_slice
    data = self._get_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/nwp_data_source.py", line 102, in _get_time_slice
    init_times = self.data.sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataarray.py", line 1271, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataset.py", line 2365, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/coordinates.py", line 421, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 274, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 200, in convert_label_indexer
    indexer = get_indexer_nd(index, label, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 101, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, method=method, tolerance=tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3442, in get_indexer
    raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
ERROR:nowcasting_dataset:Exception! start_hourly=2019-09-30 13:00:00, t0_hourly=2019-09-30 14:00:00, end_hourly=2019-09-30 14:00:00, target_times_hourly=DatetimeIndex(['2019-09-30 13:00:00', '2019-09-30 14:00:00'], dtype='datetime64[ns]', freq='H'), Reindexing only valid with uniquely valued Index objects, is_increasing=True, is_unique=True
Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-09-30 13:45:00')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/nwp_data_source.py", line 102, in _get_time_slice
    init_times = self.data.sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataarray.py", line 1271, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataset.py", line 2365, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/coordinates.py", line 421, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 274, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 200, in convert_label_indexer
    indexer = get_indexer_nd(index, label, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 101, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, method=method, tolerance=tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3442, in get_indexer
    raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
ERROR:nowcasting_dataset:Exception!  t0_dt=2019-09-30 13:45:00, x_meters_center=40000, y_meters_center=250000, Reindexing only valid with uniquely valued Index objects
Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-09-30 13:45:00')
@JackKelly
Member Author

Doesn't seem to happen when not using multiprocessing? Also, I can't seem to replicate it in testing_NWPDataSource.

Maybe try older versions of Pandas?!?
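
When comparing Pandas versions, it may help to log exactly which versions are installed in the environment the workers run in. A minimal sketch using only the standard library; the helper name `installed_versions` and the default package list are assumptions for illustration, not part of nowcasting_dataset:

```python
from importlib import metadata


def installed_versions(packages=("pandas", "xarray", "torch")):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[name] = None
    return versions


print(installed_versions())
```

Printing this once at startup makes it easier to tie a given traceback to a specific Pandas release.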

@JackKelly
Member Author

JackKelly commented Jul 5, 2021

Huh, it does happen with just 1 worker (with Pandas 1.2.5):

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name      | Type      | Params
----------------------------------------
0 | sat_conv1 | Conv2d    | 720   
1 | sat_conv2 | Conv2d    | 1.4 K 
2 | sat_conv3 | Conv2d    | 1.4 K 
3 | maxpool   | MaxPool2d | 0     
4 | fc1       | Linear    | 4.5 M 
5 | fc2       | Linear    | 38.0 K
6 | fc3       | Linear    | 16.5 K
7 | fc4       | Linear    | 16.5 K
8 | fc5       | Linear    | 129   
----------------------------------------
4.5 M     Trainable params
0         Non-trainable params
4.5 M     Total params
18.142    Total estimated model params size (MB)
Validation sanity check:   0%|                                                                  | 0/2 [00:00<?, ?it/s]
/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/data_loading.py:102: UserWarning: The dataloader, val dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 16 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448255797/work/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
Epoch 0: : 0it [00:00, ?it/s]                                                                                         
/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/data_loading.py:102: UserWarning: The dataloader, train dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 16 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
Epoch 0: : 1025it [1:00:38,  3.55s/it, loss=0.154, v_num=173]
Validating: 0it [00:00, ?it/s]
Validating: 0it [00:00, ?it/s]

DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
ERROR:nowcasting_dataset:Exception! start_hourly=2019-06-18 08:00:00, t0_hourly=2019-06-18 09:00:00, end_hourly=2019-06-18 09:00:00, target_times_hourly=DatetimeIndex(['2019-06-18 08:00:00', '2019-06-18 09:00:00'], dtype='datetime64[ns]', freq='H'), Reindexing only valid with uniquely valued Index objects, is_increasing=True, is_unique=True
Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-06-18 08:55:00')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/nwp_data_source.py", line 102, in _get_time_slice
    init_times = self.data.sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataarray.py", line 1271, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataset.py", line 2365, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/coordinates.py", line 421, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 274, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 200, in convert_label_indexer
    indexer = get_indexer_nd(index, label, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 101, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, method=method, tolerance=tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3172, in get_indexer
    raise InvalidIndexError(
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
ERROR:nowcasting_dataset:Exception!  t0_dt=2019-06-18 08:55:00, x_meters_center=364234.8279834602, y_meters_center=394940.1432101802, Reindexing only valid with uniquely valued Index objects
Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-06-18 08:55:00')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 122, in _get_example
    example_from_source = data_source.get_example(
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 148, in get_example
    selected_data = self._get_cached_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 66, in _get_cached_time_slice
    data = self._get_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/nwp_data_source.py", line 102, in _get_time_slice
    init_times = self.data.sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataarray.py", line 1271, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataset.py", line 2365, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/coordinates.py", line 421, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 274, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 200, in convert_label_indexer
    indexer = get_indexer_nd(index, label, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 101, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, method=method, tolerance=tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3172, in get_indexer
    raise InvalidIndexError(
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
/tmp/ipykernel_149807/1775107835.py in <module>
----> 1 trainer.fit(model, data_module)

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in fit(self, model, train_dataloader, val_dataloaders, datamodule)
    458         )
    459 
--> 460         self._run(model)
    461 
    462         assert self.state.stopped

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in _run(self, model)
    756 
    757         # dispatch `start_training` or `start_evaluating` or `start_predicting`
--> 758         self.dispatch()
    759 
    760         # plugin will finalized fitting (e.g. ddp_spawn will load trained model)

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in dispatch(self)
    797             self.accelerator.start_predicting(self)
    798         else:
--> 799             self.accelerator.start_training(self)
    800 
    801     def run_stage(self):

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py in start_training(self, trainer)
     94 
     95     def start_training(self, trainer: 'pl.Trainer') -> None:
---> 96         self.training_type_plugin.start_training(trainer)
     97 
     98     def start_evaluating(self, trainer: 'pl.Trainer') -> None:

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py in start_training(self, trainer)
    142     def start_training(self, trainer: 'pl.Trainer') -> None:
    143         # double dispatch to initiate the training loop
--> 144         self._results = trainer.run_stage()
    145 
    146     def start_evaluating(self, trainer: 'pl.Trainer') -> None:

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in run_stage(self)
    807         if self.predicting:
    808             return self.run_predict()
--> 809         return self.run_train()
    810 
    811     def _pre_training_routine(self):

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in run_train(self)
    869                 with self.profiler.profile("run_training_epoch"):
    870                     # run train epoch
--> 871                     self.train_loop.run_training_epoch()
    872 
    873                 if self.max_steps and self.max_steps <= self.global_step:

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/training_loop.py in run_training_epoch(self)
    582         if should_check_val:
    583             self.trainer.validating = True
--> 584             self.trainer.run_evaluation(on_epoch=True)
    585             self.trainer.training = True
    586 

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in run_evaluation(self, on_epoch)
    952             dl_max_batches = self.evaluation_loop.max_batches[dataloader_idx]
    953 
--> 954             for batch_idx, batch in enumerate(dataloader):
    955                 if batch is None:
    956                     continue

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/dataloader.py in __next__(self)
    519             if self._sampler_iter is None:
    520                 self._reset()
--> 521             data = self._next_data()
    522             self._num_yielded += 1
    523             if self._dataset_kind == _DatasetKind.Iterable and \

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/dataloader.py in _next_data(self)
   1201             else:
   1202                 del self._task_info[idx]
-> 1203                 return self._process_data(data)
   1204 
   1205     def _try_put_index(self):

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/dataloader.py in _process_data(self, data)
   1227         self._try_put_index()
   1228         if isinstance(data, ExceptionWrapper):
-> 1229             data.reraise()
   1230         return data
   1231 

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/_utils.py in reraise(self)
    423             # have message field
    424             raise self.exc_type(message=msg)
--> 425         raise self.exc_type(msg)
    426 
    427 

InvalidIndexError: Caught InvalidIndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-06-18 08:55:00')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 34, in fetch
    data = next(self.dataset_iter)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 64, in __iter__
    yield self._get_batch()
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 84, in _get_batch
    examples = [
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 85, in <listcomp>
    future_example.result() for future_example in future_examples]
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/concurrent/futures/_base.py", line 445, in result
    return self.__get_result()
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 122, in _get_example
    example_from_source = data_source.get_example(
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 148, in get_example
    selected_data = self._get_cached_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 66, in _get_cached_time_slice
    data = self._get_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/nwp_data_source.py", line 102, in _get_time_slice
    init_times = self.data.sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataarray.py", line 1271, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataset.py", line 2365, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/coordinates.py", line 421, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 274, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 200, in convert_label_indexer
    indexer = get_indexer_nd(index, label, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 101, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, method=method, tolerance=tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3172, in get_indexer
    raise InvalidIndexError(
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

(after 1025 iterations, whilst validating I think???)
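
The traceback above passes through `concurrent/futures/thread.py`, so several threads may be calling `_get_cached_time_slice` on the same data source at once. One thing to try, purely as an experiment, is to serialise the cache lookup and the load behind a lock. The sketch below is not the project's code: the class `LockedTimeSliceCache`, the `load_time_slice` callable, and the step that stores the result in the cache are all assumptions made for illustration.

```python
import threading


class LockedTimeSliceCache:
    """Hypothetical stand-in for the cache-then-load pattern in the traceback."""

    def __init__(self, load_time_slice):
        # load_time_slice is assumed to wrap the real selection, e.g. data.sel(...).
        self._load_time_slice = load_time_slice
        self._cache = {}
        self._lock = threading.Lock()

    def get(self, t0_dt):
        # Hold the lock for both the lookup and the load, so only one
        # thread at a time runs the underlying selection.
        with self._lock:
            try:
                return self._cache[t0_dt]
            except KeyError:
                data = self._load_time_slice(t0_dt)
                self._cache[t0_dt] = data
                return data
```

If the error disappears with the lock in place, that would point towards a thread-safety problem in the selection path rather than a genuinely non-unique index. The lock does cost parallelism, so it is a diagnostic more than a fix.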

@JackKelly
Member Author

Possibly related? pandas-dev/pandas#39882

@JackKelly
Member Author

Try disabling the multi-threaded loop in dataset?
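
A rough sketch of what such a switch could look like, assuming the batch is built by submitting one call per example to a thread pool (as the `future_example.result()` frames in the traceback suggest). The function `get_batch` and its parameters `get_example`, `keys`, `use_threads` and `max_workers` are hypothetical names, not the actual `dataset.py` API:

```python
from concurrent.futures import ThreadPoolExecutor


def get_batch(get_example, keys, use_threads=True, max_workers=4):
    """Build a batch by calling get_example(key) for each key."""
    if not use_threads:
        # Sequential path: no thread pool, useful for ruling out thread-safety issues.
        return [get_example(key) for key in keys]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(get_example, key) for key in keys]
        # Collect results in submission order; result() re-raises worker exceptions.
        return [future.result() for future in futures]
```

Running a few epochs with `use_threads=False` and seeing whether the InvalidIndexError still appears would help separate a threading problem from a data problem.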

@JackKelly
Member Author

JackKelly commented Jul 6, 2021

The error still occurs with num_workers=1 and Pandas 1.3.0 (although it took 6 epochs to hit it!).

Epoch 6: : 1057it [1:14:19,  4.22s/it, loss=0.0793, v_num=178]
Epoch 7: : 0it [00:00, ?it/s, loss=0.0793, v_num=178]         
DEBUG:nowcasting_dataset:Opening satellite data: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/OSGB36/all_zarr_int16_single_timestep.zarr
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
DEBUG:nowcasting_dataset:Opening NWP data: gs://solar-pv-nowcasting-data/NWP/UK_Met_Office/UKV_zarr
ERROR:nowcasting_dataset:Exception! start_hourly=2019-01-16 14:00:00, t0_hourly=2019-01-16 15:00:00, end_hourly=2019-01-16 15:00:00, target_times_hourly=DatetimeIndex(['2019-01-16 14:00:00', '2019-01-16 15:00:00'], dtype='datetime64[ns]', freq='H'), Reindexing only valid with uniquely valued Index objects, is_increasing=True, is_unique=True
Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-01-16 14:15:00')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/nwp_data_source.py", line 124, in _get_time_slice
    init_times = self.data.sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataarray.py", line 1271, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataset.py", line 2365, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/coordinates.py", line 421, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 274, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 200, in convert_label_indexer
    indexer = get_indexer_nd(index, label, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 101, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, method=method, tolerance=tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3442, in get_indexer
    raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
ERROR:nowcasting_dataset:Exception!  t0_dt=2019-01-16 14:15:00, x_meters_center=552333.2498806263, y_meters_center=120659.03637486309, Reindexing only valid with uniquely valued Index objects
Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-01-16 14:15:00')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 122, in _get_example
    example_from_source = data_source.get_example(
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 148, in get_example
    selected_data = self._get_cached_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 66, in _get_cached_time_slice
    data = self._get_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/nwp_data_source.py", line 124, in _get_time_slice
    init_times = self.data.sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataarray.py", line 1271, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataset.py", line 2365, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/coordinates.py", line 421, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 274, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 200, in convert_label_indexer
    indexer = get_indexer_nd(index, label, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 101, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, method=method, tolerance=tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3442, in get_indexer
    raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
/tmp/ipykernel_273041/1775107835.py in <module>
----> 1 trainer.fit(model, data_module)

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in fit(self, model, train_dataloader, val_dataloaders, datamodule)
    458         )
    459 
--> 460         self._run(model)
    461 
    462         assert self.state.stopped

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in _run(self, model)
    756 
    757         # dispatch `start_training` or `start_evaluating` or `start_predicting`
--> 758         self.dispatch()
    759 
    760         # plugin will finalized fitting (e.g. ddp_spawn will load trained model)

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in dispatch(self)
    797             self.accelerator.start_predicting(self)
    798         else:
--> 799             self.accelerator.start_training(self)
    800 
    801     def run_stage(self):

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py in start_training(self, trainer)
     94 
     95     def start_training(self, trainer: 'pl.Trainer') -> None:
---> 96         self.training_type_plugin.start_training(trainer)
     97 
     98     def start_evaluating(self, trainer: 'pl.Trainer') -> None:

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py in start_training(self, trainer)
    142     def start_training(self, trainer: 'pl.Trainer') -> None:
    143         # double dispatch to initiate the training loop
--> 144         self._results = trainer.run_stage()
    145 
    146     def start_evaluating(self, trainer: 'pl.Trainer') -> None:

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in run_stage(self)
    807         if self.predicting:
    808             return self.run_predict()
--> 809         return self.run_train()
    810 
    811     def _pre_training_routine(self):

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py in run_train(self)
    869                 with self.profiler.profile("run_training_epoch"):
    870                     # run train epoch
--> 871                     self.train_loop.run_training_epoch()
    872 
    873                 if self.max_steps and self.max_steps <= self.global_step:

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/training_loop.py in run_training_epoch(self)
    489         is_last_batch = None
    490 
--> 491         for batch_idx, (batch, is_last_batch) in train_dataloader:
    492             self.trainer.batch_idx = batch_idx
    493             self.trainer.is_last_batch = is_last_batch

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/profiler/profilers.py in profile_iterable(self, iterable, action_name)
    110             try:
    111                 self.start(action_name)
--> 112                 value = next(iterator)
    113                 self.stop(action_name)
    114                 yield value

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/supporters.py in prefetch_iterator(iterable)
    528     try:
    529         # the iterator may be empty from the beginning
--> 530         last = next(it)
    531     except StopIteration:
    532         return

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/supporters.py in __next__(self)
    462 
    463         """
--> 464         return self.request_next_batch(self.loader_iters)
    465 
    466     @staticmethod

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/trainer/supporters.py in request_next_batch(loader_iters)
    476 
    477         """
--> 478         return apply_to_collection(loader_iters, Iterator, next)
    479 
    480     @staticmethod

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pytorch_lightning/utilities/apply_func.py in apply_to_collection(data, dtype, function, wrong_dtype, *args, **kwargs)
     83     # Breaking condition
     84     if isinstance(data, dtype) and (wrong_dtype is None or not isinstance(data, wrong_dtype)):
---> 85         return function(data, *args, **kwargs)
     86 
     87     # Recursively apply to collection items

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/dataloader.py in __next__(self)
    519             if self._sampler_iter is None:
    520                 self._reset()
--> 521             data = self._next_data()
    522             self._num_yielded += 1
    523             if self._dataset_kind == _DatasetKind.Iterable and \

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/dataloader.py in _next_data(self)
   1201             else:
   1202                 del self._task_info[idx]
-> 1203                 return self._process_data(data)
   1204 
   1205     def _try_put_index(self):

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/dataloader.py in _process_data(self, data)
   1227         self._try_put_index()
   1228         if isinstance(data, ExceptionWrapper):
-> 1229             data.reraise()
   1230         return data
   1231 

~/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/_utils.py in reraise(self)
    423             # have message field
    424             raise self.exc_type(message=msg)
--> 425         raise self.exc_type(msg)
    426 
    427 

InvalidIndexError: Caught InvalidIndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 64, in _get_cached_time_slice
    return self._cache[t0_dt]
KeyError: Timestamp('2019-01-16 14:15:00')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 34, in fetch
    data = next(self.dataset_iter)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 64, in __iter__
    yield self._get_batch()
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 84, in _get_batch
    examples = [
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 85, in <listcomp>
    future_example.result() for future_example in future_examples]
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/concurrent/futures/_base.py", line 438, in result
    return self.__get_result()
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/dataset.py", line 122, in _get_example
    example_from_source = data_source.get_example(
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 148, in get_example
    selected_data = self._get_cached_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/data_source.py", line 66, in _get_cached_time_slice
    data = self._get_time_slice(t0_dt)
  File "/home/jack/dev/ocf/nowcasting_dataset/nowcasting_dataset/data_sources/nwp_data_source.py", line 124, in _get_time_slice
    init_times = self.data.sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataarray.py", line 1271, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/dataset.py", line 2365, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/coordinates.py", line 421, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 274, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 200, in convert_label_indexer
    indexer = get_indexer_nd(index, label, method, tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/xarray/core/indexing.py", line 101, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, method=method, tolerance=tolerance)
  File "/home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3442, in get_indexer
    raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

@JackKelly
Member Author

It also happens with num_workers=0. Apparently "almost nothing is thread-safe in Pandas" (oh!)

This is a known issue: pandas-dev/pandas#21150

And here's the output of some debugging (which is only possible when using num_workers=0). Note that the InvalidIndexError is thrown if not self._index_as_unique but, during debugging, self._index_as_unique is True (so the state has changed!)

> /home/jack/miniconda3/envs/nowcasting_dataset/lib/python3.9/site-packages/pandas/core/indexes/base.py(3442)get_indexer()
   3440 
   3441         if not self._index_as_unique:
-> 3442             raise InvalidIndexError(self._requires_unique_msg)
   3443 
   3444         if not self._should_compare(target) and not is_interval_dtype(self.dtype):

ipdb>  target
DatetimeIndex(['2019-07-30 15:00:00', '2019-07-30 16:00:00'], dtype='datetime64[ns]', freq=None)
ipdb>  method
'pad'
ipdb>  limit
ipdb>  limit is None
True
ipdb>  tolerance is None
True
ipdb>  self
DatetimeIndex(['2018-01-01 00:00:00', '2018-01-01 03:00:00',
               '2018-01-01 06:00:00', '2018-01-01 09:00:00',
               '2018-01-01 12:00:00', '2018-01-01 15:00:00',
               '2018-01-01 18:00:00', '2018-01-01 21:00:00',
               '2018-01-02 00:00:00', '2018-01-02 03:00:00',
               ...
               '2019-12-30 18:00:00', '2019-12-30 21:00:00',
               '2019-12-31 00:00:00', '2019-12-31 03:00:00',
               '2019-12-31 06:00:00', '2019-12-31 09:00:00',
               '2019-12-31 12:00:00', '2019-12-31 15:00:00',
               '2019-12-31 18:00:00', '2019-12-31 21:00:00'],
              dtype='datetime64[ns]', name='init_time', length=5442, freq=None)
ipdb>  np.unique(self)
array(['2018-01-01T00:00:00.000000000', '2018-01-01T03:00:00.000000000',
       '2018-01-01T06:00:00.000000000', ...,
       '2019-12-31T15:00:00.000000000', '2019-12-31T18:00:00.000000000',
       '2019-12-31T21:00:00.000000000'], dtype='datetime64[ns]')
ipdb>  len(np.unique(self))
5442
ipdb>  self.isna()
array([False, False, False, ..., False, False, False])
ipdb>  self.isna().any()
False
ipdb>  self._index_as_unique
True
ipdb>  self.is_unique
True
ipdb>  self._engine
<pandas._libs.index.DatetimeEngine object at 0x7fcc2279b9f0>
ipdb>  self._engine.is_unique
True
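
For comparison, here is a minimal sketch of the two cases that the `if not self._index_as_unique` check separates in single-threaded code. The 3-hourly index is a small hypothetical stand-in for the NWP init_time index, not the real data. It only shows when pandas legitimately raises this error; it does not reproduce the threading problem.

```python
import pandas as pd
from pandas.errors import InvalidIndexError

# Hypothetical 3-hourly init times standing in for the NWP `init_time` index.
init_time = pd.date_range("2019-07-30 00:00", periods=8, freq=pd.Timedelta(hours=3))
target = pd.DatetimeIndex(["2019-07-30 15:00", "2019-07-30 16:00"])

# Unique index: the forward-fill ("pad") lookup returns integer positions.
positions = init_time.get_indexer(target, method="pad")
print(positions)

# Index with one duplicated label: the same lookup is rejected.
duplicated = init_time.append(init_time[:1]).sort_values()
try:
    duplicated.get_indexer(target, method="pad")
    raised = False
except InvalidIndexError as error:
    raised = True
    print(error)
```

In the debugging session above the index really is unique, so in single-threaded code the first branch is the one that would be expected.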

@JackKelly JackKelly reopened this Jul 6, 2021
@JackKelly
Member Author

JackKelly commented Jul 6, 2021

I re-wrote

init_times = nwp_ds.data.sel(init_time=target_times_hourly, method='ffill').init_time.values

in NumPy:

indexes = np.searchsorted(self.data.init_time, target_times_hourly, side='right')
indexes -= 1  # Because searchsorted returns the index _after_ the index we want.
init_times = self.data.init_time.values[indexes]

Which works! But now we're hitting the InvalidIndexError further down the code, at nwp_selected = self.data.sel(init_time=init_time_indexer, step=step_indexer)
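
A self-contained sketch of the same searchsorted forward-fill on toy data, in case it helps to see it run in isolation. The function name `ffill_lookup` and the sample times are made up for illustration. One assumption worth flagging: if a target is earlier than the first init_time, `indexes` becomes -1 and plain indexing would silently wrap round to the last entry, so the sketch raises instead.

```python
import numpy as np


def ffill_lookup(sorted_times, targets):
    """For each target, return the latest entry of sorted_times that is <= target."""
    # side="right" gives the position just after the entry we want, hence the -1.
    indexes = np.searchsorted(sorted_times, targets, side="right") - 1
    if (indexes < 0).any():
        # Without this guard, index -1 would wrap round to the last entry.
        raise ValueError("a target is earlier than the first entry of sorted_times")
    return sorted_times[indexes]


# Hypothetical 3-hourly init times for a single day.
start = np.datetime64("2019-07-30T00", "ns")
init_times = start + np.arange(0, 24, 3) * np.timedelta64(1, "h")

targets = np.array(["2019-07-30T15:00", "2019-07-30T16:00"], dtype="datetime64[ns]")
result = ffill_lookup(init_times, targets)
print(result)
```

Both targets map to the 15:00 init time, which matches what method='ffill' would select.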

Some possible fixes:

  1. Simplify NWPDataSource._get_time_slice(). Just use the most recent init_time to start_dt.
  2. Monkey-patch Pandas' Index.is_unique.
  3. Select the NWP we want in single-threaded code. Then call DataArray.load() in multiple threads.
  4. Forget about using threads altogether!
  5. Only use threads to load data from different DataSources. i.e. each DataSource only has 1 thread to worry about.
  6. Fundamentally change the approach (!) Pre-prepare the dataset beforehand.
    1. Or maybe just pre-prepare NWPs. e.g. just 4 pixels for each PV system location, all channels.
  7. Instead of getting a single example at a time from each DataSource, implement a DataSource.get_batch() and let each DataSource decide how best to thread (or not)!
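
A rough sketch of what option (5) could look like, using only the standard library. `ToyDataSource` and `get_example_from_each_source` are hypothetical names for illustration, not the project's classes. Each call submits exactly one task per source and waits for all of them, so no source is ever touched by two threads at once.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyDataSource:
    """Hypothetical stand-in for a DataSource; not the project's class."""

    def __init__(self, name):
        self.name = name

    def get_example(self, t0_dt):
        return f"{self.name} example at {t0_dt}"


def get_example_from_each_source(executor, data_sources, t0_dt):
    # One task per source, so each source is used by at most one thread.
    futures = [executor.submit(source.get_example, t0_dt) for source in data_sources]
    # Wait for every source before returning; results keep the input order.
    return [future.result() for future in futures]


sources = [ToyDataSource("nwp"), ToyDataSource("pv"), ToyDataSource("sat")]
with ThreadPoolExecutor(max_workers=len(sources)) as executor:
    examples = get_example_from_each_source(executor, sources, "2019-01-16 14:15")
print(examples)
```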

@JackKelly
Member Author

Try (5); and use the new NWP Zarr file, to see if that speeds things up. If not, try further shrinking the NWP Zarr.

@JackKelly
Member Author

JackKelly commented Jul 6, 2021

With no threading, and loading NWPs, PV & Sat, I'm getting 1.7 seconds per iteration, which is horrible: before adding NWPs, we were getting more like 20 it/s!

Trying (5) (use one thread per DataSource)... UPDATE: Hmm, still gets same performance. I guess it's being swamped by reading enormous volumes of NWP data!

@JackKelly
Member Author

Huh. Using a smaller NWP Zarr doesn't help much.

OK. Let's go for (7). Then, in NWPDataSource, instead of 'manually' creating threads, we can lazily build the batch, and then call dask.compute() at the end of the batch. And, for SatelliteDataSource, we can go back to using threads for each example.
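
A minimal sketch of the get_batch() idea, assuming a base class whose default get_batch() is a plain loop and a subclass that overrides it with a thread pool. All class names here are hypothetical, and the lazy dask variant for NWPs is deliberately not shown; this only illustrates how each DataSource can pick its own strategy behind one interface.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyDataSource:
    """Hypothetical base class: get_batch() defaults to a plain loop."""

    def get_example(self, t0_dt):
        return (type(self).__name__, t0_dt)

    def get_batch(self, t0_dts):
        # No threads: fine when the whole dataset already sits in memory.
        return [self.get_example(t0_dt) for t0_dt in t0_dts]


class ToyThreadedDataSource(ToyDataSource):
    """Hypothetical subclass that loads the examples of a batch in threads."""

    def get_batch(self, t0_dts):
        with ThreadPoolExecutor(max_workers=4) as executor:
            # executor.map returns results in the same order as t0_dts.
            return list(executor.map(self.get_example, t0_dts))


t0_dts = ["2019-01-16 14:15", "2019-01-16 14:20", "2019-01-16 14:25"]
plain_batch = ToyDataSource().get_batch(t0_dts)
threaded_batch = ToyThreadedDataSource().get_batch(t0_dts)
print(plain_batch)
print(threaded_batch)
```

The caller only ever calls get_batch(), so swapping a source's loading strategy does not change the code that assembles batches.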

JackKelly added a commit that referenced this issue Jul 6, 2021
JackKelly added a commit that referenced this issue Jul 6, 2021
NWPDataSource lazily assembles its batch and then dask.compute() can
optimise the loading.

SatelliteDataSource doesn't use dask.  Instead it uses
ThreadPoolExecutor.

PVDataSource doesn't use dask or threads!  (Because the entire dataset
easily fits into memory!)

Tests pass.

Issues #38 #42 #20
@JackKelly
Member Author

Yay! Code seems to work OK now. Still pretty slow with NWPs (2.7 it/s). But the fix is probably a smaller NWP Zarr: see #44 and #11
