This repository was archived by the owner on Sep 11, 2023. It is now read-only.

Ingest numerical weather prediction data (NWP) #3

Closed
5 tasks done
JackKelly opened this issue May 7, 2021 · 9 comments

@JackKelly
Member

JackKelly commented May 7, 2021

Use temperature at surface, precipitation, irradiance, cloud fraction, accumulated snow cover.

  • Finish NWPDataSource
  • Resample to 5-minutely
  • Standardise
  • Convert to float32
  • Plot timeseries data just before data goes into ML model
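The "Standardise" and "Convert to float32" tasks can be sketched together as below. This is only a minimal illustration, not the code from NWPDataSource: the `standardise` helper and the example temperatures are hypothetical, and in practice the mean and standard deviation would presumably be computed once per NWP variable over the training data rather than per array.

```python
import numpy as np


def standardise(values: np.ndarray, mean: float, std: float) -> np.ndarray:
    """Return (values - mean) / std, cast to float32."""
    if std <= 0:
        raise ValueError("std must be positive")
    return ((values - mean) / std).astype(np.float32)


# Hypothetical surface temperatures in kelvin.
temperature = np.array([270.0, 280.0, 290.0])
mean = float(temperature.mean())
std = float(temperature.std())
standardised = standardise(temperature, mean, std)
```

Doing the arithmetic in float64 and casting at the end keeps the rounding error small while still halving the memory that reaches the model.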
@JackKelly
Member Author

todo: finish get_nwp_for_datetime() (just under PyTorch dataset heading)

@JackKelly
Member Author

Largely done now. Need to finish off NWPDataSource.

If loading turns out to be too slow, maybe just pre-load the temperature data for each PV system. But that's not very flexible. Probably better to find a good way to load in a separate thread.
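One way the "load in a separate thread" idea could look, using only the standard library. This is a sketch under assumptions: `prefetch` and the `load` callable are hypothetical names, and the real loader would read NWP data rather than multiply integers.

```python
import queue
import threading

_DONE = object()  # marks the end of the stream


def prefetch(keys, load, buffer_size=2):
    """Yield (key, load(key)) pairs, loading ahead in a background thread."""
    results = queue.Queue(maxsize=buffer_size)

    def worker():
        try:
            for key in keys:
                # Blocks when the buffer is full, so at most
                # buffer_size items are loaded ahead of the consumer.
                results.put((key, load(key), None))
        except Exception as exc:
            # Hand the error to the consumer instead of losing it.
            results.put((None, None, exc))
        finally:
            results.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()

    while True:
        item = results.get()
        if item is _DONE:
            return
        key, value, error = item
        if error is not None:
            raise error
        yield key, value


# Stand-in for an expensive NWP load.
loaded = list(prefetch([1, 2, 3], lambda key: key * 10))
```

The bounded queue is the design choice that matters here: it caps memory use while still letting the next load overlap with whatever the consumer is doing.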

JackKelly referenced this issue in openclimatefix/predict_pv_yield May 10, 2021
@JackKelly
Member Author

Resample to 5-minutely and interpolate, whilst making sure the resulting target_datetimes are correct. Need to make sure there's always data available at the start and end for the interpolation.

Maybe try resampling to 5-minutely as soon as we load the data, but that probably won't work. So we probably need to resample after creating a single target_datetime index for the NWPs.

Also need to throw an error if there isn't data for a particular datetime. And/or filter datetimes before training.
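The check described above (data available on both sides of every target datetime, with an error or a filter otherwise) might be sketched like this. The function names and the `max_gap` parameter are assumptions made for illustration; the source does not say how the check was implemented.

```python
import numpy as np


def has_bracketing_data(targets, nwp_datetimes, max_gap):
    """True where a target has NWP data at or before it AND at or after it,
    with the two bracketing NWP datetimes no more than max_gap apart."""
    nwp = np.sort(nwp_datetimes)
    # Index of the last NWP datetime <= target (-1 if none).
    left = np.searchsorted(nwp, targets, side="right") - 1
    # Index of the first NWP datetime >= target (len(nwp) if none).
    right = np.searchsorted(nwp, targets, side="left")
    in_range = (left >= 0) & (right < len(nwp))
    ok = np.zeros(len(targets), dtype=bool)
    ok[in_range] = (nwp[right[in_range]] - nwp[left[in_range]]) <= max_gap
    return ok


def require_data(targets, nwp_datetimes, max_gap):
    """Raise KeyError if any target cannot be interpolated."""
    ok = has_bracketing_data(targets, nwp_datetimes, max_gap)
    if not ok.all():
        raise KeyError(f"no NWP data around {targets[~ok][0]}")


# Hypothetical hourly NWP datetimes with 02:00 missing.
nwp_datetimes = np.array(
    ["2021-05-07T00:00", "2021-05-07T01:00", "2021-05-07T03:00"],
    dtype="datetime64[m]",
)
targets = np.array(
    ["2021-05-06T23:55", "2021-05-07T00:30", "2021-05-07T01:30",
     "2021-05-07T03:00", "2021-05-07T03:05"],
    dtype="datetime64[m]",
)
mask = has_bracketing_data(targets, nwp_datetimes, np.timedelta64(60, "m"))
valid_targets = targets[mask]  # filter before training
```

Filtering with the mask before training and calling `require_data` at load time are complementary: the first avoids bad datetimes, the second makes any that slip through fail loudly.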

@JackKelly
Member Author

Done with resampling just using xarray's built-in resampling :)

@JackKelly
Member Author

Actually, no, using the 'simple approach' is an order of magnitude slower (1.8 seconds vs about 180 ms) and produces discontinuities between different NWP inits. I'll interpolate manually...
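The manual interpolation itself is not shown in this thread. One plausible shape for it, assuming the idea is to interpolate along the forecast steps of a single NWP init so that values from different inits are never blended, is sketched below. `interpolate_single_init` and the example values are hypothetical.

```python
import numpy as np


def interpolate_single_init(step_minutes, values, target_minutes):
    """Linearly interpolate one NWP init run onto finer step offsets.

    step_minutes must be increasing; targets outside the available
    steps raise an error instead of being silently extrapolated.
    """
    step_minutes = np.asarray(step_minutes, dtype=np.float64)
    target_minutes = np.asarray(target_minutes, dtype=np.float64)
    if (target_minutes.min() < step_minutes[0]
            or target_minutes.max() > step_minutes[-1]):
        raise ValueError("target steps fall outside the forecast steps")
    return np.interp(target_minutes, step_minutes, values).astype(np.float32)


# Hypothetical hourly temperatures (kelvin) from one init run.
hourly_steps = [0, 60, 120]
hourly_temperature = [280.0, 283.0, 282.0]
five_minutely_steps = np.arange(0, 121, 5)
five_minutely = interpolate_single_init(
    hourly_steps, hourly_temperature, five_minutely_steps
)
```

Because each call sees only one init run, a jump between consecutive inits can never leak into the interpolated values within a run.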

JackKelly referenced this issue in openclimatefix/predict_pv_yield May 11, 2021
…introduces discontinuities between NWP inits. #4
JackKelly referenced this issue in openclimatefix/predict_pv_yield May 11, 2021
@JackKelly
Member Author

New approach for getting NWPs is way faster (2 ms)!

JackKelly referenced this issue in openclimatefix/predict_pv_yield May 11, 2021
@JackKelly
Member Author

If the pesky `for batch in dataloader:` thing still doesn't work after restarting the kernel, each worker probably needs to load the NWP Zarr itself.
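A common pattern for "each worker loads the store itself" is to open the data lazily on first access, so that every process ends up with its own handle instead of sharing one created in the parent. The sketch below shows the pattern without PyTorch or Zarr: `LazyNWPDataset` and `open_fake_nwp` are hypothetical, and in the real code the opener would presumably wrap the call that opens the NWP Zarr.

```python
import os
from typing import Any, Callable, Optional


class LazyNWPDataset:
    """Map-style dataset that opens its data source on first access."""

    def __init__(self, open_nwp: Callable[[], Any], n_examples: int):
        self._open_nwp = open_nwp
        self._n_examples = n_examples
        self._nwp: Optional[Any] = None
        self._owner_pid: Optional[int] = None

    def _get_nwp(self) -> Any:
        pid = os.getpid()
        # Re-open if this object was copied into a different process.
        if self._nwp is None or self._owner_pid != pid:
            self._nwp = self._open_nwp()
            self._owner_pid = pid
        return self._nwp

    def __len__(self) -> int:
        return self._n_examples

    def __getitem__(self, index: int) -> Any:
        return self._get_nwp()[index]


open_calls = []


def open_fake_nwp():
    """Stand-in for opening the NWP store; records each open."""
    open_calls.append(os.getpid())
    return [10.0, 11.0, 12.0]


dataset = LazyNWPDataset(open_fake_nwp, n_examples=3)
first = dataset[0]
second = dataset[1]
```

Nothing is opened in `__init__`, so constructing the dataset in the notebook process stays cheap and the object carries no open handle when it is handed to worker processes.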

@JackKelly
Member Author

OK! The code runs past `for batch in dataloader` now. But it now has a problem in `get_nwp_example`: `nwp_selected = nwp.sel(init_time=init_time_indexer, step=step_indexer)` fails with "not all values found in index 'step'". So we probably need to pre-compute the correct datetimes, or handle the KeyError.
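Pre-computing the init time and step for a target datetime, and failing with a clear message when the step is not in the index, could look roughly like this. `select_init_and_step` and the example init times and steps are assumptions for illustration, not the code in get_nwp_example.

```python
import numpy as np


def select_init_and_step(target, init_times, steps):
    """Return (init_time, step) such that init_time + step == target.

    Uses the most recent init at or before target. Raises KeyError if
    there is no such init or the resulting step is not an available step.
    """
    init_times = np.sort(init_times)
    position = np.searchsorted(init_times, target, side="right") - 1
    if position < 0:
        raise KeyError(f"no NWP init at or before {target}")
    init_time = init_times[position]
    step = target - init_time
    if not (steps == step).any():
        raise KeyError(f"step {step} from init {init_time} not in index 'step'")
    return init_time, step


# Hypothetical 3-hourly inits with hourly forecast steps.
init_times = np.array(
    ["2021-05-11T00:00", "2021-05-11T03:00"], dtype="datetime64[m]"
)
steps = np.array([0, 60, 120], dtype="timedelta64[m]")

init_time, step = select_init_and_step(
    np.datetime64("2021-05-11T04:00"), init_times, steps
)
```

Running this over every candidate datetime before training gives the list of datetimes that can safely be passed to a label-based selection, which is the "pre-compute" option; wrapping the call in `try`/`except KeyError` is the "handle" option.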

JackKelly referenced this issue in openclimatefix/predict_pv_yield May 11, 2021
@JackKelly
Member Author

JackKelly commented May 12, 2021

Code appears to be running now, but is super-slow (with 1 worker, it's now 15 seconds per iteration; it used to be more like 40 iterations per second!).

Still need to standardise and convert to float32.

Ideas for speeding up in openclimatefix/predict_pv_yield_2#18

JackKelly referenced this issue in openclimatefix/predict_pv_yield May 12, 2021
JackKelly referenced this issue in openclimatefix/predict_pv_yield May 14, 2021
…Ps are still a little slow though (#24).  About to try resampling 'step' in load_single_chunk()
@JackKelly JackKelly transferred this issue from openclimatefix/predict_pv_yield Jun 10, 2021
JackKelly added a commit that referenced this issue Jul 5, 2021