This repository was archived by the owner on Sep 11, 2023. It is now read-only.
Simplify the calculation of available datetimes across all DataSources #204
Closed
Description
How it's done now
To calculate the datetimes available across all the `DataSource`s, `NowcastingDataModule._get_datetimes()` currently does something like this:
- Calls `DataSource.datetime_index()` on each `DataSource`. This function returns a list of all the available datetimes (in the `DataSource`'s native sample period, e.g. 5 minutes for PV data; 30 minutes for GSP data).
- Interpolates 30-minute data to 5 minutes (using `nowcasting_dataset.time.fill_30_minutes_timestamps_to_5_minutes()`).
- Finds the intersection of all these 5-minutely timestamps.
- Calculates the `t0` datetimes (using `nowcasting_dataset.time.get_t0_datetimes()`).
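For illustration, the upsample-and-intersect part of the steps above might look roughly like the sketch below. This is not the code from `nowcasting_dataset`: `fill_30_min_to_5_min` and `intersect_all` are hypothetical stand-ins written with plain pandas, and the final `t0` calculation is left out.

```python
from functools import reduce

import pandas as pd


def fill_30_min_to_5_min(index: pd.DatetimeIndex) -> pd.DatetimeIndex:
    """Hypothetical stand-in: expand each 30-minute timestamp into the six
    5-minute timestamps that start at it."""
    expanded = [
        ts + pd.Timedelta(minutes=m) for ts in index for m in range(0, 30, 5)
    ]
    return pd.DatetimeIndex(expanded).unique().sort_values()


def intersect_all(indexes: list) -> pd.DatetimeIndex:
    """Intersection of several DatetimeIndexes (hypothetical helper)."""
    return reduce(lambda a, b: a.intersection(b), indexes)


# Made-up example data: one 5-minutely source and one 30-minutely source.
pv_datetimes = pd.date_range("2021-01-01 00:00", "2021-01-01 01:00", freq="5min")
gsp_datetimes = pd.date_range("2021-01-01 00:30", "2021-01-01 01:00", freq="30min")

common_datetimes = intersect_all(
    [pv_datetimes, fill_30_min_to_5_min(gsp_datetimes)]
)
```

Every source has to be brought to the same 5-minute grid before the intersection makes sense, which is part of why the current approach needs several separate steps.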
Proposal for how to simplify this
The code could be simplified, and the execution time reduced, if we changed this to something like this:
- Implement a new function, `DataSource.get_t0_datetimes(history_minutes, forecast_minutes)`, which would find all the contiguous sequences at least `history_minutes + forecast_minutes` long, and then return all the `t0` datetimes within those contiguous sequences. `DataSource`s which are affected by sunlight (such as the PV and GSP `DataSource`s) would only return datetimes in daylight.
- Then `NowcastingDataModule._get_datetimes()` has a much simpler job: it just has to call `DataSource.get_t0_datetimes` on each `DataSource`, and compute the intersection. That's it 🙂
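
A minimal sketch of what the proposal could look like, assuming each `DataSource` can supply its available datetimes as a pandas `DatetimeIndex`. The free function `get_t0_datetimes` below, its `sample_period` parameter, and the example data are all assumptions made for illustration; the daylight filtering is omitted, and both example sources use a 5-minute sample period to keep the intersection simple.

```python
from functools import reduce

import pandas as pd


def get_t0_datetimes(
    datetimes: pd.DatetimeIndex,
    history_minutes: int,
    forecast_minutes: int,
    sample_period: pd.Timedelta = pd.Timedelta(minutes=5),  # assumed default
) -> pd.DatetimeIndex:
    """Return every t0 that has `history_minutes` of contiguous data before it
    and `forecast_minutes` of contiguous data after it (hypothetical sketch)."""
    datetimes = pd.DatetimeIndex(datetimes).unique().sort_values()
    history = pd.Timedelta(minutes=history_minutes)
    forecast = pd.Timedelta(minutes=forecast_minutes)

    # A new contiguous segment starts wherever the gap to the previous
    # timestamp is larger than the sample period.
    series = datetimes.to_series()
    segment_ids = (series.diff() > sample_period).cumsum().to_numpy()

    t0_datetimes = []
    for _, segment in series.groupby(segment_ids):
        start, end = segment.iloc[0], segment.iloc[-1]
        # Keep only segments at least history + forecast long.
        if end - start >= history + forecast:
            t0_datetimes.extend(
                pd.date_range(start + history, end - forecast, freq=sample_period)
            )
    return pd.DatetimeIndex(t0_datetimes)


# Made-up example: source A has a gap at 01:00, source B is complete.
full = pd.date_range("2021-01-01 00:00", "2021-01-01 02:00", freq="5min")
source_a = full.drop(pd.Timestamp("2021-01-01 01:00"))
source_b = full

t0_per_source = [
    get_t0_datetimes(dt, history_minutes=30, forecast_minutes=20)
    for dt in (source_a, source_b)
]
# The data module's job then reduces to a single intersection.
t0_common = reduce(lambda a, b: a.intersection(b), t0_per_source)
```

With this shape, the per-source logic (gaps, sample period, and eventually daylight) stays inside each `DataSource`, and `_get_datetimes()` only needs the final intersection.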