This repository was archived by the owner on Sep 11, 2023. It is now read-only.

Simplify the calculation of available datetimes across all DataSources #204

Closed
@JackKelly

Description

How it's done now

To calculate the datetimes available across all the DataSources, NowcastingDataModule._get_datetimes() currently does something like this:

  1. Calls DataSource.datetime_index() on each DataSource. This function returns a list of all the available datetimes at the DataSource's native sample period (e.g. 5 minutes for PV data; 30 minutes for GSP data).
  2. Interpolates 30-minute data to 5 minutes (using nowcasting_dataset.time.fill_30_minutes_timestamps_to_5_minutes()).
  3. Finds the intersection of all these 5-minutely timestamps.
  4. Calculates the t0 datetimes (using nowcasting_dataset.time.get_t0_datetimes()).
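
To make steps 2 and 3 concrete, here is a rough sketch in plain pandas. It is not the code in nowcasting_dataset: `expand_30min_to_5min` and `intersect_all` are hypothetical names, and the assumption that each 30-minute timestamp covers the six 5-minute timestamps starting at it may not match what fill_30_minutes_timestamps_to_5_minutes() really does.

```python
from functools import reduce

import pandas as pd


def expand_30min_to_5min(index_30min: pd.DatetimeIndex) -> pd.DatetimeIndex:
    """Hypothetical stand-in for step 2.

    Assumption: each 30-minute timestamp is expanded into the six 5-minute
    timestamps that start at it (offsets of 0, 5, ..., 25 minutes).
    """
    offsets = pd.timedelta_range(start="0min", periods=6, freq="5min")
    expanded = [ts + offset for ts in index_30min for offset in offsets]
    return pd.DatetimeIndex(expanded).unique().sort_values()


def intersect_all(indexes: list[pd.DatetimeIndex]) -> pd.DatetimeIndex:
    """Step 3: keep only the timestamps present in every index."""
    return reduce(lambda a, b: a.intersection(b), indexes)
```

Every DataSource has to be brought onto the 5-minute grid before the intersection, which is the part the proposal below tries to avoid.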

Proposal for how to simplify this

The code could be simplified, and the execution time sped up, if we changed this to something like this:

  1. Implement a new function, DataSource.get_t0_datetimes(history_minutes, forecast_minutes), which would find all the contiguous sequences at least history_minutes + forecast_minutes long, and then return all the t0 datetimes within those contiguous sequences. DataSources which are affected by sunlight (such as the PV and GSP DataSources) would only return datetimes in daylight.
  2. Then NowcastingDataModule._get_datetimes() has a much simpler job: it just has to call DataSource.get_t0_datetimes() on each DataSource, and compute the intersection. That's it 🙂
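
A minimal sketch of what the proposal could look like, written as free functions over a pandas DatetimeIndex rather than as DataSource methods. The `sample_period` parameter and the `intersect_t0` helper are assumptions added for illustration, the daylight filtering is left out, and the intersection only makes sense if every DataSource reports t0 datetimes on the same time grid.

```python
from functools import reduce

import pandas as pd


def get_t0_datetimes(
    datetimes: pd.DatetimeIndex,
    sample_period: pd.Timedelta,
    history_minutes: int,
    forecast_minutes: int,
) -> pd.DatetimeIndex:
    """Return every t0 with enough contiguous history and forecast around it."""
    datetimes = pd.DatetimeIndex(datetimes).unique().sort_values()
    if len(datetimes) == 0:
        return datetimes
    history = pd.Timedelta(minutes=history_minutes)
    forecast = pd.Timedelta(minutes=forecast_minutes)

    series = datetimes.to_series()
    # A new contiguous segment starts wherever the gap exceeds the sample period.
    segment_ids = (series.diff() > sample_period).cumsum()

    pieces = []
    for _, segment in series.groupby(segment_ids):
        first_t0 = segment.iloc[0] + history
        last_t0 = segment.iloc[-1] - forecast
        # Segments shorter than history + forecast yield an empty selection.
        pieces.append(segment[(segment >= first_t0) & (segment <= last_t0)])
    return pd.DatetimeIndex(pd.concat(pieces))


def intersect_t0(t0_per_source: list[pd.DatetimeIndex]) -> pd.DatetimeIndex:
    """Hypothetical helper: t0 datetimes that are valid for every DataSource."""
    return reduce(lambda a, b: a.intersection(b), t0_per_source)
```

For example, with 5-minute data from 00:00 to 02:00 that is missing 01:00, history_minutes=30 and forecast_minutes=15, the first segment gives t0 datetimes 00:30 to 00:40 and the second gives 01:35 to 01:45.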
