Dissertation

This dissertation seeks to use satellite imagery to predict solar photovoltaic (PV) yield - the energy from solar panels, using satellite images. I combine two data sources: satellite images and PV energy readings.

The satellite images are from EUMETSAT, a satellite which takes an image of the northern third of the Meteosat disc every five minutes. The original dataset was around 100GB, so cropped to only include Devon (100x100x170K) dataset to make training easier.

I chose some PV sites in the area of the satellite image crop, in Devon, and used the PV readings from those locations from 2020 and 2021. I then matched this up with the satellite images.

I then had a look at the images and realised that a lot were dark

So I removed the ones where the solar altitude (the angle relative to the ground of the sun in the sky) was below 10 degrees - we can see that at least over short time scales solar altitude appears correlated with PV yield.

Models

I then tested a few different neural networks, mainly comparing CNNs and ConvLSTMs. I also tested taking inputs as: satellite images only, prior PV readings only, and prior PV and satellite images. In order to predict sequences of different lengths, I padded some of the batches.

def convlstm_padded(steps_forward, x_train_sat_padded, x_train_pv_inputs_padded, y_train, 
                                   x_test_sat_padded, x_test_pv_inputs_padded, y_test, 
                                   x_validation_sat_padded, x_validation_pv_inputs_padded, y_validation, 
                                   learning_rate, dropout_rate):
  # Hyperparameters

  learning_rate = learning_rate
  dropout_rate = 0.1 
  
  # ConvLSTM model 

  inp1 = layers.Input(shape=(x_train_sat_padded.shape[1:]), name='x_sat')
  x = layers.Masking(mask_value=0, input_shape=(24, 64, 64,1))(inp1)

  inp2 = layers.Input(shape=(x_train_pv_inputs_padded.shape[1:]), name='x_pv_input')
  y = layers.Masking(mask_value=-1, input_shape=(24, 1))(inp2)

  x = layers.ConvLSTM2D(filters=32, kernel_size=(3, 3), padding='same', recurrent_dropout=dropout_rate, return_sequences=True, activation="relu")(x)
  x = layers.MaxPooling3D(pool_size=(3,3,3))(x)

  x = layers.ConvLSTM2D(filters=32, kernel_size=(3,3), recurrent_dropout=dropout_rate, return_sequences=True, activation="relu") (x)
  x = layers.MaxPooling3D(pool_size=(3,3,3))(x)

  x = layers.ConvLSTM2D(filters=32, kernel_size=(3,3), recurrent_dropout=dropout_rate, return_sequences=False, activation="relu") (x)
  x = layers.MaxPooling2D(pool_size=(3,3))(x)

  x = layers.Flatten()(x)

  x = layers.Dense(64, activation="relu")(x)
  x = layers.Dropout(dropout_rate)(x)
  x = layers.Dense(64, activation="relu")(x)
  x = layers.Dropout(dropout_rate)(x)
  x = layers.Dense(64, activation="relu")(x)

  y = layers.LSTM(48)(y)

  z = layers.concatenate([x,y], axis=-1)
  z = layers.Dense(64)(z)
  z = layers.Dropout(dropout_rate)(z)
  z = layers.Dense(steps_forward, activation="linear")(z)

  model = keras.models.Model(inputs=[inp1, inp2], outputs=z)
  model.compile(
      loss='mse', metrics=[tf.keras.metrics.RootMeanSquaredError(), 'mean_absolute_error'], optimizer=keras.optimizers.Adam(learning_rate = learning_rate),
  )

Results

I found that the ConvLSMTs outperform CNNs over longer timescales, which makes sense but has been challenged by some of the literature, e.g. Bai et al, 2018.

Predictions

It was interesting to look at some of the predictions - shown in red. On the image on the left, it's good that they seem to encode that the view is getting brighter or darker - as they trend up or down. I think they probably tend to revert to the mean - on the sunnier day in the middle, the network keeps wanting to trend down.

Interpretability

I wondered what the neural networks are actually learning - so I visualised the activations of a neural network over a particularly bright image. The three images below show the three layers of convolutions. Each image within each grid is the activations from a single filter.

Some of the first layer appear to be recognising land, or sea, or maybe the edges. Then the second layer possibly builds up more general shapes, and the third appears to have highest activation around long edges or coastlines.

I wondered whether what I was getting was a model which just recognised Devon - because a clear image in relief would indicate a clear sky and not much PV, with high relief between land and sea indicating the sun is high in the sky. A next step is to test on bits of land with no coastlines/sea and to see what happens!

Name		Name	Last commit message	Last commit date
Latest commit History 67 Commits
OCF_docs		OCF_docs
images		images
models		models
papers		papers
.gitignore		.gitignore
README.md		README.md
UCL_Dissertation_final.pdf		UCL_Dissertation_final.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Dissertation

Models

Results

Predictions

Interpretability

About

Uh oh!

Releases

Packages

Languages

bndxn/dissertation

Folders and files

Latest commit

History

Repository files navigation

Dissertation

Models

Results

Predictions

Interpretability

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages