Description
I have encountered an insidious bug in the outputs of `model.predict`. It seems that the coordinates of the output mean and std Datasets can be slightly different from those of the original Dataset passed as `X_t` to define the target grid. In my case, I consistently get three longitude values that differ by 2e-6 from those of the `X_t` dataset.
The problem is that this difference is significant enough that when I compute operations between the two datasets (e.g. `err_ds = mean_ds - truth_ds`), those coordinates are not treated as aligned and are dropped altogether, leaving `err_ds` with missing columns. The bug does not raise any errors, so the unexpected behaviour can be hard to notice.
I've looked into it a bit, and the problem originates in the normalise/unnormalise steps: going from lat/lon to x1/x2 and back produces lat/lon values that are slightly different from the originals. I guess it's a numerical precision issue, so I'm not sure whether it is fixable directly...
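As an illustration only (this is not DeepSensor's actual normalisation code), one way a discrepancy of roughly this magnitude can arise is if the normalised coordinate passes through float32 precision somewhere in the round trip:

```python
import numpy as np

# Hypothetical min-max normalise/unnormalise round trip.
lon = np.float64(10.25)
lon_min, lon_max = -90.0, 40.0

# If the normalised coordinate is stored in float32 at any point,
# the round trip no longer recovers the original value exactly.
x2 = np.float32((lon - lon_min) / (lon_max - lon_min))
lon_back = np.float64(x2) * (lon_max - lon_min) + lon_min

print(lon_back - lon)  # residual on the order of 1e-6
```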
A workaround might be to use `Dataset.assign_coords` with the original `X_t`'s coordinates at the end of `model.predict`, instead of using `data_processor.unnormalise` (when appropriate, i.e. when `resolution_factor == 1`).
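A user-side sketch of that workaround (assuming `resolution_factor == 1`, so the prediction grid has the same shape and ordering as `X_t`; variable names are illustrative):

```python
# Re-attach the exact target coordinates so later xarray arithmetic aligns.
mean_ds = mean_ds.assign_coords(lat=X_t["lat"].values, lon=X_t["lon"].values)
std_ds = std_ds.assign_coords(lat=X_t["lat"].values, lon=X_t["lon"].values)

# The subtraction no longer silently drops mismatched coordinates.
err_ds = mean_ds - truth_ds
```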