You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The args list help entries for the canned forecasters, including ones that don't use lags, is:
nafill_buffer: At predict time, recent values of the training data are
used to create a forecast. However, these can be 'NA' due to,
e.g., data latency issues. By default, any missing values
will get filled with less recent data. Setting this value to
'NULL' will result in 1 extra recent row (beyond those
required for lag creation) to be used. Note that we require
at least 'min(lags)' rows of recent data per 'geo_value' to
create a prediction. For this reason, setting 'nafill_buffer
< min(lags)' will be treated as _additional_ allowed recent
data rather than the total amount of recent data to examine.
This confused me, and I think there may be some implementation bug(s).
Requests:
State the default value, Inf, and what it does.
State what "regular" values (non-NULL, not < min(lags)) do.
"Setting this value to 'NULL' will result in 1 extra recent row (beyond those required for lag creation) to be used." ---
An extra row from where, for what? --- I think I understand now after a while of thinking about the name and looking at the implementation, but things could be clearer.
Why would this be expected to work so much that it's a special value?
It looks like "those required for lag creation" corresponds to max_lags + max_horizon, but factors into keep in x <- dplyr::filter(x, forecast_date - time_value <= keep), which
if forecast_date is the default max time_value, seems like more than is required for lag creation.
if forecast_date has been passed as the actual forecast date, makes it seem that max_horizon should be something around the reporting latency instead, although maybe this is off by one since keep is at minimum min_required + 1 not min_required.
(Also, this type of filter calls into question the naming n_recent.)
Actually, trying to use this, I get
out<- arx_forecaster(
case_death_rate_subset, "death_rate",
"death_rate",
args_list= arx_args_list(
ahead=16,
lags=7,
n_training=30,
nafill_buffer=NULL
)
)
#> Error in if (is.finite(nafill_buffer)) arg_is_pos_int(nafill_buffer, allow_null = TRUE) : #> argument is of length zero
("For this reason, setting 'nafill_buffer < min(lags)' will be treated as additional allowed recent data rather than the total amount of recent data to examine" --- still sounds like a footgun to have < min(lags) and >= min(lags) do different things.)
Note/remind that this NA filling is not used in the training process?
For forecasters that don't use lags, provide an explanation that doesn't mention lags?
The text was updated successfully, but these errors were encountered:
Uh oh!
There was an error while loading. Please reload this page.
The args list help entries for the canned forecasters, including ones that don't use lags, is:
This confused me, and I think there may be some implementation bug(s).
Requests:
max_lags + max_horizon
, but factors intokeep
inx <- dplyr::filter(x, forecast_date - time_value <= keep)
, whichforecast_date
is the default maxtime_value
, seems like more than is required for lag creation.forecast_date
has been passed as the actual forecast date, makes it seem thatmax_horizon
should be something around the reporting latency instead, although maybe this is off by one sincekeep
is at minimummin_required + 1
notmin_required
.n_recent
.)The text was updated successfully, but these errors were encountered: