-
Notifications
You must be signed in to change notification settings - Fork 208
[ENH] introduce revised version of ETS #2834
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thank you for contributing to
|
output of the numba identical and runtime timing for main and this version def numba_compare():
n =100
import time
import aeon.forecasting._ets as ets
import aeon.forecasting._numba_ets as etsfast
arr = np.random.rand(n)
slow_ets = ets.ETSForecaster()
fast_ets = etsfast.ETSForecaster()
slow_ets.fit(arr)
fast_ets.fit(arr)
a1 = slow_ets.predict()
a2 = fast_ets.predict()
print(f" Slow = {a1} fast = {a2}")
for n in range(100000,10000000,100000):
arr = np.random.rand(n)
ets_slow_time = time.time()
slow_ets.fit(arr)
ets_slow_time = time.time()-ets_slow_time
ets_fast_time = time.time()
fast_ets.fit(arr)
ets_fast_time = time.time()-ets_fast_time
ets_slow_pred = time.time()
a1=slow_ets.predict()
ets_slow_pred = time.time()-ets_slow_pred
ets_fast_pred = time.time()
a2=fast_ets.predict()
ets_fast_pred = time.time()-ets_fast_pred
equal = np.isclose(a1,a2, 4)
print(f"{n},{ets_slow_time}, {ets_fast_time},{ets_slow_pred},,{ets_fast_pred}"
f",{equal}") # noqa |
series length, train time, output the same <style> </style>
|
much faster than statsmodels version, as equivalent as we can make it def statsmodels_compare(setup_func, random_seed, catch_errors):
"""Run both our statsforecast and our implementation and crosschecks."""
import warnings
warnings.filterwarnings("ignore")
random.seed(random_seed)
(
y,
m,
error,
trendtype,
seasontype,
alpha,
beta,
gamma,
phi,
) = setup_func()
# tsml-eval implementation
start = time.perf_counter()
f1 = etsfast.ETSForecaster(
error,
trendtype,
seasontype,
m,
alpha,
beta,
gamma,
phi,
1,
)
for n in range(10000,1000000,10000):
y = np.random.rand(n)
start = time.perf_counter()
f1 = etsfast.ETSForecaster(
error,
trendtype,
seasontype,
m,
alpha,
beta,
gamma,
phi,
1,
)
f1.fit(y)
aeon_time = time.perf_counter()-start
from statsmodels.tsa.holtwinters import ExponentialSmoothing
start = time.perf_counter()
ets_model = ExponentialSmoothing(
y[m:],
trend="add" if trendtype == 1 else "mul" if trendtype == 2 else None,
damped_trend=(phi != 1 and trendtype != 0),
seasonal="add" if seasontype == 1 else "mul" if seasontype == 2 else None,
seasonal_periods=m if seasontype != 0 else None,
initialization_method="known",
initial_level=f1.level_,
initial_trend=f1.trend_ if trendtype != 0 else None,
initial_seasonal=f1.seasonality_ if seasontype != 0 else None,
)
results = ets_model.fit(
smoothing_level=alpha,
smoothing_trend=(
beta / alpha if trendtype != 0 else None
), # statsmodels uses beta*=beta/alpha
smoothing_seasonal=gamma if seasontype != 0 else None,
damping_trend=phi if trendtype != 0 else None,
optimized=False,
)
sm_time = time.perf_counter()-start
print(f"{n},{aeon_time}, {sm_time:0.20f}") # noqa |
length, aeon, statsmodels <style> </style>
|
it cannot be compared to the statsforecast model, because you cannot set alpha, beta, you can only auto tune them I did though, its much faster because it doesnt tune :) ``python def statsforecast_compare(setup_func, random_seed, catch_errors):
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current version appears to use numba already? Not quite sure what the code changes are. It does not appear to be just numbafying what is current there at least.
aeon/forecasting/_ets.py
Outdated
) | ||
|
||
|
||
@njit(nogil=NOGIL, cache=CACHE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are these vars? we dont use them anywhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were in the statsforecast version that this was based on, and I think I was messing with them to try and get it to produce the same output. nogil is set to the default value, so isn't necessary really, the cache one is not though, and apparently it speeds up compilation times by using the previously compiled function if available, so probably good to leave in!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll sere what happens if I change them to our defaults
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switched to fasthmath in numba for consistency with the rest of the package, it doesn;t seem to make a difference. Those vars were just constant booleans at the top of the file, but I dont think we need them so have gone to just True and False, again to match the package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is any statsforecast code remaining? If so IMO it would be best to properly attribute it i.e. https://github.com/aeon-toolkit/aeon/pull/2748/files or att a note that this was inspired by that implementation.
@alexbanwell1 could you look at this PR and address @MatthewMiddlehurst comments? |
@MatthewMiddlehurst this is just a reboot from alex's big PR, overrides previous version |
I'm not sure which variables we should make private ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just going to trust the with the fit/transform changes. All experimental anyway. Few more parameter suggestions.
Re: attributes, whatever you want people to be able to access and are willing to document really 🙂. Don't think there are any strict rules
I have changed it so that you can input either strings or ints for error, trend and seasonality, these are now validated in fit and converted to ints for numba efficiency. Added a test of this conversion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great ! just some minor docs stuff from my side
Either NONE (0), ADDITIVE (1) or MULTIPLICATIVE (2). | ||
seasonality_type : int, default = 0 | ||
Either NONE (0), ADDITIVE (1) or MULTIPLICATIVE (2). | ||
error_type : string or in default='additive' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo "int" not "in"
Type of error model: 'additive' (0) or 'multiplicative' (1) | ||
trend_type : string, int or None, default=None | ||
Type of trend component: None (0), `additive' (1) or 'multiplicative' (2) | ||
seasonality_type : string or None, default=None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
string, int or None no ? not only string or None
trend_type: int = NONE, | ||
seasonality_type: int = NONE, | ||
error_type: Union[int, str] = 1, | ||
trend_type: Union[int, str, None] = 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if defaults here are ints it should be the same as in docs, because now one is string and one is int
part of #2833
this replaces the current ETS with a numba version that is refactored. It is tested for equivalence, same tests pass and speed up over statsmodels is significant, cannot sensibly compare to statsforecast because that is tuning only, so obvs does more work, can compare that to autoETS
The first version was for some reason predicting an array rather than a float. Its still not clear to me how we use this with fit and predict, we might need a wrapper for an example. Basically fitting the model is so amazingly quick, refitting for each forecast is I believe worthwhile for interface compliance.