Forecaster and MVForecaster Attributes

  • You can see which metrics and estimators are available by instantiating a Forecaster or MVForecaster object and checking its attributes. This notebook shows how to do that with Forecaster only, but the same attributes exist in MVForecaster unless specified otherwise.

[1]:
from scalecast.Forecaster import Forecaster
from scalecast.MVForecaster import MVForecaster

f = Forecaster(
    y = [1,2,3,4], # required
    current_dates = ['2021-01-01','2021-02-01','2021-03-01','2021-04-01'], # required, can be a numbered index if dates not known/needed
    future_dates = None, # optional. this accepts an int type that counts the forecast horizon steps. future dates can be generated after the object is initiated.
    test_length = 0, # default is 0, but this accepts int or float types to determine the number/fraction of obs to hold out for model testing
    cis = False, # default is False, change to True if you want confidence intervals. requires a test set.
    metrics = [
        'rmse', # default
        'mape', # default
        'mae', # default
        'r2', # default
        'smape',
        'mse',
        'abias',
    ],
)

Forecaster.estimators

  • These are the models that forecast and can be set by using f.set_estimator(...).

  • They come from popular machine learning libraries like scikit-learn, keras, statsmodels, and others.

  • More estimators can be added, assuming they follow a basic sklearn API, by using the Forecaster.add_sklearn_estimator() function.
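As a rough illustration of what "a basic sklearn API" means, the sketch below defines a toy regressor with the fit/predict pair that scikit-learn regressors expose. The class name MeanRegressor is hypothetical, and exactly which methods scalecast requires of a custom estimator is an assumption here, so check the docs before registering a class like this.

```python
class MeanRegressor:
    """Toy regressor (hypothetical): always predicts the mean of the training targets."""

    def fit(self, X, y):
        # X is a sequence of feature rows, y holds the matching targets.
        if len(X) != len(y):
            raise ValueError("X and y must have the same number of rows")
        self.mean_ = sum(y) / len(y)
        return self  # returning self mirrors the scikit-learn convention

    def predict(self, X):
        # Return one prediction per input row.
        return [self.mean_ for _ in X]


model = MeanRegressor().fit([[1], [2], [3], [4]], [1, 2, 3, 4])
predictions = model.predict([[5], [6]])
print(predictions)
```

Any regressor from scikit-learn or a compatible library already has this shape, which is presumably why such classes can be registered without extra wrapping.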

[2]:
print(*f.estimators,sep='\n')
Estimator(name='catboost', imported_model=<class 'catboost.core.CatBoostRegressor'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='elasticnet', imported_model=<class 'sklearn.linear_model._coordinate_descent.ElasticNet'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='gbt', imported_model=<class 'sklearn.ensemble._gb.GradientBoostingRegressor'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='knn', imported_model=<class 'sklearn.neighbors._regression.KNeighborsRegressor'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='lasso', imported_model=<class 'sklearn.linear_model._coordinate_descent.Lasso'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='mlp', imported_model=<class 'sklearn.neural_network._multilayer_perceptron.MLPRegressor'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='mlr', imported_model=<class 'sklearn.linear_model._base.LinearRegression'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='rf', imported_model=<class 'sklearn.ensemble._forest.RandomForestRegressor'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='ridge', imported_model=<class 'sklearn.linear_model._ridge.Ridge'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='sgd', imported_model=<class 'sklearn.linear_model._stochastic_gradient.SGDRegressor'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='svr', imported_model=<class 'sklearn.svm._classes.SVR'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='xgboost', imported_model=<class 'xgboost.sklearn.XGBRegressor'>, interpreted_model=<class 'scalecast.models.SKLearnUni'>)
Estimator(name='arima', imported_model='auto', interpreted_model=<class 'scalecast.models.ARIMA'>)
Estimator(name='hwes', imported_model='auto', interpreted_model=<class 'scalecast.models.HWES'>)
Estimator(name='prophet', imported_model='auto', interpreted_model=<class 'scalecast.models.Prophet'>)
Estimator(name='rnn', imported_model='auto', interpreted_model=<class 'scalecast.models.RNN'>)
Estimator(name='lstm', imported_model='auto', interpreted_model=<class 'scalecast.models.LSTM'>)
Estimator(name='naive', imported_model='auto', interpreted_model=<class 'scalecast.models.Naive'>)
Estimator(name='tbats', imported_model='auto', interpreted_model=<class 'scalecast.models.TBATS'>)
Estimator(name='theta', imported_model='auto', interpreted_model=<class 'scalecast.models.Theta'>)
Estimator(name='combo', imported_model='auto', interpreted_model=<class 'scalecast.models.Combo'>)

Forecaster.metrics

  • These are all the metrics available for use when optimizing models.

  • All metrics from the Metrics class that accept only two arguments are available. They can be passed when instantiating the object or set later with Forecaster.set_metrics().

  • Custom metrics and metric functions are also accepted, as long as they take only two arguments (an array of actuals and an array of forecasted values).
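A minimal sketch of a custom metric with the two-argument shape described above. The function name median_abs_error is hypothetical and not part of scalecast. How a function like this is named once stored on the object is not shown here.

```python
from statistics import median


def median_abs_error(actuals, forecasts):
    """Hypothetical custom metric: median of the absolute forecast errors."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    return median(abs(a - f) for a, f in zip(actuals, forecasts))


# Absolute errors are 0, 1, 0, 2, so the median falls between 0 and 1.
score = median_abs_error([1, 2, 3, 4], [1, 3, 3, 2])
print(score)
```

Because the function takes exactly (actuals, forecasts), it fits the requirement stated above. Lower values are better for this metric.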

[4]:
print(*f.metrics,sep='\n')
MetricStore(name='rmse', eval_func=<function Metrics.rmse at 0x12b6e79c0>, lower_is_better=True, min_obs_required=1)
MetricStore(name='mape', eval_func=<function Metrics.mape at 0x12af31940>, lower_is_better=True, min_obs_required=1)
MetricStore(name='mae', eval_func=<function Metrics.mae at 0x12b6e7a60>, lower_is_better=True, min_obs_required=1)
MetricStore(name='r2', eval_func=<function Metrics.r2 at 0x12b6e63e0>, lower_is_better=False, min_obs_required=2)
MetricStore(name='smape', eval_func=<function Metrics.smape at 0x12b6e7b00>, lower_is_better=True, min_obs_required=1)
MetricStore(name='mse', eval_func=<function Metrics.mse at 0x12b6e7880>, lower_is_better=True, min_obs_required=1)
MetricStore(name='abias', eval_func=<function Metrics.smape at 0x12b6e7b00>, lower_is_better=True, min_obs_required=1)

Forecaster.determine_best_by

  • These are generated from the metrics in Forecaster.metrics and include in-sample, test-set, and validation-set metrics.

  • Many functions can monitor one of these metrics when applying auto ML methods.

  • Plots and dataframe exports can be ordered best-to-worst according to any of these.

  • The distinction between ‘Level’ and non-level metrics only comes into play if Forecaster.diff() has been called to difference a series. However, SeriesTransformer can also difference a series and supports more dynamic transformations, which makes Forecaster.diff() unnecessary. Forecaster.diff() will soon be removed, and with it the distinction between level and non-level metrics.
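The option names appear to follow a simple pattern: one validation entry, then a TestSet and an InSample entry per metric, with the metric name upper-cased. The sketch below reproduces that pattern. The rule is inferred from the listing printed in this notebook, not taken from the library's source, and the helper name build_determine_best_by is hypothetical.

```python
def build_determine_best_by(metric_names):
    """Rebuild the option names from metric names (naming rule is an assumption)."""
    options = ["ValidationMetricValue"]
    options += [f"TestSet{name.upper()}" for name in metric_names]
    options += [f"InSample{name.upper()}" for name in metric_names]
    return options


metric_names = ["rmse", "mape", "mae", "r2", "smape", "mse", "abias"]
options = build_determine_best_by(metric_names)
print(*options, sep="\n")
```

With seven metrics this yields fifteen options, so adding or removing a metric changes the list of values that can be monitored.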

[5]:
print(*f.determine_best_by,sep='\n')
ValidationMetricValue
TestSetRMSE
TestSetMAPE
TestSetMAE
TestSetR2
TestSetSMAPE
TestSetMSE
TestSetABIAS
InSampleRMSE
InSampleMAPE
InSampleMAE
InSampleR2
InSampleSMAPE
InSampleMSE
InSampleABIAS

Forecaster.normalizer

  • These are all the options to scale your data when using an sklearn estimator.

  • All models receive MinMax scaling by default (scaling is strongly encouraged for some scikit-learn models), but None is also available as an argument to skip scaling.
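For readers unfamiliar with MinMax scaling, here is a plain-Python sketch of the transformation for a single column, mapping values to the range [0, 1]. It illustrates the idea only. That the 'minmax' option applies this exact formula is an assumption, and returning zeros for a constant column is a convention chosen for this sketch.

```python
def minmax_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant column: avoid dividing by zero (convention chosen here).
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


scaled = minmax_scale([1, 2, 3, 4])
print(scaled)
```

Scaling like this keeps features on a comparable range, which matters for distance-based and gradient-based estimators such as knn, svr, and mlp.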

[6]:
print(*f.normalizer.keys(),sep='\n')
minmax
normalize
scale
robust
None

MVForecaster.optimizer_funcs

  • These are the functions you can use to optimize models in MVForecaster only.

  • This means that if you use the "mean" option (the object’s default) when tuning models, the best model is the one whose metric has the best average performance across all series.

  • You can add your own functions by calling add_optimizer_func(); see the docs.
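The effect of the optimizer function can be sketched without the library: each candidate has one metric value per series, the chosen function collapses those values into a single score, and the candidate with the best score wins. The names pick_best and candidate_scores are hypothetical, and the real tuning logic in MVForecaster may differ in detail.

```python
from statistics import mean

# Same keys as the options listed in this notebook; the mapping itself is illustrative.
optimizer_funcs = {"mean": mean, "min": min, "max": max}


def pick_best(candidate_scores, how="mean", lower_is_better=True):
    """Aggregate per-series scores for each candidate and return the best candidate."""
    aggregate = optimizer_funcs[how]
    aggregated = {name: aggregate(scores) for name, scores in candidate_scores.items()}
    chooser = min if lower_is_better else max
    return chooser(aggregated, key=aggregated.get)


# One error value per series (two series) for each hypothetical candidate.
candidate_scores = {
    "candidate_a": [2.0, 4.0],
    "candidate_b": [2.5, 2.5],
}
best_by_mean = pick_best(candidate_scores, "mean")
best_by_min = pick_best(candidate_scores, "min")
best_by_max = pick_best(candidate_scores, "max")
print(best_by_mean, best_by_min, best_by_max)
```

In this toy data, "mean" and "max" favor the candidate that is consistent across series, while "min" favors the candidate with the single best series. That is the trade-off to consider when picking an optimizer function.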

[7]:
mvf = MVForecaster(f, f.copy())
print(*mvf.optimizer_funcs.keys(),sep='\n')
mean
min
max