Forecasting Different Model Types

Use the following syntax to initiate and forecast with different model types:

from scalecast.Forecaster import Forecaster
from scalecast.models import ARIMA

f = Forecaster(y=df['y'], current_dates=df['dt'], future_dates=12)
f.init_estimator('arima',order=(1,1,1), seasonal_order=(1,0,0,12))
f.fit()
preds = f.predict()

See also auxmodels.

class scalecast.models.ARIMA(f: Forecaster, model: Literal['auto'] = 'auto', Xvars: XvarValues = None, test_set_actuals: list[float] | None = None, **kwargs: Any)

Forecasts using an ARIMA model from Statsmodels.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (str) – The ARIMA model to use. Default ‘auto’ which selects the ARIMA model from Statsmodels. Currently, ‘auto’ is the only option.

  • Xvars (list[str]) – List of regressors to use from the passed Forecaster object.

  • test_set_actuals (list[float]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • **kwargs – Passed to the Statsmodels ARIMA model specified in model.

Methods:

fit(X, y, **fit_params)

Fits the estimator.

fit_predict(X, y)

Fits and predicts on the same dataset.

generate_current_X()

Returns the matrix of the current input exogenous variables.

generate_future_X()

Returns the matrix of the future input exogenous variables.

predict(X[, in_sample, dynamic])

Makes predictions.

fit(X: ndarray, y: ndarray, **fit_params: Any) Self

Fits the estimator.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals.

  • **fit_params – Passed to the .fit() method from the statsmodel model.

Returns:

Self

fit_predict(X: ndarray, y: ndarray) list[float]

Fits and predicts on the same dataset.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() ndarray

Returns the matrix of the current input exogenous variables. If no regressors are specified using the Xvars parameter, returns None.

generate_future_X() ndarray

Returns the matrix of the future input exogenous variables. If no regressors are specified using the Xvars parameter, returns None.

predict(X: ndarray, in_sample: bool = False, dynamic: bool | None = True, **predict_params: Any) list[float]

Makes predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast.

  • dynamic (bool) – Default True. Only relevant if in_sample is True. Whether to use dynamic predictions when generating in-sample fitted values. If True, the model uses its own previous predictions as input rather than the actuals. If False, the model uses one-step ahead predictions, always taking the actuals from the previous time step as input. Dynamic predictions can give a better sense of out-of-sample performance because they do not rely on in-sample actuals, but any mistakes the model makes are compounded in later predictions, so performance can look worse. One-step ahead predictions can give a better sense of in-sample fit, but they can look overly optimistic because they rely on actuals that would not be available in an out-of-sample forecasting scenario.

  • **predict_params – Passed to the estimator’s predict() method.
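
The difference between dynamic and one-step ahead fitted values can be illustrated with a toy AR(1) rule written in plain Python. The coefficient phi, the helper names, and the data are all hypothetical, and this sketch is not how the ARIMA class computes its predictions internally.

```python
def one_step_ahead(y, phi):
    """Each fitted value uses the previous *actual* as input."""
    return [phi * y[t - 1] for t in range(1, len(y))]


def dynamic(y, phi):
    """Each fitted value uses the previous *prediction* as input."""
    preds = []
    prev = y[0]  # only the first observation comes from the actuals
    for _ in range(1, len(y)):
        prev = phi * prev
        preds.append(prev)
    return preds


y = [10.0, 12.0, 9.0, 11.0]  # hypothetical actuals
phi = 0.5                    # hypothetical AR(1) coefficient

static_fvs = one_step_ahead(y, phi)
dynamic_fvs = dynamic(y, phi)
```

With the one-step ahead rule every prediction is re-anchored to an observed value, while the dynamic rule drifts away from the actuals as its own errors feed forward.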

class scalecast.models.Combo(f: Forecaster, model: Literal['auto'] = 'auto', test_set_actuals: list[float] | None = None, how: Literal['simple', 'weighted'] = 'simple', models: ModelValues = 'all', determine_best_by: DetermineBestBy = 'ValidationMetricValue', weights: Sequence[float | int] | None = None, replace_negative_weights: bool | float = 0.001, exclude_models_with_no_fvs: bool = True)

Forecasts using a combination of other evaluated forecasts from multiple models. The forecasts are combined with either simple or weighted averaging.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (str) – The Combo model to use. Must be ‘auto’.

  • test_set_actuals (list[float]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • how (str) – The method to use when combining forecasts. Default ‘simple’ which uses simple averaging. ‘weighted’ uses weighted averaging where the weights are determined by the relative performance of the models on some metric.

  • models (str or list[str]) – The models to include in the combination. Default ‘all’ which includes all models in the Forecaster’s history except the most recent one (which is assumed to be the Combo model itself). You can also specify a list of model names to include, or use the syntax ‘top_n’ to specify the top n models based on the metric specified in determine_best_by.

  • determine_best_by (str) – The metric to use when determining the best models for the ‘top_n’ syntax in the models argument. Default ‘ValidationMetricValue’ which uses the validation metric value stored in the Forecaster’s history for each model. This is the most common use-case, but you can also specify other metrics that are stored in the history such as ‘TestSetRMSE’ or ‘InSampleRMSE’.

  • weights (list[float]) – Optional. If how is ‘weighted’, you can optionally provide your own weights for each model instead of using the relative performance.

  • replace_negative_weights (bool|float) – Whether to replace negative-scoring metrics with some positive (or 0) value to avoid situations where predictions might become nonsensical. The main use-case is R2, which can be negative. This is ignored for metrics where lower scores are better. Set this to False to turn it off. 0 is an acceptable replacement value.

  • exclude_models_with_no_fvs (bool) – Whether to exclude models that have no fitted values stored in the history when generating the combined forecast. This is relevant because if a model has no fitted values, it cannot generate in-sample predictions, which means it can only contribute to the future forecast and not the in-sample fitted values. This can lead to situations where the combined forecast is essentially just the forecast from that one model, which may not be desirable. Default True.
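
As a rough illustration of the two combination methods, the sketch below averages forecasts in plain Python. The function name combine, the scores argument, and the way weights are normalized are assumptions made for this example; the Combo class may derive and normalize its weights differently.

```python
def combine(forecasts, how="simple", scores=None, replace_negative=0.001):
    """Average several equal-length forecasts, step by step.

    forecasts: dict of model name -> list of predictions
    scores: dict of model name -> metric where higher is better (such as R2)
    """
    names = list(forecasts)
    if how == "simple":
        weights = {n: 1 / len(names) for n in names}
    else:
        adjusted = {}
        for n in names:
            score = scores[n]
            # Swap a negative score for the replacement value unless disabled.
            if score < 0 and replace_negative is not False:
                score = replace_negative
            adjusted[n] = score
        total = sum(adjusted.values())
        weights = {n: adjusted[n] / total for n in names}
    horizon = len(forecasts[names[0]])
    return [sum(weights[n] * forecasts[n][i] for n in names) for i in range(horizon)]


forecasts = {"a": [10.0, 20.0], "b": [30.0, 40.0]}  # hypothetical model outputs
simple = combine(forecasts)
weighted = combine(forecasts, how="weighted", scores={"a": 0.75, "b": 0.25})
```

In this sketch a replacement value of 0 removes a negatively scored model from the average entirely, while a small positive value keeps a minimal contribution from it.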

Methods:

fit([X, y])

Fits the estimator.

fit_predict([X, y])

Fits and predicts on the same dataset.

generate_current_X()

Generates the matrix of the current input dataset by extracting the fitted values for each model in the combination from the Forecaster's history.

generate_future_X()

Generates the matrix of the future input dataset by extracting either the test set predictions (if test_set_actuals were provided) or the forecasts for each model in the combination from the Forecaster's history.

predict([X, in_sample])

Makes predictions by combining the forecasts from the specified models using either simple or weighted averaging.

fit(X: None = None, y: None = None, **fit_params: None) Self

Fits the estimator. For the Combo model, this means specifying the weights to use for combining the forecasts from the specified models based on the method specified in how.

Parameters:
  • X (None) – Ignored for the Combo model since it does not use an input matrix for fitting. This is just for API consistency with other models.

  • y (None) – Ignored for the Combo model since it does not use an input matrix for fitting. This is just for API consistency with other models.

  • **fit_params – Ignored for the Combo model since there is no fitting process. This is just for API consistency with other models.

Returns:

Self

fit_predict(X: None = None, y: None = None) list[float]

Fits and predicts on the same dataset.

Parameters:
  • X (None) – Ignored for the Combo model since it does not use an input matrix for fitting. This is just for API consistency with other models.

  • y (None) – Ignored for the Combo model since it does not use an input matrix for fitting. This is just for API consistency with other models.

  • **fit_params – Ignored for the Combo model since there is no fitting process. This is just for API consistency with other models.

Returns:

The combined predictions.

Return type:

list[float]

generate_current_X()

Generates the matrix of the current input dataset by extracting the fitted values for each model in the combination from the Forecaster’s history.

generate_future_X()

Generates the matrix of the future input dataset by extracting either the test set predictions (if test_set_actuals were provided) or the forecasts for each model in the combination from the Forecaster’s history.

predict(X: None = None, in_sample: bool = False, **predict_params: None) list[float]

Makes predictions by combining the forecasts from the specified models using either simple or weighted averaging.

Parameters:
  • X (np.ndarray) – The input data. This is the matrix of forecasts from the specified models for the future periods, or the matrix of fitted values from the specified models for the in-sample period.

  • in_sample (bool) – Whether the predictions being generated are for the in-sample period (i.e. fitted values) or for the future forecast period. This is just for API consistency with other models since the Combo model generates predictions for both the in-sample period and the future forecast period in the same way by combining the forecasts from the specified models using either simple or weighted averaging.

  • **predict_params – Ignored for the Combo model since there is no fitting process. This is just for API consistency with other models.

Returns:

The combined predictions.

Return type:

list[float]

class scalecast.models.HWES(f: Forecaster, model: Literal['auto'] = 'auto', test_set_actuals: list[float] | None = None, **kwargs: Any)

Forecasts using a Holt-Winters Exponential Smoothing model from Statsmodels.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (str) – The HWES model to use. Default ‘auto’ which selects the ExponentialSmoothing model from Statsmodels. Currently, ‘auto’ is the only option, but more HWES variants may be added in the future.

  • test_set_actuals (list[float]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • **kwargs – Passed to the Statsmodels ExponentialSmoothing model specified in model.

Methods:

fit(X, y[, optimized, use_brute])

Fits the estimator.

fit_predict(X, y)

Fits and predicts on the same dataset.

generate_current_X()

Placeholder method to remain consistent with other models.

generate_future_X()

Placeholder method to remain consistent with other models.

predict([X, in_sample])

Makes predictions.

fit(X: None, y: ndarray, optimized: bool = True, use_brute: bool = True, **fit_params: Any) Self

Fits the estimator.

Parameters:
  • X (np.ndarray) – The input data. Ignored for HWES models since HWES does not use an input matrix. This is just for API consistency with other models.

  • y (np.ndarray) – The observed actuals.

  • optimized (bool) – Default True. Whether to optimize the model’s smoothing level parameters. If False, the parameters will be set to the values passed in **kwargs or to the default values from the statsmodels ExponentialSmoothing model if not passed in **kwargs.

  • use_brute (bool) –

    Default True. Whether to use the brute-force optimization method when optimizing the model’s smoothing level parameters. This is passed to the fit() method from the statsmodels ExponentialSmoothing model and is only relevant if optimized is True.

    If False, the model will use the default optimization method from the statsmodels ExponentialSmoothing model when optimizing the smoothing level parameters.

  • **fit_params – Passed to the .fit() method from the statsmodels model.
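
To make the idea of a brute-force search over a smoothing parameter concrete, here is a small sketch for simple exponential smoothing with a single smoothing level. It is not the statsmodels implementation, which handles more parameters and its own optimization routine; the function names, grid size, and data are hypothetical.

```python
def ses_sse(y, alpha):
    """Sum of squared one-step ahead errors for simple exponential smoothing."""
    level = y[0]
    sse = 0.0
    for obs in y[1:]:
        sse += (obs - level) ** 2  # the forecast for this step is the current level
        level = alpha * obs + (1 - alpha) * level
    return sse


def brute_force_alpha(y, grid_size=21):
    """Evaluate an evenly spaced grid of smoothing levels and keep the best one."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return min(grid, key=lambda a: ses_sse(y, a))


y = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0]  # hypothetical series
best_alpha = brute_force_alpha(y)
```

A grid search like this is robust but coarse, which is why it is commonly paired with a numerical optimizer that refines the result.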

Returns:

Self

fit_predict(X: None, y: ndarray) list[float]

Fits and predicts on the same dataset.

Parameters:
  • X (np.ndarray) – The input data. Ignored for HWES models since HWES does not use an input matrix. This is just for API consistency with other models.

  • y (np.ndarray) – The observed actuals.

Returns:

The predictions.

Return type:

list[float]

generate_current_X()

Placeholder method to remain consistent with other models. HWES does not use an input matrix, so this method does not need to do anything. It is only included for API consistency across models.

generate_future_X()

Placeholder method to remain consistent with other models. HWES does not use an input matrix, so this method does not need to do anything. It is only included for API consistency across models.

predict(X: None = None, in_sample: bool = False, **predict_params) list[float]

Makes predictions.

Parameters:
  • X (np.ndarray) – The input data. Ignored for HWES models since HWES does not use an input matrix. This is just for API consistency with other models.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast.

  • **predict_params – Passed to the estimator’s predict() method.

Returns:

The predictions.

Return type:

list[float]

class scalecast.models.LSTM(f: Forecaster, model: Literal['auto'] = 'auto', lags: PositiveInt = 1, normalizer: NormalizerLike = 'minmax', lstm_layer_sizes: Sequence[int] = [8], dropout: Sequence[float] = [0.0], loss: str = 'mean_absolute_error', activation: str = 'tanh', optimizer: str = 'Adam', learning_rate: ConfInterval = 0.001, random_seed: int | None = None, **kwargs: Any)

Forecasts using an LSTM model from Tensorflow. Inherits from RNN and simply sets the default layer to LSTM and adds lags by adding AR terms to the Forecaster object.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (str) – Default ‘auto’. ‘auto’ is the only accepted value.

  • lags (int) – The number of lags to add to the model. Default 1. This is added to the Forecaster object as AR terms, so the model will automatically use them as input features.

  • normalizer (NormalizerLike) – Default ‘minmax’. The label of the normalizer to use.

  • lstm_layer_sizes (list[int]) – Default [8]. The number of units in each LSTM layer. The number of layers is determined by the length of the list.

  • dropout (list[float]) – Default [0.0]. The dropout rate to use for each LSTM layer. Should be the same length as lstm_layer_sizes. If 0, no dropout is applied.

  • loss (str or tf.keras.losses.Loss) – Default ‘mean_absolute_error’. The loss function to minimize. See available options here: https://www.tensorflow.org/api_docs/python/tf/keras/losses. Be sure to choose one that is suitable for regression tasks.

  • optimizer (str or tf Optimizer) – Default “Adam”. The optimizer to use when compiling the model. See available values here: https://www.tensorflow.org/api_docs/python/tf/keras/optimizers. If str, it will use the optimizer with default args. If type Optimizer, will use the optimizer exactly as specified.

  • learning_rate (float) – Default 0.001. The learning rate to use when compiling the model. Ignored if you pass your own optimizer with a learning rate.

  • random_seed (int) – Optional. Set a seed for consistent results. With tensorflow networks, setting seeds does not guarantee consistent results.

  • **kwargs – Passed to fit() and can include epochs, verbose, callbacks, validation_split, and more.
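
The lags parameter turns past values of the series into input features. The sketch below shows one way such a lagged matrix could be laid out in plain Python; the helper make_lagged and the column order are assumptions for illustration, not the layout scalecast necessarily uses.

```python
def make_lagged(y, lags):
    """Return (X, target) where each row of X holds the `lags` previous values."""
    X, target = [], []
    for t in range(lags, len(y)):
        # most recent lag first: y[t-1], y[t-2], ..., y[t-lags]
        X.append([y[t - k] for k in range(1, lags + 1)])
        target.append(y[t])
    return X, target


y = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical series
X, target = make_lagged(y, lags=2)
```

Note that the first `lags` observations cannot be used as targets, because they have no complete set of preceding values.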

Methods:

fit(X[, y])

Fits the estimator.

fit_predict(X, y)

Runs fit and predict methods, returning predictions.

generate_current_X()

Returns the matrix of the current input dataset.

generate_future_X()

Returns the matrix of the future input dataset.

predict(X[, in_sample])

Makes predictions.

fit(X: ndarray, y: None = None, **fit_params: Any) Self

Fits the estimator.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals. Ignored for RNN models since the actuals are already stored in self.y, which is used for training. This is just for API consistency with other models.

  • **fit_params – Passed to the .fit() method from the tensorflow model.

Returns:

Self

fit_predict(X, y) list[float]

Runs fit and predict methods, returning predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals. Ignored for RNN models since the actuals are already stored in self.y, which is used for training.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() ndarray

Returns the matrix of the current input dataset.

generate_future_X() ndarray

Returns the matrix of the future input dataset.

predict(X: ndarray, in_sample: bool = False, **predict_params) list[float]

Makes predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast.

  • **predict_params – Passed to the estimator’s predict() method.

Returns:

The predictions.

Return type:

list[float]

class scalecast.models.Naive(f: Forecaster, model: Literal['auto'] = 'auto', test_set_actuals: list[float] | None = None, seasonal: bool = False, m: int | Literal['auto'] = 'auto')

Forecasts using a Naive model. This model simply uses the last observed value as the forecast for all future periods. If seasonal, it uses the value from the same period in the previous season as the forecast.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (str) – The Naive model to use. Must be ‘auto’.

  • test_set_actuals (list[float]) – Not used.

  • seasonal (bool) – Whether to use a seasonal naive model. If False, the forecast for all future periods is the last observed value. If True, the forecast for each future period is the value from the same period in the previous season.

  • m (int or 'auto') – The seasonal period to use if seasonal is True. If ‘auto’, the seasonal period is determined based on the frequency of the data. For example, if the frequency is monthly, the seasonal period will be 12. If the frequency is quarterly, the seasonal period will be 4. If the frequency is daily, the seasonal period will be 7. If the frequency is yearly, the seasonal period will be 1 (which means the seasonal naive model will be the same as the non-seasonal naive model). You can also specify an integer value for the seasonal period if you want to use a different seasonal period than the one determined by the frequency.
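
The naive and seasonal naive rules described above can be sketched in a few lines of plain Python. The function naive_forecast is a hypothetical stand-in written for this example and does not reproduce the Naive class itself.

```python
def naive_forecast(y, horizon, seasonal=False, m=1):
    """Repeat the last value, or the last full season when seasonal is True."""
    if not seasonal:
        return [y[-1]] * horizon
    last_season = y[-m:]
    return [last_season[i % m] for i in range(horizon)]


y = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical quarterly series
flat = naive_forecast(y, horizon=3)
seasonal = naive_forecast(y, horizon=6, seasonal=True, m=4)
```

With m=1 the seasonal rule reduces to the non-seasonal one, which matches the note about yearly data above.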

Methods:

fit([X, y])

Fits the estimator. For the Naive model, there is no fitting process since the model simply uses the last observed value (or the value from the same period in the previous season) as the forecast.

fit_predict([X, y])

Fits and predicts on the same dataset. For the Naive model, there is no fitting process since the model simply uses the last observed value (or the value from the same period in the previous season) as the forecast.

generate_current_X()

Placeholder method to remain consistent with other models.

generate_future_X()

Placeholder method to remain consistent with other models.

predict([X, in_sample])

Makes predictions.

fit(X: None = None, y: None = None) Self

Fits the estimator. For the Naive model, there is no fitting process since the model simply uses the last observed value (or the value from the same period in the previous season) as the forecast.

This method is included for API consistency with other models, but it does not need to do anything for the Naive model.

Parameters:
  • X (None) – Ignored for the Naive model since it does not use an input matrix. This is just for API consistency with other models.

  • y (None) – Ignored for the Naive model since it does not use an input matrix. This is just for API consistency with other models.

Returns:

Self

fit_predict(X: None = None, y: None = None) list[float]

Fits and predicts on the same dataset. For the Naive model, there is no fitting process since the model simply uses the last observed value (or the value from the same period in the previous season) as the forecast.

This method is included for API consistency with other models, but it does not need to do anything for the Naive model other than call the predict() method since there is no fitting process.

Parameters:
  • X (None) – Ignored for the Naive model since it does not use an input matrix. This is just for API consistency with other models.

  • y (None) – Ignored for the Naive model since it does not use an input matrix. This is just for API consistency with other models.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() None

Placeholder method to remain consistent with other models. Naive does not use an input matrix, so this method does not need to do anything. It is only included for API consistency across models.

generate_future_X() None

Placeholder method to remain consistent with other models. Naive does not use an input matrix, so this method does not need to do anything. It is only included for API consistency across models.

predict(X: None = None, in_sample: bool = False) list[float]

Makes predictions. For the Naive model, the predictions are generated by taking the last observed value (or the value from the same period in the previous season) and using it as the forecast for all future periods.

Parameters:
  • X (None) – Ignored for the Naive model since it does not use an input matrix. This is just for API consistency with other models.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast. For the Naive model, the in-sample predictions are generated by taking the last observed value (or the value from the same period in the previous season) and using it as the fitted value for all periods in the in-sample period.

Returns:

The predictions.

Return type:

list[float]

class scalecast.models.Prophet(f: Forecaster, model: Literal['auto'] = 'auto', Xvars: XvarValues = None, test_set_actuals: list[float] | None = None, cap: float | None = None, floor: float | None = None, callback_func: callable = None, **kwargs: Any)

Forecasts using a Prophet model from the prophet package.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (str) – The Prophet model to use. Default ‘auto’ which selects the Prophet model from the prophet package. Currently, ‘auto’ is the only option.

  • Xvars (list[str]) – List of regressors to use from the passed Forecaster object. These are added as extra regressors in the Prophet model. If ‘all’, will use all available regressors in the Forecaster object. Default None, which uses no regressors.

  • test_set_actuals (list[float]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • cap (float) – Optional. The capacity parameter to use for the Prophet model if you want to fit a logistic growth model. If not passed, the model will fit a linear growth model.

  • floor (float) – Optional. The floor parameter to use for the Prophet model if you want to fit a logistic growth model. If not passed, the model will fit a linear growth model.

  • callback_func (callable) – Optional. A function that takes the initialized but unfitted Prophet model as input and performs some operations on it, such as adding holidays or changing hyperparameters, before it is fitted. This allows you to customize the Prophet model in ways that are not currently supported by the parameters of this class. If not passed, no operations will be performed on the initialized Prophet model before fitting.

  • **kwargs – Passed to the Prophet model specified in model. Note that if you want to use the dynamic_testing option with Prophet, you must pass the parameter ‘interval_width’ in **kwargs with a value less than 1 (e.g. 0.8) to ensure that the prediction intervals are narrow enough to be useful for testing.

Methods:

fit(X[, y])

Fits the estimator.

fit_predict(X[, y])

Fits and predicts on the same dataset.

generate_current_X()

Returns the DataFrame of the current input dataset.

generate_future_X()

Returns the DataFrame of the future input dataset.

predict(X[, in_sample])

Makes predictions.

fit(X: DataFrame, y: None = None, **fit_params: Any) Self

Fits the estimator.

Parameters:
  • X (pd.DataFrame) – The input data.

  • y (pd.DataFrame) – The observed actuals. Ignored for Prophet models since the actuals are already stored in self.current_actuals, which is used for training. This is just for API consistency with other models.

  • **fit_params – Passed to the .fit() method from the Prophet model.

Returns:

Self

fit_predict(X: DataFrame, y: None = None) list[float]

Fits and predicts on the same dataset.

Parameters:
  • X (pd.DataFrame) – The input data.

  • y (pd.DataFrame) – The observed actuals. Ignored for Prophet models since the actuals are already stored in self.current_actuals, which is used for training. This is just for API consistency with other models.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() DataFrame

Returns the DataFrame of the current input dataset. For Prophet, this includes a ‘ds’ column for dates, a ‘y’ column for the actuals, and columns for any specified regressors. If cap and/or floor are specified, these are also included as columns.

generate_future_X() DataFrame

Returns the DataFrame of the future input dataset. For Prophet, this includes a ‘ds’ column for dates and columns for any specified regressors. If cap and/or floor are specified, these are also included as columns.

predict(X: DataFrame, in_sample: None = None, **predict_params: Any) list[float]

Makes predictions.

Parameters:
  • X (pd.DataFrame) – The input data.

  • in_sample (bool) – Ignored for Prophet models since Prophet does not have a built-in method for generating in-sample fitted values with a one-step ahead forecast. This is just for API consistency with other models.

  • **predict_params – Passed to the estimator’s predict() method.

Returns:

The predictions.

Return type:

list[float]

class scalecast.models.SKLearnMV(f: MVForecaster, model: ScikitLike, lags: None | int | list[int] | dict[str, int | list[int]] = 1, dynamic_testing: DynamicTesting = True, Xvars: XvarValues = 'all', normalizer: NormalizerLike = 'minmax', test_set_actuals: dict[str, list[float]] | None = None, **kwargs: Any)

Model class that supports any scikit-learn API estimator for multivariate forecasting.

Parameters:
  • f (MVForecaster) – The MVForecaster object storing the actual series and associated dates.

  • model (Scikit-learn API Estimator) – The imported scikit-learn API regression estimator/class (such as LinearRegression or XGBRegressor).

  • lags (None or int or list[int] or dict[str,int or list[int]]) – The number of lags to add to the model. If int, that many lags are added to every model. If a list of ints, only the lags in the list are added. If dict, each key is a series name and each value is an int or list of ints that follows the behavior described above, but only targeting the passed series.

  • dynamic_testing (bool or int) – Whether to dynamically test the model or how many steps. Ignored when test_set_actuals not specified.

  • Xvars (list[str]) – List of regressors to use from the passed Forecaster object.

  • normalizer (NormalizerLike) – Default ‘minmax’. The label of the normalizer to use.

  • test_set_actuals (dict[str, list[float]]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • **kwargs – Passed to the scikit-learn model passed to model.
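
The three accepted shapes of the lags argument can be pictured as expanding into one list of lag orders per series. The helper resolve_lags below is a hypothetical sketch of that expansion, assuming that series missing from a dict receive no lags; it is not taken from the scalecast source.

```python
def resolve_lags(lags, series_names):
    """Expand a lags specification into {series name: list of lag orders}."""

    def expand(spec):
        if isinstance(spec, int):
            return list(range(1, spec + 1))  # int n -> lags 1..n
        return sorted(spec)                  # list -> exactly those lags

    if lags is None:
        return {name: [] for name in series_names}
    if isinstance(lags, dict):
        # only the series named in the dict receive lags
        return {
            name: (expand(lags[name]) if name in lags else [])
            for name in series_names
        }
    return {name: expand(lags) for name in series_names}


by_int = resolve_lags(3, ["a", "b"])
by_list = resolve_lags([1, 12], ["a"])
by_dict = resolve_lags({"a": 2, "b": [1, 4]}, ["a", "b", "c"])
```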

Methods:

fit(X, y, **fit_params)

Fits the estimator.

fit_predict(X, y)

Runs fit and predict methods, returning predictions.

generate_current_X()

Returns the matrix of the current input dataset.

generate_future_X()

Returns the matrix of the future input dataset.

predict(X[, in_sample])

Makes predictions.

fit(X, y, **fit_params)

Fits the estimator.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals.

  • **fit_params – Passed to the .fit() method from the scikit-learn model.

Returns:

Self

fit_predict(X, y)

Runs fit and predict methods, returning predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() ndarray

Returns the matrix of the current input dataset.

generate_future_X() ndarray

Returns the matrix of the future input dataset.

predict(X, in_sample: bool = False, **predict_params) dict[str, list[float]]

Makes predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast.

  • **predict_params – Passed to the estimator’s predict() method.

Returns:

The predictions.

Return type:

dict[str, list[float]]

class scalecast.models.SKLearnUni(f: Forecaster, model: ScikitLike, dynamic_testing: DynamicTesting = True, Xvars: XvarValues = None, normalizer: NormalizerLike = 'minmax', test_set_actuals: list[float] | None = None, **kwargs: Any)

Model class that supports any scikit-learn API estimator for univariate forecasting.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (Scikit-learn API Estimator) – The imported scikit-learn API regression estimator/class (such as LinearRegression or XGBRegressor).

  • dynamic_testing (bool or int) – Whether to dynamically test the model or how many steps. Ignored when test_set_actuals not specified.

  • Xvars (list[str]) – List of regressors to use from the passed Forecaster object.

  • normalizer (NormalizerLike) – Default ‘minmax’. The label of the normalizer to use.

  • test_set_actuals (list[float]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • **kwargs – Passed to the scikit-learn model passed to model.
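
For readers unfamiliar with min-max normalization, the sketch below shows the usual pattern of learning the scaling range from the training inputs and reusing it for future inputs. It assumes that the ‘minmax’ label refers to scaling each feature to the range 0 to 1; the helper names are hypothetical and this is not the normalizer scalecast applies internally.

```python
def minmax_fit(values):
    """Learn the scaling range from the training values only."""
    return min(values), max(values)


def minmax_transform(values, lo, hi):
    """Scale values with a previously learned range."""
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


train = [10.0, 20.0, 30.0]  # hypothetical training feature
future = [40.0]             # hypothetical future feature value

lo, hi = minmax_fit(train)
train_scaled = minmax_transform(train, lo, hi)
future_scaled = minmax_transform(future, lo, hi)
```

Future values can fall outside the 0 to 1 range, as happens here, because the range is learned from the training data alone.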

Methods:

fit(X, y, **fit_params)

Fits the estimator.

fit_predict(X, y)

Runs fit and predict methods, returning predictions.

generate_current_X()

Returns the matrix of the current input dataset.

generate_future_X()

Returns the matrix of the future input dataset.

predict(X[, in_sample])

Makes predictions.

fit(X: ndarray, y: ndarray, **fit_params: Any) Self

Fits the estimator.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals.

  • **fit_params – Passed to the .fit() method from the scikit-learn model.

Returns:

Self

fit_predict(X: ndarray, y: ndarray) list[float]

Runs fit and predict methods, returning predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() ndarray

Returns the matrix of the current input dataset.

generate_future_X() ndarray

Returns the matrix of the future input dataset.

predict(X: ndarray, in_sample: bool = False, **predict_params: Any) list[float]

Makes predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast.

  • **predict_params – Passed to the estimator’s predict() method.

Returns:

The predictions.

Return type:

list[float]

class scalecast.models.RNN(f: Forecaster, model: Literal['auto'] = 'auto', test_set_actuals: list[float] | None = None, Xvars: XvarValues = 'all', normalizer: NormalizerLike = 'minmax', layers_struct: list[tuple[dict[str, Any]]] = [('SimpleRNN', {'activation': 'tanh', 'units': 8})], loss: str = 'mean_absolute_error', optimizer: str = 'Adam', learning_rate: ConfInterval = 0.001, random_seed: int = None, scale_y: bool = True, **kwargs: Any)

Forecasts using a recurrent neural network model from TensorFlow.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (str) – Default ‘auto’. ‘auto’ is the only accepted value.

  • lags (None or int or list[int] or dict[str,int or list[int]]) – The lags to add to the model. If int, that many lags are added. If a list of ints, only the lags in the list are added. If dict, each key is a series name and each value is an int or list of ints that follows the behavior described above, but applies only to the named series.

  • Xvars (list[str]) – List of regressors to use from the passed Forecaster object.

  • normalizer (NormalizerLike) – Default ‘minmax’. The label of the normalizer to use.

  • test_set_actuals (list[float]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • layers_struct (list[tuple[str,dict[str,Union[float,str]]]]) – Default [(‘SimpleRNN’,{‘units’:8,‘activation’:‘tanh’})]. Each element in the list is a two-element tuple that describes one layer. The first tuple in the list is the input layer (input_shape is set automatically). Within each tuple, the first element is the layer type (‘SimpleRNN’, ‘LSTM’, or ‘Dense’) and the second element is a dict that maps hyperparameter names (‘units’, ‘activation’, etc.) to their values. See here for options related to SimpleRNN: https://www.tensorflow.org/api_docs/python/tf/keras/layers/SimpleRNN. For LSTM: https://www.tensorflow.org/api_docs/python/tf/keras/layers/LSTM. For Dense: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense.

  • loss (str or tf.keras.losses.Loss) – Default ‘mean_absolute_error’. The loss function to minimize. See available options here: https://www.tensorflow.org/api_docs/python/tf/keras/losses. Be sure to choose one that is suitable for regression tasks.

  • optimizer (str or tf Optimizer) – Default “Adam”. The optimizer to use when compiling the model. See available values here: https://www.tensorflow.org/api_docs/python/tf/keras/optimizers. If str, it will use the optimizer with default args. If type Optimizer, will use the optimizer exactly as specified.

  • learning_rate (float) – Default 0.001. The learning rate to use when compiling the model. Ignored if you pass your own optimizer with a learning rate.

  • random_seed (int) – Optional. Set a seed for more reproducible results. With TensorFlow networks, setting a seed does not guarantee identical results across runs.

  • scale_X (bool) – Default True. Whether to scale the exogenous inputs with the scaler passed to the normalizer parameter.

  • scale_y (bool) – Default True. Whether to scale the endogenous inputs (lags), as well as the model output, with the scaler passed to the normalizer parameter. The results are automatically returned unscaled.

  • **kwargs – Passed to fit() and can include epochs, verbose, callbacks, validation_split, and more.
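As a rough illustration of the layers_struct shape described above, the sketch below builds a two-layer structure and checks it with a small validator. The helper check_layers_struct is hypothetical and not part of scalecast; it only encodes the rules stated in the parameter description (two-element tuples, a known layer type, a dict of hyperparameters).

```python
# Layer types named in the layers_struct description.
ALLOWED_LAYER_TYPES = {"SimpleRNN", "LSTM", "Dense"}

def check_layers_struct(layers_struct):
    """Hypothetical helper: raise ValueError if the structure is malformed."""
    if not layers_struct:
        raise ValueError("layers_struct needs at least one layer")
    for i, layer in enumerate(layers_struct):
        if not (isinstance(layer, tuple) and len(layer) == 2):
            raise ValueError(f"layer {i} must be a (type, hyperparameters) tuple")
        layer_type, hyperparams = layer
        if layer_type not in ALLOWED_LAYER_TYPES:
            raise ValueError(f"layer {i} has unknown type {layer_type!r}")
        if not isinstance(hyperparams, dict):
            raise ValueError(f"layer {i} hyperparameters must be a dict")

# The first tuple is the input layer; later tuples are stacked on top of it.
layers_struct = [
    ("LSTM", {"units": 16, "activation": "tanh"}),
    ("Dense", {"units": 8, "activation": "relu"}),
]
check_layers_struct(layers_struct)  # passes silently
```

A structure like this would be passed as the layers_struct argument; the hyperparameter keys inside each dict are forwarded to the corresponding Keras layer, so the linked TensorFlow pages remain the reference for valid names.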

Methods:

fit(X[, y])

Fits the estimator.

fit_predict(X, y)

Runs fit and predict methods, returning predictions.

generate_current_X()

Returns the matrix of the current input dataset.

generate_future_X()

Returns the matrix of the future input dataset.

predict(X[, in_sample])

Makes predictions.

fit(X: ndarray, y: None = None, **fit_params: Any) Self

Fits the estimator.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals. Ignored for RNN models since the actuals are already stored in self.y, which is used for training. This is just for API consistency with other models.

  • **fit_params – Passed to the .fit() method from the tensorflow model.

Returns:

Self

fit_predict(X, y) list[float]

Runs fit and predict methods, returning predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals. Ignored for RNN models since the actuals are already stored in self.y, which is used for training.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() ndarray

Returns the matrix of the current input dataset.

generate_future_X() ndarray

Returns the matrix of the future input dataset.

predict(X: ndarray, in_sample: bool = False, **predict_params) list[float]

Makes predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast.

  • **predict_params – Passed to the estimator’s predict() method.

Returns:

The predictions.

Return type:

list[float]
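The scale_y description says that predictions come back on the original scale. A minimal sketch of that round trip with min-max scaling is shown below; the functions minmax_fit, minmax_scale, and minmax_unscale are hypothetical stand-ins, not scalecast's normalizer implementation, and they assume the series is not constant (otherwise the range is zero).

```python
import numpy as np

def minmax_fit(values):
    """Record the minimum and maximum of the training series."""
    arr = np.asarray(values, dtype=float)
    return float(arr.min()), float(arr.max())

def minmax_scale(values, lo, hi):
    """Map values to [0, 1] using the recorded bounds."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def minmax_unscale(scaled, lo, hi):
    """Invert minmax_scale so outputs return to the original units."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

y = [10.0, 20.0, 30.0]
lo, hi = minmax_fit(y)
scaled = minmax_scale(y, lo, hi)            # what a network would train on
restored = minmax_unscale(scaled, lo, hi)   # what the caller would receive
```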

class scalecast.models.TBATS(f: Forecaster, model: Literal['auto'] = 'auto', test_set_actuals: list[float] | None = None, random_seed: int | None = None, **kwargs: Any)

Forecasts using a TBATS model from the tbats package.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (str) – The TBATS model to use. Default ‘auto’ which selects the TBATS model from the tbats package. Currently, ‘auto’ is the only option, but more TBATS variants may be added in the future.

  • test_set_actuals (list[float]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • random_seed (int) – Optional. Set a seed for more reproducible results. With TBATS models, setting a seed does not guarantee identical results across runs.

  • **kwargs – Passed to the TBATS model specified in model.

Methods:

fit(X[, y])

Fits the estimator.

fit_predict(X[, y])

Fits and predicts on the same dataset.

generate_current_X()

Returns the matrix of the current input dataset.

generate_future_X()

Returns the matrix of the future input dataset.

predict(X[, in_sample])

Makes predictions.

fit(X: ndarray, y: None = None, **fit_params: Any) Self

Fits the estimator.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals. Ignored for TBATS models since the actuals are already stored in self.current_actuals, which is used for training. This is just for API consistency with other models.

  • **fit_params – Passed to the .fit() method from the tbats model.

Returns:

Self
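The pattern of accepting y only for API consistency can be sketched with a toy estimator that trains on actuals stored at construction time. The class StoredActualsMean is hypothetical and far simpler than TBATS; it only shows why fit can ignore y and why returning self is convenient.

```python
import numpy as np

class StoredActualsMean:
    """Hypothetical estimator that forecasts the mean of its stored actuals."""

    def __init__(self, current_actuals):
        self.current_actuals = np.asarray(current_actuals, dtype=float)
        self.mean_ = None

    def fit(self, X, y=None, **fit_params):
        # y is accepted so the signature matches other estimators; it is never read.
        self.mean_ = float(self.current_actuals.mean())
        return self  # returning self allows fit(...).predict(...) chaining

    def predict(self, X, **predict_params):
        # One prediction per element of X; the values in X are not used.
        return [self.mean_] * len(X)

    def fit_predict(self, X, y=None):
        return self.fit(X, y).predict(X)

model = StoredActualsMean([1.0, 2.0, 3.0])
placeholder_X = np.zeros(4)  # only its length matters in this sketch
preds = model.fit(placeholder_X).predict(placeholder_X)
```

Because the training data lives on the object, passing a different y to fit or fit_predict has no effect on the result.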

fit_predict(X: ndarray, y: None = None) list[float]

Fits and predicts on the same dataset.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals. Ignored for TBATS models since the actuals are already stored in self.current_actuals, which is used for training. This is just for API consistency with other models.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() ndarray

Returns the matrix of the current input dataset.

generate_future_X() ndarray

Returns the matrix of the future input dataset. For TBATS, this is just an array of zeros equal to the length of the forecast.

predict(X: ndarray, in_sample: bool = False, **predict_params) list[float]

Makes predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast.

  • **predict_params – Passed to the estimator’s predict() method.

Returns:

The predictions.

Return type:

list[float]

class scalecast.models.Theta(f: Forecaster, model: Literal['auto'] = 'auto', test_set_actuals: list[float] | None = None, **kwargs: Any)

Forecasts using a Theta model from Darts.

Parameters:
  • f (Forecaster) – The Forecaster object storing the actual series and associated dates.

  • model (str) – The Theta model to use. Default ‘auto’ which selects the FourTheta model from Darts. Currently, ‘auto’ is the only option, but more Theta variants may be added in the future.

  • test_set_actuals (list[float]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • **kwargs – Passed to the Darts Theta model specified in model.

Methods:

fit(X[, y])

Fits the estimator.

fit_predict(X[, y])

Fits and predicts on the same dataset.

generate_current_X()

Returns the matrix of the current input dataset.

generate_future_X()

Returns the matrix of the future input dataset.

predict(X[, in_sample])

Makes predictions.

fit(X: ndarray, y: None = None, **fit_params: Any) Self

Fits the estimator.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals. Ignored for Theta models since the actuals are already stored in self.current_actuals, which is used for training. This is just for API consistency with other models.

  • **fit_params – Passed to the .fit() method from the darts model.

Returns:

Self

fit_predict(X: ndarray, y: None = None) list[float]

Fits and predicts on the same dataset.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals. Ignored for Theta models since the actuals are already stored in self.current_actuals, which is used for training. This is just for API consistency with other models.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() ndarray

Returns the matrix of the current input dataset.

generate_future_X() ndarray

Returns the matrix of the future input dataset.

predict(X: ndarray, in_sample: bool = False, **predict_params) list[float]

Makes predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast.

  • **predict_params – Passed to the estimator’s predict() method.

Returns:

The predictions.

Return type:

list[float]

class scalecast.models.VECM(f: MVForecaster, model: Literal['auto'] = 'auto', Xvars: XvarValues = None, test_set_actuals: dict[str, list[float]] | None = None, lags: int = 1, coint_rank: int = 1, deterministic: Literal['n', 'co', 'ci', 'lo', 'li'] = 'n', seasons: int = 0, first_season: int = 0, **kwargs)

Forecasts using a vector error-correction model (multivariate forecaster).

Parameters:
  • f (MVForecaster) – The MVForecaster object storing the actual series and associated dates.

  • model (str) – Default ‘auto’. ‘auto’ is the only accepted value.

  • Xvars (list[str]) – List of exogenous regressors to use from the passed MVForecaster object. If unspecified, no regressors are used.

  • test_set_actuals (dict[str, list[float]]) – Optional. Test-set actuals to use for testing the model. This enables the dynamic_testing option.

  • lags (int) – Default 1. The number of lags from each series to use in the model.

  • coint_rank (int) – Cointegration rank.

  • deterministic (str) – One of {“n”, “co”, “ci”, “lo”, “li”}. Default “n”. “n” - no deterministic terms. “co” - constant outside the cointegration relation. “ci” - constant within the cointegration relation. “lo” - linear trend outside the cointegration relation. “li” - linear trend within the cointegration relation. Combinations of these are possible (e.g. “cili” or “colo” for linear trend with intercept). When using a constant term you have to choose whether you want to restrict it to the cointegration relation (i.e. “ci”) or leave it unrestricted (i.e. “co”). Do not use both “ci” and “co”. The same applies for “li” and “lo” when using a linear term.

  • seasons (int) – Default 0. Number of periods in a seasonal cycle. 0 means no seasons.

  • first_season (int) – Default 0. Season of the first observation.

  • **kwargs – Passed to the underlying VECM model.
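The combination rules for the deterministic parameter can be sketched as a small validator. The function parse_deterministic is hypothetical and is not how scalecast or the underlying model parses the string; it only encodes the rules stated above: “n” stands alone, terms are two-letter codes, and “ci”/“co” or “li”/“lo” must not appear together.

```python
COMBINABLE_TERMS = {"co", "ci", "lo", "li"}

def parse_deterministic(spec):
    """Hypothetical helper: split a deterministic string into terms and validate it."""
    if spec == "n":
        return ["n"]  # no deterministic terms
    if len(spec) == 0 or len(spec) % 2 != 0:
        raise ValueError(f"cannot split {spec!r} into two-letter terms")
    terms = [spec[i:i + 2] for i in range(0, len(spec), 2)]
    for term in terms:
        if term not in COMBINABLE_TERMS:
            raise ValueError(f"unknown deterministic term {term!r}")
    chosen = set(terms)
    if {"ci", "co"} <= chosen:
        raise ValueError("use either 'ci' or 'co' for the constant, not both")
    if {"li", "lo"} <= chosen:
        raise ValueError("use either 'li' or 'lo' for the linear trend, not both")
    return terms

inside = parse_deterministic("cili")   # constant and trend inside the relation
outside = parse_deterministic("colo")  # constant and trend outside the relation
```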

Methods:

fit(X, y, **fit_params)

Fits the estimator.

fit_predict(X, y)

Runs fit and predict methods, returning predictions.

generate_current_X()

Returns the matrix of the current input dataset.

generate_future_X()

Returns the matrix of the future input dataset.

predict(X[, in_sample])

Makes predictions.

fit(X: ndarray | None, y: ndarray, **fit_params)

Fits the estimator.

Parameters:
  • X (np.ndarray) – The exogenous input data. None is an accepted value.

  • y (np.ndarray) – The observed actuals.

  • **fit_params – Passed to the .fit() method of the underlying VECM model.

Returns:

Self

fit_predict(X, y)

Runs fit and predict methods, returning predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • y (np.ndarray) – The observed actuals.

Returns:

The predictions.

Return type:

list[float]

generate_current_X() ndarray

Returns the matrix of the current input dataset.

generate_future_X() ndarray

Returns the matrix of the future input dataset.

predict(X: ndarray, in_sample: bool = False, **predict_params)

Makes predictions.

Parameters:
  • X (np.ndarray) – The input data.

  • in_sample (bool) – Default False. If True, returns fitted values with a one-step ahead forecast.

  • **predict_params – Passed to the estimator’s predict() method.

Returns:

The predictions.

Return type:

list[float]