Backtesting with ML models
Let’s train forecasting models using different regression models.
Loading the data
Utilities and data libraries
import pandas as pd
import datetime
import json
Load the metadata
raw_json = open("./settings/settings.json")
meta_json = json.load(raw_json)
backtesting_path = meta_json["data"]["backtesting_path"]
Load the dataset and reformat it
ts = pd.read_csv("data/data.csv")
ts["ds"] = pd.to_datetime(ts["ds"])
ts = ts.sort_values("ds")
ts = ts[["unique_id", "ds", "y"]]
ts.head()
| | unique_id | ds | y |
---|---|---|---|
0 | 1 | 2022-11-11 23:00:00 | 456403 |
1 | 1 | 2022-11-12 00:00:00 | 458842 |
2 | 1 | 2022-11-12 01:00:00 | 455111 |
3 | 1 | 2022-11-12 02:00:00 | 448035 |
4 | 1 | 2022-11-12 03:00:00 | 438165 |
Set the data
# os.environ["NIXTLA_ID_AS_COL"] = "1"
from utilsforecast.plotting import plot_series
plot_series(ts, engine = "plotly").update_layout(height=300)
Set the Backtesting Process
In the following example, we will demonstrate how to set up a simple backtesting process that trains multiple machine learning models.
Let’s start by defining the models:
- Regression based on k-nearest neighbors (see model documentation)
- Multi-layer Perceptron regressor (see model documentation)
- ElasticNet - Linear regression with combined L1 and L2 priors as regularizer (see model documentation)
Note that we will set a higher initial maximum number of iterations for the Multi-layer Perceptron and ElasticNet models using the max_iter argument:
from sklearn.linear_model import ElasticNet
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
ml_models = {
    "knn": KNeighborsRegressor(),
    "mlp": MLPRegressor(max_iter=1000),
    "enet": ElasticNet(max_iter=1000)
}
We will use the following features:
- Lags - lag 1 and 24
- Seasonal features - month, day of the month, day of the week, week of the year, and hour of the day
lags = [1, 24]
date_features = ["month", "day", "dayofweek", "week", "hour"]
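To make the feature list concrete, the sketch below builds the same kind of features by hand for a toy hourly series. It only illustrates what lags 1 and 24 and the calendar features represent; it is not how MLForecast computes them internally, and the series values and the helper name build_features are made up for this example.

```python
from datetime import datetime, timedelta

def build_features(timestamps, values, lags=(1, 24)):
    """Return one feature dict per row that has all requested lags available."""
    rows = []
    for i in range(max(lags), len(values)):
        ts = timestamps[i]
        # Lag k is simply the value observed k steps earlier
        row = {f"lag{lag}": values[i - lag] for lag in lags}
        row.update({
            "month": ts.month,
            "day": ts.day,
            "dayofweek": ts.weekday(),    # Monday = 0
            "week": ts.isocalendar()[1],  # ISO week number
            "hour": ts.hour,
            "y": values[i],
        })
        rows.append(row)
    return rows

# Toy hourly series: 48 hours starting at midnight, value = hour index
start = datetime(2024, 1, 1)
timestamps = [start + timedelta(hours=i) for i in range(48)]
values = list(range(48))

features = build_features(timestamps, values)
print(len(features))
print(features[0])
```

The first 24 rows are dropped because lag 24 is not available for them, which is also why longer lags shorten the usable training set.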
Let’s define the forecast object:
from mlforecast import MLForecast
from mlforecast.utils import PredictionIntervals
mlf = MLForecast(
    models=ml_models,
    freq="h",
    lags=lags,
    date_features=date_features
)
Next, we will define the backtesting parameters. We will use four testing partitions, each 72 hours long, with the starts of consecutive partitions spaced 12 hours apart, so adjacent partitions overlap by 60 hours. In addition, we will set 95% prediction intervals using the conformal distribution method:
h = 72
step_size = 12
partitions = 4
n_windows = 3
method = "conformal_distribution"
pi = PredictionIntervals(h=h, n_windows = n_windows, method = method)
levels = [95]
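The sketch below shows how the horizon, the step size, and the number of partitions translate into test windows. It assumes the newest window ends at the final timestamp of the series and each older window is shifted back by the step size. The end timestamp is a hypothetical value chosen for illustration, and backtest_windows is an illustrative helper, not an mlforecast function.

```python
from datetime import datetime, timedelta

def backtest_windows(last_ts, h, step_size, n_windows):
    """Return (cutoff, test_start, test_end) per window, oldest first."""
    windows = []
    for i in range(n_windows):
        # The newest window ends at last_ts; older ones shift back by step_size
        offset = (n_windows - 1 - i) * step_size
        test_end = last_ts - timedelta(hours=offset)
        cutoff = test_end - timedelta(hours=h)
        windows.append((cutoff, cutoff + timedelta(hours=1), test_end))
    return windows

# Hypothetical end of the series, for illustration only
last_ts = datetime(2024, 11, 30, 23)
windows = backtest_windows(last_ts, h=72, step_size=12, n_windows=4)

for cutoff, test_start, test_end in windows:
    print(cutoff, "->", test_start, "to", test_end)

# Consecutive windows share h - step_size hours
overlap_hours = 72 - 12
print(overlap_hours)
```

With a step size smaller than the horizon, the same timestamps are scored in several partitions, so the partition scores are not independent of each other.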
Let’s run the backtesting using the cross_validation method:
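bkt_df = mlf.cross_validation(
    df = ts,
    step_size = step_size,
    n_windows = partitions,
    # Note: the intervals object is built inline with n_windows=2, so the pi object defined above is not used here
    prediction_intervals = PredictionIntervals(n_windows=2, h=h),
    level = levels,
    h = h,
    fitted = True,)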
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.475e+11, tolerance: 9.972e+09
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.601e+11, tolerance: 9.997e+09
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.552e+11, tolerance: 9.973e+09
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.639e+11, tolerance: 9.998e+09
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.564e+11, tolerance: 9.976e+09
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.636e+11, tolerance: 1.000e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.561e+11, tolerance: 9.976e+09
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.512e+11, tolerance: 1.000e+10
bkt_df.head()
| | unique_id | ds | cutoff | y | knn | mlp | enet | knn-lo-95 | knn-hi-95 | mlp-lo-95 | mlp-hi-95 | enet-lo-95 | enet-hi-95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2024-11-26 12:00:00 | 2024-11-26 11:00:00 | 423022 | 416257.1875 | 409043.81250 | 413452.84375 | 408580.041406 | 423934.333594 | 396099.003906 | 421988.621094 | 402097.917187 | 424807.770313 |
1 | 1 | 2024-11-26 13:00:00 | 2024-11-26 11:00:00 | 446189 | 420906.1875 | 420555.43750 | 428518.75000 | 400963.112500 | 440849.262500 | 398524.904687 | 442585.970313 | 409926.205469 | 447111.294531 |
2 | 1 | 2024-11-26 14:00:00 | 2024-11-26 11:00:00 | 458839 | 427260.0000 | 430777.03125 | 441908.21875 | 408422.626563 | 446097.373437 | 408585.832813 | 452968.229687 | 424869.825781 | 458946.611719 |
3 | 1 | 2024-11-26 15:00:00 | 2024-11-26 11:00:00 | 464379 | 443909.8125 | 438137.00000 | 452244.65625 | 425663.901562 | 462155.723438 | 416472.705469 | 459801.294531 | 436616.064063 | 467873.248437 |
4 | 1 | 2024-11-26 16:00:00 | 2024-11-26 11:00:00 | 465544 | 453961.8125 | 443568.50000 | 460505.53125 | 430425.642188 | 477497.982812 | 414669.038281 | 472467.961719 | 437231.853125 | 483779.209375 |
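The lo-95 and hi-95 columns above are conformal prediction intervals. As a rough illustration of the general idea, and not of mlforecast's exact conformal_distribution procedure, the sketch below builds a symmetric interval from the absolute errors of calibration forecasts. All numbers and the helper conformal_interval are hypothetical.

```python
import math

def conformal_interval(forecast, calibration_errors, level=95):
    """Symmetric split-conformal interval around a point forecast.

    calibration_errors: absolute errors |y - yhat| from held-out windows.
    level: nominal coverage in percent.
    """
    n = len(calibration_errors)
    # Rank of the conformal quantile, with the (n + 1) finite-sample correction
    k = math.ceil((n + 1) * level / 100)
    if k > n:
        raise ValueError("not enough calibration errors for this level")
    q = sorted(calibration_errors)[k - 1]
    return forecast - q, forecast + q

# Hypothetical calibration errors: 1000, 2000, ..., 19000
errors = [1000 * i for i in range(1, 20)]

lower_95, upper_95 = conformal_interval(400000, errors, level=95)
lower_80, upper_80 = conformal_interval(400000, errors, level=80)
print(lower_95, upper_95)
print(lower_80, upper_80)
```

The interval width depends only on past errors, so a model with large calibration errors gets wide intervals regardless of how confident its point forecast looks.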
For convenience, we will label each partition by its numeric order rather than by its cutoff timestamp:
cutoff = bkt_df["cutoff"].unique()
partitions_mapping = pd.DataFrame({"cutoff": cutoff,
    "partition": range(1, len(cutoff) + 1)})
partitions_mapping.head()
| | cutoff | partition |
---|---|---|
0 | 2024-11-26 11:00:00 | 1 |
1 | 2024-11-26 23:00:00 | 2 |
2 | 2024-11-27 11:00:00 | 3 |
3 | 2024-11-27 23:00:00 | 4 |
Let’s merge it with the backtesting table:
bkt_df = bkt_df.merge(partitions_mapping, how = "left", on = ["cutoff"])
Last but not least, let’s plot the results:
from plotly.subplots import make_subplots
import plotly.graph_objects as go
partitions_labels = bkt_df["partition"].unique()

ts_sub = ts[ts["ds"] > ts["ds"].max() - datetime.timedelta(hours = 24 * 7)]
fig = make_subplots(rows=partitions, cols=1, subplot_titles= ["Partitions: " + str(i) for i in partitions_labels])

r = 1

for i in partitions_labels:
    if r == 1:
        showlegend = True
    else:
        showlegend = False
    bkt_sub = bkt_df[bkt_df["partition"] == i]
    fig.append_trace(go.Scatter(x= ts_sub["ds"], y=ts_sub["y"], legendgroup = "actual", showlegend = showlegend, mode='lines', name='Actual', line=dict(color='#023047', width=2)), row = r, col = 1)
    fig.append_trace(go.Scatter(x=bkt_sub["ds"], y= bkt_sub["knn"], mode='lines', name='k-nearest neighbors', legendgroup = "knn", showlegend = showlegend, line=dict(color='#2a9d8f', width=1.5, dash = "dash")), row = r, col = 1)
    fig.append_trace(go.Scatter(x=bkt_sub["ds"], y= bkt_sub["mlp"], mode='lines', name='Multi-layer Perceptron',legendgroup = "mlp", showlegend = showlegend, line=dict(color='#0077b6', width=1.5, dash = "dot")), row = r, col = 1)
    fig.append_trace(go.Scatter(x=bkt_sub["ds"], y= bkt_sub["enet"], mode='lines', name='ElasticNet',legendgroup = "enet", showlegend = showlegend, line=dict(color='#ffc8dd', width=1.5, dash = "dot")), row = r, col = 1)
    r = r + 1

fig.update_layout(height=600)
fig.show()
Scoring the models
In this section, we will process the backtesting output and score the models. This includes the following steps:
- Transition the backtesting dataframe from wide to long format
- Calculate the models’ performance on each testing partition
We will use the following error metrics to evaluate the models’ performance:
- MAPE - Mean Absolute Percentage Error
- RMSE - Root Mean Square Error
- Coverage - percentage of actual values that fall within the prediction interval
Let’s reformat the data by transforming the backtesting table, bkt_df, from wide to long.
We will use the melt function to transition the table into long format, where we assign the transformed field names to a new column named model_label and the corresponding values to the value column:
models = list(ml_models.keys())
bkt_long = pd.melt(
    bkt_df,
    id_vars=["unique_id", "ds", "partition", "y"],
    value_vars=models + [f"{model}-lo-95" for model in models]
    + [f"{model}-hi-95" for model in models],
    var_name="model_label",
    value_name="value",
)
bkt_long.head()
| | unique_id | ds | partition | y | model_label | value |
---|---|---|---|---|---|---|
0 | 1 | 2024-11-26 12:00:00 | 1 | 423022 | knn | 416257.1875 |
1 | 1 | 2024-11-26 13:00:00 | 1 | 446189 | knn | 420906.1875 |
2 | 1 | 2024-11-26 14:00:00 | 1 | 458839 | knn | 427260.0000 |
3 | 1 | 2024-11-26 15:00:00 | 1 | 464379 | knn | 443909.8125 |
4 | 1 | 2024-11-26 16:00:00 | 1 | 465544 | knn | 453961.8125 |
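To make the reshaping concrete, here is a plain-Python sketch of what a wide-to-long melt does to a single row: the identifier fields are repeated, and every model column becomes its own row, with the column name stored in model_label and the number stored in value. The row reuses the first knn record of the backtesting table; melt_row is a hypothetical helper, not a pandas function.

```python
def melt_row(row, id_vars, value_vars, var_name="model_label", value_name="value"):
    """Turn one wide record into one long record per value column."""
    ids = {key: row[key] for key in id_vars}
    return [{**ids, var_name: col, value_name: row[col]} for col in value_vars]

wide_row = {
    "unique_id": 1, "ds": "2024-11-26 12:00:00", "partition": 1, "y": 423022,
    "knn": 416257.1875, "knn-lo-95": 408580.041406, "knn-hi-95": 423934.333594,
}

long_rows = melt_row(
    wide_row,
    id_vars=["unique_id", "ds", "partition", "y"],
    value_vars=["knn", "knn-lo-95", "knn-hi-95"],
)
for record in long_rows:
    print(record)
```

One wide row with three value columns becomes three long rows, which is why the long table has one row per timestamp, partition, and forecast field.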
We will use the following function to relabel the forecast and prediction interval values as forecast, lower, and upper:
def split_model_confidence(model_name):
    if "-lo-95" in model_name:
        return model_name.replace("-lo-95", ""), "lower"
    elif "-hi-95" in model_name:
        return model_name.replace("-hi-95", ""), "upper"
    else:
        return model_name, "forecast"

bkt_long["model_label"],\
bkt_long["type"] = zip(*bkt_long["model_label"].map(split_model_confidence))
bkt_long.head()
| | unique_id | ds | partition | y | model_label | value | type |
---|---|---|---|---|---|---|---|
0 | 1 | 2024-11-26 12:00:00 | 1 | 423022 | knn | 416257.1875 | forecast |
1 | 1 | 2024-11-26 13:00:00 | 1 | 446189 | knn | 420906.1875 | forecast |
2 | 1 | 2024-11-26 14:00:00 | 1 | 458839 | knn | 427260.0000 | forecast |
3 | 1 | 2024-11-26 15:00:00 | 1 | 464379 | knn | 443909.8125 | forecast |
4 | 1 | 2024-11-26 16:00:00 | 1 | 465544 | knn | 453961.8125 | forecast |
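The relabeling function above is tied to the 95% level. If other levels are used, one possible variant, sketched here under the assumption that interval columns always follow the model-lo-level and model-hi-level naming pattern, is to parse the label with a regular expression. The name split_label is hypothetical.

```python
import re

# Matches labels such as "knn-lo-95" or "mlp1-hi-80"
PI_PATTERN = re.compile(r"^(?P<model>.+)-(?P<bound>lo|hi)-(?P<level>\d+(?:\.\d+)?)$")

def split_label(label):
    """Return (model_label, type), where type is forecast, lower, or upper."""
    match = PI_PATTERN.match(label)
    if match is None:
        # No interval suffix: this is the point forecast column
        return label, "forecast"
    bound_type = "lower" if match.group("bound") == "lo" else "upper"
    return match.group("model"), bound_type

labels = ["knn", "knn-lo-95", "knn-hi-95", "mlp1-lo-80"]

# zip(*...) turns the list of (model, type) pairs into two parallel tuples
model_labels, types = zip(*map(split_label, labels))
print(model_labels)
print(types)
```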
In addition, we will map each model’s class name to the labels we created, and later merge this mapping with the backtesting table:
model_label = list(ml_models.keys())
model_name = [type(s).__name__ for s in list(ml_models.values())]

models_mapping = pd.DataFrame({"model_label": model_label, "model_name": model_name})

models_mapping
| | model_label | model_name |
---|---|---|
0 | knn | KNeighborsRegressor |
1 | mlp | MLPRegressor |
2 | enet | ElasticNet |
Next, let’s use the pivot function to pivot the type field into three new fields and merge the result with the models mapping table:
bkt = (bkt_long
    .pivot(index = ["unique_id", "ds", "model_label","partition", "y"], columns = "type", values = "value")
    .reset_index()
    .merge(models_mapping, how = "left", on = ["model_label"])
)
bkt.head()
| | unique_id | ds | model_label | partition | y | forecast | lower | upper | model_name |
---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2024-11-26 12:00:00 | enet | 1 | 423022 | 413452.84375 | 402097.917187 | 424807.770313 | ElasticNet |
1 | 1 | 2024-11-26 12:00:00 | knn | 1 | 423022 | 416257.18750 | 408580.041406 | 423934.333594 | KNeighborsRegressor |
2 | 1 | 2024-11-26 12:00:00 | mlp | 1 | 423022 | 409043.81250 | 396099.003906 | 421988.621094 | MLPRegressor |
3 | 1 | 2024-11-26 13:00:00 | enet | 1 | 446189 | 428518.75000 | 409926.205469 | 447111.294531 | ElasticNet |
4 | 1 | 2024-11-26 13:00:00 | knn | 1 | 446189 | 420906.18750 | 400963.112500 | 440849.262500 | KNeighborsRegressor |
Now we can score the models’ results using the following helper functions:
from statistics import mean
def mape(y, yhat):
    mape = mean(abs(y - yhat)/ y)
    return mape
def rmse(y, yhat):
    rmse = (mean((y - yhat) ** 2 )) ** 0.5
    return rmse
def coverage(y, lower, upper):
    coverage = sum((y <= upper) & (y >= lower)) / len(y)
    return coverage
def score(df):
    mape_score = mape(y = df["y"], yhat = df["forecast"])
    rmse_score = rmse(y = df["y"], yhat = df["forecast"])
    coverage_score = coverage(y = df["y"], lower = df["lower"], upper = df["upper"])
    cols = ["mape","rmse", "coverage"]
    d = pd.Series([mape_score, rmse_score, coverage_score], index=cols)

    return d
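As a quick sanity check of the three metrics, the sketch below recomputes them on a handful of toy numbers. These are plain-list versions written for this illustration rather than the pandas-based helpers, and the values are made up, not taken from the backtesting output.

```python
from statistics import mean

def mape(y, yhat):
    # Mean of |actual - forecast| / actual
    return mean(abs(a - f) / a for a, f in zip(y, yhat))

def rmse(y, yhat):
    # Square root of the mean squared error
    return mean((a - f) ** 2 for a, f in zip(y, yhat)) ** 0.5

def coverage(y, lower, upper):
    # Share of actual values that fall inside [lower, upper]
    return sum(lo <= a <= hi for a, lo, hi in zip(y, lower, upper)) / len(y)

# Toy values for illustration only
y     = [100, 200, 400, 500]
yhat  = [110, 190, 380, 500]
lower = [ 95, 195, 350, 480]
upper = [125, 205, 390, 520]

mape_score = mape(y, yhat)
rmse_score = rmse(y, yhat)
coverage_score = coverage(y, lower, upper)
print(mape_score, rmse_score, coverage_score)
```

Three of the four actual values fall inside their intervals, so the coverage is 0.75; a well-calibrated 95% interval should score close to 0.95 on a large enough sample.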
We will group the backtesting table by the series unique id, model label, and partition, and calculate the score of each group:
score_df = (bkt
    .groupby(["unique_id", "model_label", "model_name", "partition"])[["unique_id", "model_label", "model_name", "partition", "y", "forecast", "lower", "upper"]]
    .apply(score)
    .reset_index()
)
score_df
| | unique_id | model_label | model_name | partition | mape | rmse | coverage |
---|---|---|---|---|---|---|---|
0 | 1 | enet | ElasticNet | 1 | 0.046387 | 26248.918613 | 0.791667 |
1 | 1 | enet | ElasticNet | 2 | 0.040316 | 22284.199211 | 0.722222 |
2 | 1 | enet | ElasticNet | 3 | 0.048238 | 25473.070455 | 0.541667 |
3 | 1 | enet | ElasticNet | 4 | 0.056620 | 29758.941497 | 0.597222 |
4 | 1 | knn | KNeighborsRegressor | 1 | 0.071400 | 38516.048318 | 0.708333 |
5 | 1 | knn | KNeighborsRegressor | 2 | 0.065884 | 36684.848083 | 0.597222 |
6 | 1 | knn | KNeighborsRegressor | 3 | 0.039062 | 20619.197374 | 0.861111 |
7 | 1 | knn | KNeighborsRegressor | 4 | 0.046386 | 24855.016077 | 0.763889 |
8 | 1 | mlp | MLPRegressor | 1 | 0.051659 | 26748.692236 | 0.875000 |
9 | 1 | mlp | MLPRegressor | 2 | 0.049206 | 25506.360560 | 0.847222 |
10 | 1 | mlp | MLPRegressor | 3 | 0.042411 | 22529.148592 | 0.763889 |
11 | 1 | mlp | MLPRegressor | 4 | 0.051685 | 27958.964509 | 0.722222 |
Creating a leaderboard table:
leaderboard = (bkt
    .groupby(["unique_id", "model_label", "model_name"])[["unique_id", "model_label", "model_name", "partition", "y", "forecast", "lower", "upper"]]
    .apply(score)
    .reset_index()
    .sort_values(by = "mape")
)
leaderboard
| | unique_id | model_label | model_name | mape | rmse | coverage |
---|---|---|---|---|---|---|
0 | 1 | enet | ElasticNet | 0.047890 | 26077.112476 | 0.663194 |
2 | 1 | mlp | MLPRegressor | 0.048740 | 25764.962634 | 0.802083 |
1 | 1 | knn | KNeighborsRegressor | 0.055683 | 31113.450393 | 0.732639 |
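The leaderboard scores each model on all partitions pooled together. The sketch below relates those pooled numbers to the per-partition scores of the ElasticNet model shown earlier, assuming every partition holds the same number of test points (72 hours each). Under that assumption, MAPE and coverage are plain averages of the partition values, while RMSE has to be pooled through the squared errors.

```python
from statistics import mean

# Per-partition ElasticNet scores copied from the per-partition score table
enet_mape     = [0.046387, 0.040316, 0.048238, 0.056620]
enet_rmse     = [26248.918613, 22284.199211, 25473.070455, 29758.941497]
enet_coverage = [0.791667, 0.722222, 0.541667, 0.597222]

# MAPE and coverage average over observations, so with equal-sized
# partitions the pooled value equals the mean of the partition values
pooled_mape = mean(enet_mape)
pooled_coverage = mean(enet_coverage)

# RMSE pools the squared errors first and takes the square root last
pooled_rmse = mean(r ** 2 for r in enet_rmse) ** 0.5

# Averaging the partition RMSEs directly gives a smaller, different number
naive_rmse = mean(enet_rmse)

print(round(pooled_mape, 6), round(pooled_rmse, 3), round(pooled_coverage, 6))
print(round(naive_rmse, 3))
```

The pooled values agree with the ElasticNet row of the leaderboard up to rounding, which confirms that the leaderboard is not a simple average of partition RMSEs.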
Experimentation
Let’s now generalize the previous steps and set up an experiment. The main goal is to identify which model performs best. This includes identifying the tuning parameters and features that yield the best performance.
Let’s take the three models we used before and try different tuning parameters:
- For the Multi-layer Perceptron regressor, we will test different hidden layer size settings
- For the ElasticNet model, we will test different l1 ratios, which define the ratio between the L1 and L2 penalties
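To illustrate what the l1 ratio controls, the sketch below evaluates the penalty term of the ElasticNet objective as described in the scikit-learn documentation, alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2, for a pair of hypothetical coefficients. The helper elastic_net_penalty is written for this example and is not a scikit-learn function.

```python
def elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.5):
    """Penalty term of the ElasticNet objective for a coefficient vector."""
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w ** 2 for w in weights)
    return alpha * l1_ratio * l1 + 0.5 * alpha * (1 - l1_ratio) * l2_squared

weights = [3.0, -4.0]  # hypothetical coefficients

penalties = {ratio: elastic_net_penalty(weights, l1_ratio=ratio) for ratio in (0, 0.5, 1)}
for ratio, penalty in penalties.items():
    print(ratio, penalty)
```

A ratio of 0 leaves only the L2 (ridge) penalty and a ratio of 1 leaves only the L1 (lasso) penalty, so the three ElasticNet settings span pure ridge, an even mix, and pure lasso.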
ml_models = {
    "knn": KNeighborsRegressor(),
    "mlp1": MLPRegressor(max_iter=2000, hidden_layer_sizes = (100,)),
    "mlp2": MLPRegressor(max_iter=2000, hidden_layer_sizes = (50,)),
    "mlp3": MLPRegressor(max_iter=2000, hidden_layer_sizes = (200,)),
    "enet1": ElasticNet(max_iter=2000, l1_ratio = 0, tol=0.001),
    "enet2": ElasticNet(max_iter=2000, l1_ratio = 0.5, tol=0.001),
    "enet3": ElasticNet(max_iter=2000, l1_ratio = 1, tol=0.001),
}
We will use the same features settings as before:
lags = [1, 24]
date_features = ["month", "day", "dayofweek", "week", "hour"]
And the same backtesting settings:
h = 72
step_size = 12
partitions = 4
n_windows = 3
method = "conformal_distribution"
pi = PredictionIntervals(h=h, n_windows = n_windows, method = method)
levels = [95]
Let’s set the forecasting object:
mlf = MLForecast(
    models=ml_models,
    freq="h",
    lags=lags,
    date_features=date_features
)
And apply the backtesting:
bkt_df = mlf.cross_validation(
    df = ts,
    step_size = step_size,
    n_windows = partitions,
    prediction_intervals = PredictionIntervals(n_windows=2, h=h),
    level = levels,
    h = h,
    fitted = True,)
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.717e+11, tolerance: 9.972e+10 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.518e+11, tolerance: 9.972e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.344e+11, tolerance: 9.972e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.781e+11, tolerance: 9.997e+10 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.616e+11, tolerance: 9.997e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.720e+11, tolerance: 9.973e+10 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.560e+11, tolerance: 9.973e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.395e+11, tolerance: 9.973e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.785e+11, tolerance: 9.998e+10 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.635e+11, tolerance: 9.998e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.458e+11, tolerance: 9.998e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.724e+11, tolerance: 9.976e+10 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.571e+11, tolerance: 9.976e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.341e+11, tolerance: 9.976e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.787e+11, tolerance: 1.000e+11 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.613e+11, tolerance: 1.000e+11
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.426e+11, tolerance: 1.000e+11
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.743e+11, tolerance: 9.976e+10 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.573e+11, tolerance: 9.976e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.337e+11, tolerance: 9.976e+10
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.790e+11, tolerance: 1.000e+11 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.549e+11, tolerance: 1.000e+11
/opt/forecasting-poc/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:697: ConvergenceWarning:
Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.474e+11, tolerance: 1.000e+11
We will use the following function to transform the backtesting object from wide to long:
def bkt_wide_2_long(bkt, models):
    # Mapping the models labels
    model_labels = list(models.keys())
    model_name = [type(s).__name__ for s in models.values()]
    models_mapping = pd.DataFrame({"model_label": model_labels, "model_name": model_name})
    # Mapping the partitions
    cutoff = bkt["cutoff"].unique()
    partitions_mapping = pd.DataFrame({"cutoff": cutoff,
        "partition": range(1, len(cutoff) + 1)})
    bkt = bkt.merge(partitions_mapping, how = "left", on = ["cutoff"])
    # Melting the bkt object to long
    bkt_long = pd.melt(
        bkt,
        id_vars=["unique_id", "ds", "partition", "y"],
        value_vars=model_labels + [f"{model}-lo-95" for model in model_labels]
        + [f"{model}-hi-95" for model in model_labels],
        var_name="model_label",
        value_name="value",
    )
    bkt_long["model_label"],bkt_long["type"] = zip(*bkt_long["model_label"].map(split_model_confidence))

    bkt = (bkt_long
        .pivot(index = ["unique_id", "ds", "model_label","partition", "y"], columns = "type", values = "value")
        .reset_index()
        .merge(models_mapping, how = "left", on = ["model_label"])
    )
    return bkt

bkt = bkt_wide_2_long(bkt = bkt_df, models = ml_models)

bkt.head()
| | unique_id | ds | model_label | partition | y | forecast | lower | upper | model_name |
---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2024-11-26 12:00:00 | enet1 | 1 | 423022 | 412914.00000 | 401216.973438 | 424611.026562 | ElasticNet |
1 | 1 | 2024-11-26 12:00:00 | enet2 | 1 | 423022 | 413452.84375 | 402097.917187 | 424807.770313 | ElasticNet |
2 | 1 | 2024-11-26 12:00:00 | enet3 | 1 | 423022 | 414156.62500 | 403249.959375 | 425063.290625 | ElasticNet |
3 | 1 | 2024-11-26 12:00:00 | knn | 1 | 423022 | 416257.18750 | 408580.041406 | 423934.333594 | KNeighborsRegressor |
4 | 1 | 2024-11-26 12:00:00 | mlp1 | 1 | 423022 | 409429.62500 | 396350.746875 | 422508.503125 | MLPRegressor |
score_df = (bkt
    .groupby(["unique_id", "model_label", "model_name", "partition"])[["unique_id", "model_label", "model_name", "partition", "y", "forecast", "lower", "upper"]]
    .apply(score)
    .reset_index()
)
score_df
| | unique_id | model_label | model_name | partition | mape | rmse | coverage |
---|---|---|---|---|---|---|---|
0 | 1 | enet1 | ElasticNet | 1 | 0.046022 | 25850.656158 | 0.805556 |
1 | 1 | enet1 | ElasticNet | 2 | 0.040436 | 22229.766257 | 0.763889 |
2 | 1 | enet1 | ElasticNet | 3 | 0.047834 | 25163.856352 | 0.569444 |
3 | 1 | enet1 | ElasticNet | 4 | 0.056144 | 29338.262505 | 0.569444 |
4 | 1 | enet2 | ElasticNet | 1 | 0.046387 | 26248.881182 | 0.791667 |
5 | 1 | enet2 | ElasticNet | 2 | 0.040316 | 22284.082250 | 0.722222 |
6 | 1 | enet2 | ElasticNet | 3 | 0.048238 | 25473.073322 | 0.541667 |
7 | 1 | enet2 | ElasticNet | 4 | 0.056620 | 29758.980748 | 0.597222 |
8 | 1 | enet3 | ElasticNet | 1 | 0.046769 | 26734.650723 | 0.750000 |
9 | 1 | enet3 | ElasticNet | 2 | 0.040188 | 22320.745610 | 0.680556 |
10 | 1 | enet3 | ElasticNet | 3 | 0.048707 | 25853.755042 | 0.555556 |
11 | 1 | enet3 | ElasticNet | 4 | 0.057163 | 30287.218437 | 0.569444 |
12 | 1 | knn | KNeighborsRegressor | 1 | 0.071400 | 38516.048318 | 0.708333 |
13 | 1 | knn | KNeighborsRegressor | 2 | 0.065884 | 36684.848083 | 0.597222 |
14 | 1 | knn | KNeighborsRegressor | 3 | 0.039062 | 20619.197374 | 0.861111 |
15 | 1 | knn | KNeighborsRegressor | 4 | 0.046386 | 24855.016077 | 0.763889 |
16 | 1 | mlp1 | MLPRegressor | 1 | 0.052514 | 27242.245521 | 0.861111 |
17 | 1 | mlp1 | MLPRegressor | 2 | 0.048664 | 24604.639462 | 0.847222 |
18 | 1 | mlp1 | MLPRegressor | 3 | 0.047530 | 23878.789059 | 0.750000 |
19 | 1 | mlp1 | MLPRegressor | 4 | 0.049799 | 27118.050934 | 0.680556 |
20 | 1 | mlp2 | MLPRegressor | 1 | 0.059734 | 32490.520613 | 0.722222 |
21 | 1 | mlp2 | MLPRegressor | 2 | 0.051235 | 27731.687214 | 0.763889 |
22 | 1 | mlp2 | MLPRegressor | 3 | 0.079752 | 42953.540592 | 0.652778 |
23 | 1 | mlp2 | MLPRegressor | 4 | 0.050840 | 27427.795038 | 0.722222 |
24 | 1 | mlp3 | MLPRegressor | 1 | 0.050637 | 26258.505875 | 0.875000 |
25 | 1 | mlp3 | MLPRegressor | 2 | 0.049389 | 25730.788109 | 0.750000 |
26 | 1 | mlp3 | MLPRegressor | 3 | 0.051717 | 27861.066305 | 0.736111 |
27 | 1 | mlp3 | MLPRegressor | 4 | 0.049857 | 26739.627054 | 0.666667 |
Logging the Results to MLflow
Let’s load the MLflow library and define the experiment name:
import mlflow
import datetime
experiment_name = "ml_forecast_exp01"
mlflow_path = "file:///mlruns"
We will log the backtesting parameters as tags:
# Note: tags is expected to hold the backtesting parameters; its definition is not shown in this section
try:
    mlflow.create_experiment(name = experiment_name,
                             artifact_location = mlflow_path,
                             tags = tags)
    meta = mlflow.get_experiment_by_name(experiment_name)
    print(f"Set a new experiment {experiment_name}")
    print("Pulling the metadata")
except:
    print(f"Experiment {experiment_name} exists, pulling the metadata")
    meta = mlflow.get_experiment_by_name(experiment_name)
Set a new experiment ml_forecast_exp01
Pulling the metadata
run_time = datetime.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
for index, row in score_df.iterrows():
    run_name = row["model_label"] + "-" + run_time
    with mlflow.start_run(experiment_id = meta.experiment_id, run_name = run_name,
                tags = {"type": "backtesting","partition": row["partition"],
                "unique_id": row["unique_id"],"model_label": row["model_label"],
                "model_name": row["model_name"],"run_name": run_name}) as run:
        model_params = ml_models[row["model_label"]].get_params()
        model_params["model_name"] = row["model_name"]
        model_params["model_label"] = row["model_label"]
        model_params["partition"] = row["partition"]
        model_params["lags"] = lags
        model_params["date_features"] = date_features

        mlflow.log_params(model_params)
        mlflow.log_metric("mape", row["mape"])
        mlflow.log_metric("rmse", row["rmse"])
        mlflow.log_metric("coverage", row["coverage"])
results = mlflow.search_runs(experiment_ids=[meta.experiment_id], order_by=["metrics.mape"])

results.head()
| | run_id | experiment_id | status | artifact_uri | start_time | end_time | metrics.coverage | metrics.mape | metrics.rmse | params.partition | ... | tags.partition | tags.type | tags.mlflow.source.name | tags.mlflow.runName | tags.run_name | tags.unique_id | tags.mlflow.source.type | tags.mlflow.user | tags.model_name | tags.model_label |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3b6c0f1d216a4914a76d0e5c0f90e316 | 796115560103651146 | FINISHED | file:///mlruns/3b6c0f1d216a4914a76d0e5c0f90e31... | 2025-01-12 20:45:05.917000+00:00 | 2025-01-12 20:45:06.077000+00:00 | 0.861111 | 0.039062 | 20619.197374 | 3 | ... | 3 | backtesting | /opt/forecasting-poc/lib/python3.10/site-packa... | knn-2025-01-12 20-45-03 | knn-2025-01-12 20-45-03 | 1 | LOCAL | root | KNeighborsRegressor | knn |
1 | 80302de0c1954afdaf121afb909e7788 | 796115560103651146 | FINISHED | file:///mlruns/80302de0c1954afdaf121afb909e778... | 2025-01-12 20:45:05.039000+00:00 | 2025-01-12 20:45:05.200000+00:00 | 0.680556 | 0.040188 | 22320.745610 | 2 | ... | 2 | backtesting | /opt/forecasting-poc/lib/python3.10/site-packa... | enet3-2025-01-12 20-45-03 | enet3-2025-01-12 20-45-03 | 1 | LOCAL | root | ElasticNet | enet3 |
2 | 73611b1476b3406c84ceaf8f8940e851 | 796115560103651146 | FINISHED | file:///mlruns/73611b1476b3406c84ceaf8f8940e85... | 2025-01-12 20:45:04.300000+00:00 | 2025-01-12 20:45:04.484000+00:00 | 0.722222 | 0.040316 | 22284.082250 | 2 | ... | 2 | backtesting | /opt/forecasting-poc/lib/python3.10/site-packa... | enet2-2025-01-12 20-45-03 | enet2-2025-01-12 20-45-03 | 1 | LOCAL | root | ElasticNet | enet2 |
3 | 302336a224ff4683a8e174920aa1577c | 796115560103651146 | FINISHED | file:///mlruns/302336a224ff4683a8e174920aa1577... | 2025-01-12 20:45:03.560000+00:00 | 2025-01-12 20:45:03.729000+00:00 | 0.763889 | 0.040436 | 22229.766257 | 2 | ... | 2 | backtesting | /opt/forecasting-poc/lib/python3.10/site-packa... | enet1-2025-01-12 20-45-03 | enet1-2025-01-12 20-45-03 | 1 | LOCAL | root | ElasticNet | enet1 |
4 | aebebe58489447548066582943d6b64a | 796115560103651146 | FINISHED | file:///mlruns/aebebe58489447548066582943d6b64... | 2025-01-12 20:45:03.382000+00:00 | 2025-01-12 20:45:03.552000+00:00 | 0.805556 | 0.046022 | 25850.656158 | 1 | ... | 1 | backtesting | /opt/forecasting-poc/lib/python3.10/site-packa... | enet1-2025-01-12 20-45-03 | enet1-2025-01-12 20-45-03 | 1 | LOCAL | root | ElasticNet | enet1 |
5 rows × 61 columns
Plot error distribution:
import plotly.express as px
fig = px.box(x= results["tags.model_label"], y= 100 * results["metrics.mape"], color= results["tags.model_name"])

# Add jitter
fig.update_traces(boxpoints='all', jitter=0.3, pointpos=-2)

fig.update_layout(
    title = "Models Error Distribution",
    legend_title_text = "Models Family",
    xaxis_title="Model Label",
    yaxis_title="MAPE (%)"
)
Identify best model:
leaderboard = (results.
    groupby(["experiment_id", "status", "tags.model_label", "tags.model_name"])["metrics.mape"]
    .mean()
    .reset_index()
    .sort_values(by = "metrics.mape")
)
leaderboard
| | experiment_id | status | tags.model_label | tags.model_name | metrics.mape |
---|---|---|---|---|---|
0 | 796115560103651146 | FINISHED | enet1 | ElasticNet | 0.047609 |
1 | 796115560103651146 | FINISHED | enet2 | ElasticNet | 0.047890 |
2 | 796115560103651146 | FINISHED | enet3 | ElasticNet | 0.048207 |
4 | 796115560103651146 | FINISHED | mlp1 | MLPRegressor | 0.049627 |
6 | 796115560103651146 | FINISHED | mlp3 | MLPRegressor | 0.050400 |
3 | 796115560103651146 | FINISHED | knn | KNeighborsRegressor | 0.055683 |
5 | 796115560103651146 | FINISHED | mlp2 | MLPRegressor | 0.060390 |
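The ranking above is the mean of each model’s per-partition MAPE. As a cross-check that does not need MLflow, the sketch below redoes that aggregation in plain Python, using the per-partition MAPE values copied from the score table earlier in this section. The dictionary layout is an assumption made for the example.

```python
from statistics import mean

# Per-partition MAPE values (partitions 1 to 4) copied from the score table
mape_by_partition = {
    "enet1": [0.046022, 0.040436, 0.047834, 0.056144],
    "enet2": [0.046387, 0.040316, 0.048238, 0.056620],
    "enet3": [0.046769, 0.040188, 0.048707, 0.057163],
    "knn":   [0.071400, 0.065884, 0.039062, 0.046386],
    "mlp1":  [0.052514, 0.048664, 0.047530, 0.049799],
    "mlp2":  [0.059734, 0.051235, 0.079752, 0.050840],
    "mlp3":  [0.050637, 0.049389, 0.051717, 0.049857],
}

# Average across partitions, then sort from lowest to highest error
ranking = sorted(
    ((label, mean(scores)) for label, scores in mape_by_partition.items()),
    key=lambda item: item[1],
)

for label, score in ranking:
    print(f"{label:6s} {score:.6f}")
```

The gaps between the top models are small compared with the spread across partitions (for example, knn ranges from about 3.9% to 7.1%), so the ordering among the leading models should be read with some caution.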