Temperature Forecasting: Statistical Approach#
Requirements#
import polars as pl
import pandas as pd
import numpy as np
import sys
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")
from statsmodels.tsa.seasonal import STL
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from datetime import timedelta
from sklearn.metrics import mean_absolute_error
import datetime
import warnings
warnings.filterwarnings("ignore")
The following classes and functions were developed for this project and are located in the Python script PyTS.py, which is also attached to the deliverable.
sys.path.insert(0, r'C:\Users\fscielzo\Documents\DataScience-GitHub\Time Series')
from PyTS import MakeLags, SARIMA, SimpleExpSmooth, LinearRegressionTS, KNeighborsRegressorTS, train_test_split_time_series, predictive_time_series_plot, KFold_score_time_series, KFold_time_series_plot, autoSARIMA
Data#
Conceptual description#
Jena Climate is a weather time series dataset recorded at the weather station of the Max Planck Institute for Biogeochemistry in Jena, Germany.
The dataset is made up of 14 different quantities (such as air temperature, atmospheric pressure, humidity, wind direction, and so on), recorded every 10 minutes over several years. It covers data from January 1st, 2009 to December 31st, 2016.
The dataset can be found on Kaggle: https://www.kaggle.com/datasets/mnassrib/jena-climate
The variable names below are the short column names used throughout this notebook.

Variable Name | Description | Type
---|---|---
`date` | Date-time reference | Date
`p` | Atmospheric pressure. The pascal is the SI derived unit of pressure used to quantify internal pressure; meteorological reports typically state atmospheric pressure in millibars. | quantitative
`T` | Temperature in Celsius | quantitative
`Tpot` | Temperature in Kelvin | quantitative
`Tdew` | Dew point temperature in Celsius. The dew point is a measure of the absolute amount of water in the air: it is the temperature at which the air can no longer hold all of its moisture and water condenses. | quantitative
`rh` | Relative humidity: a measure of how saturated the air is with water vapor; the %RH determines the amount of water contained within collection objects. | quantitative
`VPmax` | Saturation vapor pressure | quantitative
`VPact` | Vapor pressure | quantitative
`VPdef` | Vapor pressure deficit | quantitative
`sh` | Specific humidity | quantitative
`H2OC` | Water vapor concentration | quantitative
`rho` | Airtight (air density) | quantitative
`wv` | Wind speed | quantitative
`max_wv` | Maximum wind speed | quantitative
`wd` | Wind direction in degrees | quantitative
Preprocessing the data#
The next piece of code reads the data, renames its columns, converts the date column to an appropriate date format, adds columns with the day, week, month, quarter and year of each observation, and removes the last row, which is the only data point from 2017.
climate_df = pl.read_csv('jena_climate_2009_2016.csv')
climate_df.columns = ['date', 'p', 'T', 'Tpot', 'Tdew', 'rh', 'VPmax', 'VPact', 'VPdef',
'sh', 'H2OC', 'rho', 'wv', 'max_wv', 'wd']
climate_df = climate_df.with_columns(pl.col("date").str.to_date("%d.%m.%Y %H:%M:%S").name.keep())
climate_df = climate_df.with_columns(climate_df['date'].dt.day().alias('day'),
climate_df['date'].dt.month().alias('month'),
climate_df['date'].dt.year().alias('year'),
climate_df['date'].dt.week().alias('week'),
climate_df['date'].dt.quarter().alias('quarter'))
climate_df = climate_df[:-1,:] # removing last row, just because is the only data point regarding 2017
The data has 420550 rows and 20 columns.
climate_df.shape
(420550, 20)
We print the head and tail of the data.
climate_df.head()
date | p | T | Tpot | Tdew | rh | VPmax | VPact | VPdef | sh | H2OC | rho | wv | max_wv | wd | day | month | year | week | quarter |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
date | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | i8 | i8 | i32 | i8 | i8 |
2009-01-01 | 996.52 | -8.02 | 265.4 | -8.9 | 93.3 | 3.33 | 3.11 | 0.22 | 1.94 | 3.12 | 1307.75 | 1.03 | 1.75 | 152.3 | 1 | 1 | 2009 | 1 | 1 |
2009-01-01 | 996.57 | -8.41 | 265.01 | -9.28 | 93.4 | 3.23 | 3.02 | 0.21 | 1.89 | 3.03 | 1309.8 | 0.72 | 1.5 | 136.1 | 1 | 1 | 2009 | 1 | 1 |
2009-01-01 | 996.53 | -8.51 | 264.91 | -9.31 | 93.9 | 3.21 | 3.01 | 0.2 | 1.88 | 3.02 | 1310.24 | 0.19 | 0.63 | 171.6 | 1 | 1 | 2009 | 1 | 1 |
2009-01-01 | 996.51 | -8.31 | 265.12 | -9.07 | 94.2 | 3.26 | 3.07 | 0.19 | 1.92 | 3.08 | 1309.19 | 0.34 | 0.5 | 198.0 | 1 | 1 | 2009 | 1 | 1 |
2009-01-01 | 996.51 | -8.27 | 265.15 | -9.04 | 94.1 | 3.27 | 3.08 | 0.19 | 1.92 | 3.09 | 1309.0 | 0.32 | 0.63 | 214.3 | 1 | 1 | 2009 | 1 | 1 |
climate_df.tail()
date | p | T | Tpot | Tdew | rh | VPmax | VPact | VPdef | sh | H2OC | rho | wv | max_wv | wd | day | month | year | week | quarter |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
date | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | i8 | i8 | i32 | i8 | i8 |
2016-12-31 | 1000.11 | -3.93 | 269.23 | -8.09 | 72.6 | 4.56 | 3.31 | 1.25 | 2.06 | 3.31 | 1292.41 | 0.56 | 1.0 | 202.6 | 31 | 12 | 2016 | 52 | 4 |
2016-12-31 | 1000.07 | -4.05 | 269.1 | -8.13 | 73.1 | 4.52 | 3.3 | 1.22 | 2.06 | 3.3 | 1292.98 | 0.67 | 1.52 | 240.0 | 31 | 12 | 2016 | 52 | 4 |
2016-12-31 | 999.93 | -3.35 | 269.81 | -8.06 | 69.71 | 4.77 | 3.32 | 1.44 | 2.07 | 3.32 | 1289.44 | 1.14 | 1.92 | 234.3 | 31 | 12 | 2016 | 52 | 4 |
2016-12-31 | 999.82 | -3.16 | 270.01 | -8.21 | 67.91 | 4.84 | 3.28 | 1.55 | 2.05 | 3.28 | 1288.39 | 1.08 | 2.0 | 215.2 | 31 | 12 | 2016 | 52 | 4 |
2016-12-31 | 999.81 | -4.23 | 268.94 | -8.53 | 71.8 | 4.46 | 3.2 | 1.26 | 1.99 | 3.2 | 1293.56 | 1.49 | 2.16 | 225.8 | 31 | 12 | 2016 | 52 | 4 |
We compute a quick descriptive summary of the data.
climate_df.describe()
describe | date | p | T | Tpot | Tdew | rh | VPmax | VPact | VPdef | sh | H2OC | rho | wv | max_wv | wd | day | month | year | week | quarter |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
"count" | "420550" | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 | 420550.0 |
"null_count" | "0" | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
"mean" | null | 989.212751 | 9.450181 | 283.492779 | 4.955886 | 76.00826 | 13.576273 | 9.533771 | 4.042419 | 6.022418 | 9.640238 | 1216.062557 | 1.702225 | 3.056558 | 174.743714 | 15.713359 | 6.51732 | 2012.496802 | 26.617729 | 2.506375 |
"std" | null | 8.358475 | 8.423346 | 8.504449 | 6.730651 | 16.476195 | 7.739016 | 4.184158 | 4.896855 | 2.656135 | 4.235388 | 39.975064 | 65.446792 | 69.017014 | 86.681794 | 8.799074 | 3.448315 | 2.289752 | 15.060659 | 1.116766 |
"min" | "2009-01-01" | 913.6 | -23.01 | 250.6 | -25.01 | 12.95 | 0.95 | 0.79 | 0.0 | 0.5 | 0.8 | 1059.45 | -9999.0 | -9999.0 | 0.0 | 1.0 | 1.0 | 2009.0 | 1.0 | 1.0 |
"25%" | null | 984.2 | 3.36 | 277.43 | 0.24 | 65.21 | 7.78 | 6.21 | 0.87 | 3.92 | 6.29 | 1187.49 | 0.99 | 1.76 | 124.9 | 8.0 | 4.0 | 2010.0 | 14.0 | 2.0 |
"50%" | null | 989.58 | 9.42 | 283.47 | 5.22 | 79.3 | 11.82 | 8.86 | 2.19 | 5.59 | 8.96 | 1213.79 | 1.76 | 2.96 | 198.1 | 16.0 | 7.0 | 2012.0 | 27.0 | 3.0 |
"75%" | null | 994.72 | 15.47 | 289.53 | 10.07 | 89.4 | 17.6 | 12.35 | 5.3 | 7.8 | 12.49 | 1242.77 | 2.86 | 4.74 | 234.1 | 23.0 | 10.0 | 2014.0 | 40.0 | 4.0 |
"max" | "2016-12-31" | 1015.35 | 37.28 | 311.34 | 23.11 | 100.0 | 63.77 | 28.32 | 46.01 | 18.13 | 28.82 | 1393.54 | 28.49 | 23.5 | 360.0 | 31.0 | 12.0 | 2016.0 | 53.0 | 4.0 |
There is an anomaly in the variable `wv`: its minimum value is -9999, when it should be positive since it is measured in m/s. We are going to clean this anomaly (error) by substituting that value with the mean of the variable.
Naturally, this anomaly has also propagated to `max_wv`, so we will clean that variable as well.
climate_df = climate_df.with_columns(
pl.when(pl.col('wv') == pl.col('wv').min())
.then(pl.col('wv').mean()) # The replacement value for when the condition is True
.otherwise(pl.col('wv')) # Keeps original value when condition is False
.alias('wv') # Rename the resulting column back to 'variable'
)
climate_df = climate_df.with_columns(
pl.when(pl.col('max_wv') == pl.col('max_wv').min())
.then(pl.col('max_wv').mean()) # The replacement value for when the condition is True
.otherwise(pl.col('max_wv')) # Keeps original value when condition is False
.alias('max_wv') # Rename the resulting column back to 'variable'
)
We check whether the last transformation has removed the anomaly completely.
climate_df.min()
date | p | T | Tpot | Tdew | rh | VPmax | VPact | VPdef | sh | H2OC | rho | wv | max_wv | wd | day | month | year | week | quarter |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
date | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | i8 | i8 | i32 | i8 | i8 |
2009-01-01 | 913.6 | -23.01 | 250.6 | -25.01 | 12.95 | 0.95 | 0.79 | 0.0 | 0.5 | 0.8 | 1059.45 | 0.0 | 0.0 | 0.0 | 1 | 1 | 2009 | 1 | 1 |
We can also see that there are no missing values in the data, which is especially important.
Yearly temperature time series#
In this section we are going to compute the yearly time series of the temperature (`T`), grouping the data by year and aggregating with the mean.
temp_year_df = climate_df.group_by(['year']).agg(pl.col('T').mean()).sort(['year']).with_columns(pl.col("year").cast(str))
temp_year_df
year | T |
---|---|
str | f64 |
"2009" | 8.830284 |
"2010" | 7.504652 |
"2011" | 9.303914 |
"2012" | 9.655535 |
"2013" | 9.093978 |
"2014" | 10.718917 |
"2015" | 10.511207 |
"2016" | 9.988671 |
Plotting#
We plot this time series, first using a line plot and then with a box plot for each year, overlaying the mean of each one.
max_xtick = len(temp_year_df)
fig, ax = plt.subplots(figsize=(9,5))
p=sns.lineplot(x="year", y="T", data=temp_year_df , color='blue')
ax.set_ylabel('Avg.Temperature', size=12)
ax.set_xlabel('Year', size=12)
plt.xticks(fontsize=11, rotation=0)
plt.yticks(fontsize=11)
#ax.set_xticks(np.arange(0,max_xtick, 1))
plt.title("Yearly Avg. Temperature", fontsize=14, weight='bold')
plt.tight_layout()
plt.show()

years = climate_df['year'].unique().to_numpy().flatten()
fig, axes = plt.subplots(figsize=(9,5.5))
ax = sns.boxplot(x="year", y="T", data=climate_df, showfliers=True, color='orange')
ax = sns.lineplot(x=range(0,len(years)), y=temp_year_df['T'], color='blue', label='mean')
ax.set_ylabel('Avg.Temperature', size=12)
ax.set_xlabel('Year', size=12)
plt.xticks(fontsize=11, rotation=0)
plt.yticks(fontsize=11)
plt.title(f"Yearly Avg.Temperature", fontsize=14, weight='bold')
plt.tight_layout()
plt.show()

Monthly temperature time series#
Now we compute the monthly temperature time series, grouping the temperature by year-month and aggregating by the mean.
temp_month_df = climate_df.group_by(['year', 'month']).agg(pl.col('T').mean()).sort(['year', 'month'])
temp_month_df = temp_month_df.with_columns((pl.col("month").cast(str) + '-' + pl.col("year").cast(str)).alias("month-year"))
temp_month_df.head()
year | month | T | month-year |
---|---|---|---|
i32 | i8 | f64 | str |
2009 | 1 | -3.626617 | "1-2009" |
2009 | 2 | 0.16995 | "2-2009" |
2009 | 3 | 3.989944 | "3-2009" |
2009 | 4 | 11.889757 | "4-2009" |
2009 | 5 | 13.433905 | "5-2009" |
Plotting#
We plot this time series, first using a line plot and then with a box plot for each month, overlaying the mean of each one.
max_xtick = len(temp_month_df)
fig, ax = plt.subplots(figsize=(10,5))
p=sns.lineplot(x="month-year", y="T", data=temp_month_df , color='blue')
ax.set_ylabel('Avg.Temperature', size=12)
ax.set_xlabel('Month-Year', size=12)
plt.xticks(fontsize=10.5, rotation=0)
plt.yticks(fontsize=11)
ax.set_xticks(np.linspace(0, max_xtick, 12))
plt.title("Monthly Avg.Temperature", fontsize=15, weight='bold')
plt.tight_layout()
plt.show()

years = climate_df['year'].unique().to_numpy().flatten()
n_columns = 2
n_rows = int(np.ceil(len(years) / n_columns))
min_temp = climate_df['T'].min()
max_temp = climate_df['T'].max()
fig, axes = plt.subplots(n_rows, n_columns, figsize=(15, 13))
axes = axes.flatten()
for i, year in enumerate(years):
ax = sns.boxplot(x="month", y="T", data=climate_df.filter(pl.col('year') == year),
showfliers=True, color='orange', ax=axes[i])
ax = sns.lineplot(x=range(0,12), y=temp_month_df.filter(pl.col('year') == year)['T'],
color='blue', ax=axes[i], label=f'Mean {year}')
axes[i].legend(fontsize=8.5)
axes[i].set_title(f'Year {year}', fontsize=12)
axes[i].set_xlabel('Month', fontsize=11)
axes[i].set_ylabel('Avg.Temperature', fontsize=11)
axes[i].set_yticks(np.linspace(min_temp, max_temp, 7))
plt.suptitle('Monthly Avg.Temperature by Year',
fontsize=15, y=0.94, weight='bold', color='black', alpha=1)
# Remove any unused subplots in case the number of years is less than n_rows * n_columns
for j in range(len(years), n_rows * n_columns):
fig.delaxes(axes[j])
plt.subplots_adjust(hspace=0.5, wspace=0.2)
plt.show()

Daily time series#
In this section we are going to compute the daily time series for the temperature (`T`), humidity (`rh`) and wind speed (`wv`), which are the time series we are interested in forecasting.
df = {}
variables_to_forecast = ['T', 'rh', 'wv']
for col in variables_to_forecast:
df[col] = climate_df.group_by(['year', 'month', 'day']).agg(pl.col(col).mean()).sort(['year', 'month', 'day'])
df[col] = df[col].with_columns((pl.col("day").cast(str) + '-' + pl.col("month").cast(str) + '-' + pl.col("year").cast(str)).alias("date"))
df[col] = df[col].with_columns(pl.col("date").str.to_date("%d-%m-%Y").name.keep())
for col in variables_to_forecast:
display(df[col].head())
year | month | day | T | date |
---|---|---|---|---|
i32 | i8 | i8 | f64 | date |
2009 | 1 | 1 | -6.810629 | 2009-01-01 |
2009 | 1 | 2 | -3.728194 | 2009-01-02 |
2009 | 1 | 3 | -5.271736 | 2009-01-03 |
2009 | 1 | 4 | -1.375208 | 2009-01-04 |
2009 | 1 | 5 | -4.867153 | 2009-01-05 |
year | month | day | rh | date |
---|---|---|---|---|
i32 | i8 | i8 | f64 | date |
2009 | 1 | 1 | 91.086014 | 2009-01-01 |
2009 | 1 | 2 | 92.086806 | 2009-01-02 |
2009 | 1 | 3 | 76.458056 | 2009-01-03 |
2009 | 1 | 4 | 89.417361 | 2009-01-04 |
2009 | 1 | 5 | 86.260417 | 2009-01-05 |
year | month | day | wv | date |
---|---|---|---|---|
i32 | i8 | i8 | f64 | date |
2009 | 1 | 1 | 0.778601 | 2009-01-01 |
2009 | 1 | 2 | 1.419514 | 2009-01-02 |
2009 | 1 | 3 | 1.250903 | 2009-01-03 |
2009 | 1 | 4 | 1.720417 | 2009-01-04 |
2009 | 1 | 5 | 3.800278 | 2009-01-05 |
Plotting#
We plot the three time series using a lineplot.
colors = ['blue', 'green', 'orange']
for i, col in enumerate(variables_to_forecast):
max_xtick = len(df[col])
fig, ax = plt.subplots(figsize=(10,5))
p=sns.lineplot(x="date", y=col, data=df[col] , color=colors[i])
ax.set_ylabel(f'Avg. {col}', size=12)
ax.set_xlabel('Date', size=12)
plt.xticks(fontsize=11, rotation=0)
plt.yticks(fontsize=11)
#ax.set_xticks(np.linspace(0, max_xtick, 12))
plt.title(f"Daily Avg. {col}", fontsize=14, weight='bold')
plt.tight_layout()
plt.show()



We can plot boxplots of the daily series within a given month of a given year; in this case we will plot January, July and October of 2016 (the last year available in the data).
First, without the same y-axis scale for each row:
year = 2016
colors = ['turquoise', 'limegreen', 'orange']
for i, col in enumerate(variables_to_forecast):
fig, axes = plt.subplots(1,3, figsize=(20,5))
axes = axes.flatten()
for r, month in enumerate([1, 7, 10]):
n_months = len(df[col].filter(pl.col('year') == year, pl.col('month') == month))
ax = sns.boxplot(x="day", y=col, data=climate_df.filter(pl.col('year') == year, pl.col('month') == month), showfliers=True, color=colors[i], ax=axes[r])
ax = sns.lineplot(x=range(0,n_months), y=df[col].filter(pl.col('year') == year, pl.col('month') == month)[col], color='red', label='Mean', ax=axes[r])
axes[r].set_xlabel('Day', size=12)
axes[r].set_ylabel('', size=13)
axes[r].set_title(f'Month - {month}', fontsize=12)
axes[r].tick_params(axis='x', rotation=0, labelsize=10)
axes[0].set_ylabel(f'Avg. {col}', size=13)
plt.suptitle(f"Daily Avg. {col} - Year {year}", fontsize = 14, weight='bold')
plt.tight_layout()
plt.show()



Now, using the same y-axis scale for each row:
yticks = [np.arange(-15,37, 5), np.arange(45,115, 10), np.arange(0,15, 3)]
for i, col in enumerate(variables_to_forecast):
fig, axes = plt.subplots(1,3, figsize=(20,5))
axes = axes.flatten()
for r, month in enumerate([1, 7, 10]):
n_months = len(df[col].filter(pl.col('year') == year, pl.col('month') == month))
ax = sns.boxplot(x="day", y=col, data=climate_df.filter(pl.col('year') == year, pl.col('month') == month), showfliers=True, color=colors[i], ax=axes[r])
ax = sns.lineplot(x=range(0,n_months), y=df[col].filter(pl.col('year') == year, pl.col('month') == month)[col], color='red', label='Mean', ax=axes[r])
axes[r].set_xlabel('Day', size=12)
axes[r].set_ylabel('', size=13)
axes[r].set_title(f'Month {month}', fontsize=12)
axes[r].tick_params(axis='x', rotation=0, labelsize=10)
axes[r].set_yticks(yticks[i])
axes[0].set_ylabel(f'Avg. {col}', size=13)
plt.suptitle(f"Daily Avg. {col} - Year {year}", fontsize = 15, weight='bold')
plt.tight_layout()
plt.show()



Forecasting Daily Temperature, Humidity and Wind Speed#
In this section we are going to look for the best models for forecasting the daily temperature, humidity and wind speed, following a statistical approach. This means we will not use more Machine Learning-oriented tools such as fully automatic grid search algorithms or other ML algorithms (that will be done in the second project). We will focus especially on SARIMA models, characterizing them with statistical techniques such as time series decomposition, correlation plots and the Dickey-Fuller test.
It is important to note that our main goal is to find three models for forecasting the temperature, humidity and wind speed 15 days in advance.
We say three models because we are going to look for the best model for each variable, although the same model could turn out to be the best for several of them.
In any case, the main aim of this project is to look for a good model without, as we commented before, using exhaustive ML search algorithms, since that will be done in the next project; this project is more manual and statistically oriented.
Response and Predictors (lags)#
In time series forecasting we can use the lags of the response variable to predict it: for a given date, we can use the past values of the response relative to that date to predict its value on that date. These past values are called lags, and the number of lags considered is a very important hyper-parameter in time series.
So, in time series the lags of the response play the role of predictors. They need not be the only predictors, since other external variables can also be used in the forecasting process; this is what we call a multivariate time series, and it will be addressed in the second project.
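To make the idea concrete, lag construction can be sketched with a minimal helper (a hypothetical function for illustration, not the actual `MakeLags` class from PyTS.py):

```python
import numpy as np

def make_lag_matrix(y, n_lags):
    """Build (X, Y) where row t of X holds the n_lags values preceding Y[t].

    Rows that would need values from before the start of the series are
    dropped, mirroring how leading-NaN rows are discarded before training.
    """
    y = np.asarray(y, dtype=float)
    # Column j (j = 1..n_lags) is the series shifted back j steps,
    # so the most recent lag comes first, as in the notebook's display
    X = np.column_stack([y[n_lags - j: len(y) - j] for j in range(1, n_lags + 1)])
    Y = y[n_lags:]
    return X, Y

y = np.array([10., 11., 12., 13., 14., 15.])
X, Y = make_lag_matrix(y, n_lags=3)
# First target is y[3] = 13, predicted from its three previous values [12, 11, 10]
print(X[0], Y[0])
```

The shape of the result matches the lagged arrays shown later for the temperature series: each row of `X` contains the response on the previous `n_lags` days, with lag 1 (the most recent day) in the first column.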
The next code defines the response for each problem (temperature, humidity and wind speed forecasting). Then, a fake predictors matrix is created for our `statsmodels`-based implementations (they require one, but this is only a technical detail of how they are wrapped from a programming perspective). After that, the code defines the predictors and response for our `sklearn`-based implementations, which must incorporate the lags. In this case we build a grid of lags because, as we said before, the number of lags is an important hyper-parameter.
It is important to note that the `statsmodels`-based implementations only need the original response, without lags, and do not require a predictors matrix beyond the fake one, which is in fact a zero matrix that is not used by the model at all.
Y, X_st = {}, {}
X_sk, Y_sk, make_lags = {col: {} for col in variables_to_forecast}, {col: {} for col in variables_to_forecast}, {col: {} for col in variables_to_forecast}
lags_grid = [1, 2, 3, 7, 10, 20, 30, 40]
for col in variables_to_forecast:
Y[col] = df[col][col].to_numpy()
# Fake X for statsmodels based implementations
X_st[col] = np.zeros((len(Y[col]), 4))
# Lagged X and Y for sklearn based implementations
for lag in lags_grid:
make_lags[col][lag] = MakeLags(n_lags=lag, ascending=True)
make_lags[col][lag].fit()
X_sk[col][lag], Y_sk[col][lag] = make_lags[col][lag].transform(y=Y[col])
For a better understanding, we display the 3-lagged response and predictors for the temperature, gathered in a single data frame:
make_lags['T'][3].y_lags_df.head(7)
Y | Y_lag_1 | Y_lag_2 | Y_lag_3 |
---|---|---|---|
f64 | f64 | f64 | f64 |
-6.810629 | NaN | NaN | NaN |
-3.728194 | -6.810629 | NaN | NaN |
-5.271736 | -3.728194 | -6.810629 | NaN |
-1.375208 | -5.271736 | -3.728194 | -6.810629 |
-4.867153 | -1.375208 | -5.271736 | -3.728194 |
-15.482847 | -4.867153 | -1.375208 | -5.271736 |
-15.734375 | -15.482847 | -4.867153 | -1.375208 |
The rows with `NaN` will not be used for training the models. The fourth row is the first without `NaN`. The first column contains the original values of the response, and the other columns contain the lagged response at different orders: the second column contains the response lagged 1 period (day), the third lagged 2 days and the fourth lagged 3 days.
So, given a row without `NaN`, the value in the first column is the actual value of the response (the temperature in this case) on a given day, and the values in the remaining columns are the temperature on the previous three days (since there are 3 lags).
The idea is to use models that learn the relationship between the temperature on a given day and the temperature on the previous three days, across all the available days in a given training set.
Y_sk['T'][3]
array([ -1.37520833, -4.86715278, -15.48284722, ..., 2.67625 ,
-1.70659722, -2.4925 ])
X_sk['T'][3]
array([[-5.27173611, -3.72819444, -6.81062937],
[-1.37520833, -5.27173611, -3.72819444],
[-4.86715278, -1.37520833, -5.27173611],
...,
[ 4.88715278, 5.245625 , 7.52743056],
[ 2.67625 , 4.88715278, 5.245625 ],
[-1.70659722, 2.67625 , 4.88715278]])
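As a sketch of how a model consumes arrays like these, a plain `sklearn` linear regression (not the project's `LinearRegressionTS` wrapper) can be fit on a lagged design built from a short synthetic autoregressive series; the data and lag count here are illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Synthetic AR(1)-like series: y_t = 0.8 * y_{t-1} + noise
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.8 * y[t - 1] + rng.normal(scale=0.1)

# Lagged design, most recent lag first (same layout as X_sk above)
n_lags = 3
X = np.column_stack([y[n_lags - j: len(y) - j] for j in range(1, n_lags + 1)])
Y = y[n_lags:]

model = LinearRegression().fit(X, Y)
# The coefficient on lag 1 should land near the true autoregressive weight 0.8
print(model.coef_)
```

This is exactly the role the lag grid plays later: each value of `lags_grid` produces a different design matrix, and the cross-validation step picks the one that forecasts best.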
Characterizing SARIMA models#
Time series decomposition#
decomposition = {}
for col in variables_to_forecast:
decomposition[col] = STL(Y[col], period=7).fit()
for col in variables_to_forecast:
fig, axs = plt.subplots(nrows=4, ncols=1, sharex=True, figsize=(15,8))
p1=sns.lineplot(decomposition[col].observed, color='red', ax=axs[0], label='observed')
p2=sns.lineplot(decomposition[col].trend, color='blue', ax=axs[1], label='trend')
p3=sns.lineplot(decomposition[col].seasonal, color='green', ax=axs[2], label='seasonality')
p4=sns.lineplot(decomposition[col].resid, color='purple', ax=axs[3], label='residuals')
p1.set_ylabel('Observed')
p2.set_ylabel('Trend')
p3.set_ylabel('Seasonality')
p4.set_ylabel('Residuals')
plt.suptitle(f'STL Decomposition - Daily Avg. {col}',
fontsize=17, y=0.99, weight='bold', color='black')
plt.tight_layout()
# Remove individual legends created by seaborn lineplot
for ax in axs:
ax.get_legend().remove()
# Create a common legend for the figure
handles, labels = [], []
for ax in axs:
for handle, label in zip(*ax.get_legend_handles_labels()):
handles.append(handle)
labels.append(label)
fig.legend(handles, labels, loc='upper right', bbox_to_anchor=(1.1, 0.6), fontsize=12)
plt.show()



According to these plots and the ones shown before, we can extract some information about the trend and seasonality components of the three series under discussion.

Temperature (`T`):
- Trend component: the daily temperature seems to have no trend, since the values oscillate around the same level along the series; in other words, it has an approximately constant mean over time. This tells us that a difference in the regular part should not be necessary but, regardless of the plots, we will carry out a statistical test to check this point later: the (augmented) Dickey-Fuller test.
- Seasonal component: the series has a clear seasonality, which is plainly shown in the plots; moreover, we know that temperature is directly related to the seasons, so in this case there is no doubt. It therefore seems that the series needs a difference in the seasonal part to make it stationary in that component. The seasonality period seems to be 365 days, since the seasonal pattern appears to repeat every 365 days.

Humidity (`rh`):
- Trend component: the daily relative humidity seems to have no trend, since the values oscillate around the same level along the series, i.e., it has an approximately constant mean over time. So a difference in the regular part should not be necessary.
- Seasonal component: the series has a clear seasonality (although not as clear as in the temperature case), as shown in the plots. It seems the series needs a difference in the seasonal part to make it stationary in that component, with a seasonality period of 365 days, since the seasonal pattern appears to repeat every 365 days.

Wind speed (`wv`):
- Trend component: the daily wind speed seems to have no trend, since the values oscillate around the same level along the series, i.e., it has an approximately constant mean over time. So a difference in the regular part should not be necessary.
- Seasonal component: the series has no clear seasonality, so it does not seem to need a difference in the seasonal part. Moreover, the seasonal part of the SARIMA model would not be necessary at all.
So, according to the previous analysis, the suggested difference parameters for a SARIMA model would be:

Temperature (`T`):
- \(d = 0\quad\) (regular difference)
- \(D = 1, 2\quad\) (seasonal difference)
- \(s = 365\)

Humidity (`rh`):
- \(d = 0\)
- \(D = 1, 2\)
- \(s = 365\)

Wind speed (`wv`):
- \(d = 0\)
- \(D = 0\)
- \(s = 0\)
- \(P, Q = 0\)
As we will see later, \(s=365\) involves a lot of computational time with our data and the current `statsmodels` implementation of the SARIMA model. Because of that, we will have to use much smaller values of \(s\), just to train the models in a reasonable time, even though we know those values of \(s\) are not the most suitable from a theoretical perspective; they are simply much more efficient computationally.
Auto-correlation Plots#
Regular part \((p,q)\)#
colors = ['blue', 'green', 'orange']
for i, col in enumerate(variables_to_forecast):
fig, ax = plt.subplots(2, 1, figsize=(8, 5))
plot_acf(Y[col], lags=60, ax=ax[0], color=colors[i])
plot_pacf(Y[col], lags=60, ax=ax[1], color=colors[i])
ax[0].set_ylim(-0.5, 1.2)
ax[1].set_ylim(-0.5, 1.2)
plt.suptitle(f'Daily Avg. {col} - Correlation Plots',
fontsize=12, y=0.99, weight='bold', color='black')
plt.tight_layout()
plt.show()



According to these correlation plots we can guess the parameters \(p\) and \(q\) of the SARIMA model.

Temperature (`T`):
- ACF: the ACF shows a clear (possibly geometric) decaying structure, which is typical of an AR (auto-regressive) process.
- PACF: the PACF shows that the first two partial auto-correlations are significantly non-zero while the rest are not.
- Conclusion: this series fits an AR process of order two \(\Rightarrow p=2\), \(q=0\).

Humidity (`rh`):
- ACF: the ACF shows a clear (geometric) decaying structure, typical of an AR process.
- PACF: the PACF shows that the first two partial auto-correlations are significantly non-zero while the rest are not, but the second value is less significant than in the temperature series.
- Conclusion: this series fits an AR process of order two, but it may also be of order one \(\Rightarrow p=1,2\), \(q=0\).

Wind speed (`wv`):
- ACF: the ACF shows a very fast (geometric) decay, which is common in AR processes.
- PACF: the PACF shows that the first partial auto-correlation is significantly non-zero while the rest are not (although the second one could be significant as well).
- Conclusion: this series fits an AR process of order two, but it may also be of order one \(\Rightarrow p=1,2\), \(q=0\).
Seasonal part \((P,Q)\)#
We are going to apply a seasonal difference to the variables that show a seasonal component, namely temperature and humidity, as discussed before.
After that, we will draw the auto-correlation plots for both, in order to identify or guess values for the parameters \(P\) and \(Q\): the orders of the AR and MA processes in the seasonal part of the SARIMA model.
Y_seasonal_diff = {}
variables_with_seasonality = ['T', 'rh']
for col in variables_with_seasonality:
Y_seasonal_diff[col] = pd.DataFrame(Y[col]).diff(periods=365).dropna().to_numpy()
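The pandas round-trip above is equivalent to a direct NumPy slice difference, which can serve as a sanity check (a sketch with a stand-in series and a small period; the project uses `Y[col]` and 365):

```python
import numpy as np
import pandas as pd

y = np.arange(20, dtype=float)  # stand-in series
s = 5                           # small period for illustration

# Seasonal difference via pandas, as in the notebook
via_pandas = pd.DataFrame(y).diff(periods=s).dropna().to_numpy().ravel()
# The same operation with plain array slicing: y_t - y_{t-s}
via_numpy = y[s:] - y[:-s]

assert np.allclose(via_pandas, via_numpy)
```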
for i, col in enumerate(variables_with_seasonality):
fig, ax = plt.subplots(2, 1, figsize=(8, 5))
plot_acf(Y_seasonal_diff[col], lags=40, ax=ax[0], color=colors[i])
plot_pacf(Y_seasonal_diff[col], lags=40, ax=ax[1], color=colors[i])
ax[0].set_ylim(-0.5, 1.2)
ax[1].set_ylim(-0.5, 1.2)
plt.suptitle(f'Daily Avg. {col} - Correlation Plots',
fontsize=12, y=0.99, weight='bold', color='black')
plt.tight_layout()
plt.show()


According to these correlation plots we can guess the parameters \(P\) and \(Q\) of the SARIMA model.

Temperature (`T`):
- ACF: the ACF shows a clear (possibly geometric) decaying structure, typical of an AR process.
- PACF: the PACF shows that the first two partial auto-correlations are significantly non-zero while the rest are not, although it could also be interpreted as only the first one being clearly significant.
- Conclusion: this series fits an AR process of order one or two in the seasonal part \(\Rightarrow P=1,2\), \(Q=0\).

Humidity (`rh`):
- ACF: the ACF shows a clear (geometric) decaying structure, typical of an AR process.
- PACF: the PACF shows that the first partial auto-correlation is significantly non-zero while the rest are not, although the second one could be considered significant as well.
- Conclusion: this series fits an AR process of order one or two \(\Rightarrow P=1,2\), \(Q=0\).

Wind speed (`wv`):
- As we saw before, the daily wind speed has no clear seasonality, so \(P=Q=0\).
Augmented Dickey-Fuller Test#
In this section we are going to carry out the augmented Dickey-Fuller test for the three original series. This test has as null hypothesis \((H_0)\) that the series has a unit root, i.e., that it is non-stationary in the regular part, and as alternative hypothesis that it is stationary. So, rejecting \(H_0\) means that our series is likely stationary in mean (it shows no significant trend) and therefore does not need a regular difference, while failing to reject \(H_0\) means the series is likely non-stationary and needs at least one regular difference.
print('------------------------------------------------------------------------------------------------------------------------------------------------')
for col in variables_to_forecast:
result = adfuller(Y[col])
test_statistic = np.round(result[0],3)
p_value = np.round(result[1],3)
print(f'Test Statistic the time series {col}: {test_statistic}')
print(f'p-value for the time series {col}: {p_value}')
alpha = 0.05
if p_value < alpha:
print(f'Reject the H0 --> The time series {col} seems stationary (non significative trend) --> The series doesn\'t need a regular difference')
else:
print(f'Not reject H0 --> The time series {col} seems non-stationary (significative trend) --> The series needs a regular difference')
print('------------------------------------------------------------------------------------------------------------------------------------------------')
------------------------------------------------------------------------------------------------------------------------------------------------
Test statistic for the time series T: -3.614
p-value for the time series T: 0.005
Reject the H0 --> The time series T seems stationary (non significative trend) --> The series doesn't need a regular difference
------------------------------------------------------------------------------------------------------------------------------------------------
Test statistic for the time series rh: -5.619
p-value for the time series rh: 0.0
Reject the H0 --> The time series rh seems stationary (non significative trend) --> The series doesn't need a regular difference
------------------------------------------------------------------------------------------------------------------------------------------------
Test statistic for the time series wv: -14.456
p-value for the time series wv: 0.0
Reject the H0 --> The time series wv seems stationary (non significative trend) --> The series doesn't need a regular difference
------------------------------------------------------------------------------------------------------------------------------------------------
As we deduced before, none of the three series shows a significant trend, so a regular difference is not needed for any of them \((d=0)\).
Final specification#
So, summarizing, these are the suggested SARIMA models for each time series, according to the previous analysis:
Temperature (T): \(\quad(p=2, d=0, q=0)\times (P=1,2, D=1,2, Q=0)_{s=365}\)
Humidity (rh): \(\quad (p=1,2, d=0, q=0)\times (P=1,2, D=1,2, Q=0)_{s=365}\)
Wind speed (wv): \(\quad (p=1,2, d=0, q=0)\times (P=0, D=0, Q=0)_{s=0}\)
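As a reminder of the notation used in these specifications, a SARIMA \((p,d,q)\times(P,D,Q)_s\) model can be written in the standard backshift-operator form:

\[
\Phi_P(B^s)\,\phi_p(B)\,(1-B)^d\,(1-B^s)^D\, y_t = \Theta_Q(B^s)\,\theta_q(B)\,\varepsilon_t
\]

where \(B\) is the backshift operator (\(B y_t = y_{t-1}\)), \(\phi_p\) and \(\theta_q\) are the regular AR and MA polynomials, and \(\Phi_P\) and \(\Theta_Q\) are their seasonal counterparts.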
Specifying the models#
In this section we define the models that we are going to use. On the one hand, those based on statsmodels
implementations, essentially SARIMA and exponential smoothing; on the other hand, those based on sklearn
implementations, basically linear regression and KNN.
In this project we have decided not to go further with the sklearn
models because we want to take much more advantage of them in the next project, which is more closely related to the Machine Learning approach that will inspire that second project.
statsmodels based implementations: Here we specify several SARIMA models based on the analysis in the previous section.
sarima = {col : {} for col in variables_to_forecast}
exp_smooth = {}
As we said before, we are not going to set \(s=365\) because it is computationally too expensive, although in theory it would be the best option. We are therefore forced to use lower values so that the models run in a reasonable time.
SARIMA specification for Temperature:
sarima['T'][1] = SARIMA(p=2, d=0, q=0, P=1, D=1, Q=0, s=7)
sarima['T'][2] = SARIMA(p=2, d=0, q=0, P=2, D=1, Q=0, s=7)
sarima['T'][3] = SARIMA(p=2, d=0, q=0, P=2, D=2, Q=0, s=7)
sarima['T'][4] = SARIMA(p=2, d=0, q=0, P=1, D=2, Q=0, s=7)
sarima['T'][5] = SARIMA(p=2, d=0, q=0, P=1, D=1, Q=0, s=14)
sarima['T'][6] = SARIMA(p=2, d=0, q=0, P=1, D=1, Q=0, s=30)
SARIMA specification for Humidity:
sarima['rh'][1] = SARIMA(p=1, d=0, q=0, P=1, D=1, Q=0, s=7)
sarima['rh'][2] = SARIMA(p=2, d=0, q=0, P=1, D=1, Q=0, s=7)
sarima['rh'][3] = SARIMA(p=1, d=0, q=0, P=2, D=1, Q=0, s=7)
sarima['rh'][4] = SARIMA(p=1, d=0, q=0, P=2, D=2, Q=0, s=7)
sarima['rh'][5] = SARIMA(p=2, d=0, q=0, P=2, D=1, Q=0, s=7)
sarima['rh'][6] = SARIMA(p=2, d=0, q=0, P=2, D=2, Q=0, s=7)
sarima['rh'][7] = SARIMA(p=1, d=0, q=0, P=1, D=1, Q=0, s=14)
sarima['rh'][8] = SARIMA(p=2, d=0, q=0, P=1, D=1, Q=0, s=14)
SARIMA specification for Wind Speed:
sarima['wv'][1] = SARIMA(p=1, d=0, q=0, P=0, D=0, Q=0, s=0)
sarima['wv'][2] = SARIMA(p=2, d=0, q=0, P=0, D=0, Q=0, s=0)
sarima['wv'][3] = SARIMA(p=1, d=0, q=1, P=0, D=0, Q=0, s=0)
sarima['wv'][4] = SARIMA(p=1, d=0, q=2, P=0, D=0, Q=0, s=0)
sarima['wv'][5] = SARIMA(p=2, d=0, q=1, P=0, D=0, Q=0, s=0)
sarima['wv'][6] = SARIMA(p=2, d=0, q=2, P=0, D=0, Q=0, s=0)
Simple Exponential Smoothing specification (the same models for each variable):
exp_smooth[1] = SimpleExpSmooth(smoothing_level=0.05)
exp_smooth[2] = SimpleExpSmooth(smoothing_level=0.5)
exp_smooth[3] = SimpleExpSmooth(smoothing_level=0.8)
exp_smooth[4] = SimpleExpSmooth(smoothing_level=1.5)
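Simple exponential smoothing forecasts with an exponentially weighted average of past observations, where the smoothing level \(\alpha\) controls how quickly the weights decay. A minimal pure-Python sketch of the recursion (the PyTS `SimpleExpSmooth` wrapper is assumed to implement something equivalent on top of statsmodels):

```python
def ses_forecast(y, alpha, window):
    """Simple exponential smoothing: level_t = alpha*y_t + (1-alpha)*level_{t-1}.

    The forecast is flat: the next `window` steps all equal the last level.
    """
    level = y[0]  # initialize the level with the first observation
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * window

# With alpha=1.0 the forecast collapses to the naive "last value" forecast
print(ses_forecast([10.0, 12.0, 11.0], alpha=1.0, window=2))  # [11.0, 11.0]
```

Note that a higher `smoothing_level` gives more weight to recent observations, which is why several values are tried above.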
We define a dictionary with the models specified for each variable.
st_models = {}
for col in variables_to_forecast:
st_models_values = [sarima[col][i] for i in sarima[col].keys()] + [exp_smooth[i] for i in exp_smooth.keys()]
st_model_keys = [str(sarima[col][i]) for i in sarima[col].keys()] + [str(exp_smooth[i]) for i in exp_smooth.keys()]
st_models[col] = dict(zip(st_model_keys, st_models_values))
sklearn based implementations: We define the sklearn-based models that we will try in this project (in the next one, many more sklearn implementations will be tried).
linear_regression = LinearRegressionTS()
knn = KNeighborsRegressorTS()
As before, we define a dictionary with the sklearn models.
sk_models_values = [linear_regression, knn]
sk_models_keys = ['Linear Regression', 'knn']
sk_models = dict(zip(sk_models_keys, sk_models_values))
Time Series Train-Test split#
We are going to use a train-test split for the outer evaluation, that is, for estimating future performance.
First we fix both the test and forecast windows, that is, the number of days that define the testing and forecasting periods. We have decided to make them equal.
test_window = 15
forecast_window = 15
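For an ordered series, the split must preserve chronology: the last `test_window` observations form the test set and everything before them is training data. A minimal sketch of what a function like `train_test_split_time_series` presumably does (the actual PyTS implementation may differ in details):

```python
def split_time_series(X, y, test_window):
    """Hold out the last `test_window` points, preserving temporal order."""
    X_train, X_test = X[:-test_window], X[-test_window:]
    y_train, y_test = y[:-test_window], y[-test_window:]
    return X_train, X_test, y_train, y_test

X = list(range(100))
y = [2 * v for v in X]
X_tr, X_te, y_tr, y_te = split_time_series(X, y, test_window=15)
print(len(X_tr), len(X_te))  # 85 15
```

Unlike a random split, no shuffling is done: shuffling would leak future information into the training set.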
For the statsmodels implementations: We define the train-test split, one for each variable.
X_train_st, X_test_st, Y_train_st, Y_test_st = {}, {}, {}, {}
for col in variables_to_forecast:
X_train_st[col], X_test_st[col], Y_train_st[col], Y_test_st[col] = train_test_split_time_series(X=X_st[col], y=Y[col], test_window=test_window)
For the sklearn implementations: We define the train-test split, one for each variable and each lag considered.
X_train_sk, X_test_sk, Y_train_sk, Y_test_sk = {col: {} for col in variables_to_forecast}, {col: {} for col in variables_to_forecast}, {col: {} for col in variables_to_forecast}, {col: {} for col in variables_to_forecast}
for col in variables_to_forecast:
for lag in lags_grid:
X_train_sk[col][lag], X_test_sk[col][lag], Y_train_sk[col][lag], Y_test_sk[col][lag] = train_test_split_time_series(X=X_sk[col][lag], y=Y_sk[col][lag], test_window=test_window)
Temperature (T)#
Simple Validation#
In this section we use simple validation as the inner evaluation method.
Y_hat, Y_test_hat, Y_future_hat, scores = {}, {}, {}, {}
For statsmodels based implementations: We apply simple validation to the statsmodels models to compute their inner score (error) on the validation set, and we also use them to forecast the future (the next days, starting from the last date in our available data), since we want to show this information in a plot as well.
# Train-train - Train-validate split
X_train2, X_val, Y_train2, Y_val = train_test_split_time_series(X_train_st['T'], Y_train_st['T'], test_window=test_window)
for name, model in zip(st_models['T'].keys(), st_models['T'].values()):
print(name)
# Forecasting the past
model.fit(y=Y_train2)
Y_test_hat[name] = model.forecast(window=test_window)
scores[name] = mean_absolute_error(y_pred=Y_test_hat[name], y_true=Y_val)
# Forecasting the future
model.fit(y=Y['T'])
Y_future_hat[name] = model.forecast(window=forecast_window)
Y_hat[name] = np.concatenate((Y_test_hat[name], Y_future_hat[name]))
SARIMA(D=1, P=1, p=2, s=7)
SARIMA(D=1, P=2, p=2, s=7)
SARIMA(D=2, P=2, p=2, s=7)
SARIMA(D=2, P=1, p=2, s=7)
SARIMA(D=1, P=1, p=2, s=14)
SARIMA(D=1, P=1, p=2, s=30)
SimpleExpSmooth(smoothing_level=0.05)
SimpleExpSmooth(smoothing_level=0.5)
SimpleExpSmooth(smoothing_level=0.8)
SimpleExpSmooth(smoothing_level=1.5)
For sklearn based implementations: We apply simple validation to the sklearn models to compute their inner score (error) on the validation set, and we also use them to forecast the future (the next days, starting from the last date in our available data), since we want to show this information in a plot as well.
In this case we train each model with each of the considered lags, since the sklearn implementations take lags of the response as predictors, and the lags play an important role in the forecasting process. The number of lags is therefore an important hyperparameter, and it is worth exploring different values. In this project we will not explore it in the most rigorous way (we do not apply a standard grid search), but we try several values, enough to see that the lags really matter for the final forecasting results.
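The lag construction described above turns forecasting into a standard supervised problem: each target \(y_t\) is paired with its previous `lag` values as features. A rough sketch of what a helper like `MakeLags` might produce (hypothetical implementation, not the actual PyTS code):

```python
def make_lags(series, lag):
    """Build a lagged design matrix for one-step-ahead regression.

    Row i holds (y_i, ..., y_{i+lag-1}) as predictors for target y_{i+lag}.
    """
    X, y = [], []
    for i in range(len(series) - lag):
        X.append(series[i:i + lag])
        y.append(series[i + lag])
    return X, y

X, y = make_lags([1, 2, 3, 4, 5], lag=2)
print(X)  # [[1, 2], [2, 3], [3, 4]]
print(y)  # [3, 4, 5]
```

This also explains the slightly different fold sizes reported later: larger lags consume more leading observations, leaving fewer usable rows.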
for lag in lags_grid:
print(lag)
# Train-train - Train-validate split
X_train2, X_val, Y_train2, Y_val = train_test_split_time_series(X=X_train_sk['T'][lag], y=Y_train_sk['T'][lag], test_window=test_window)
for name, model in zip(sk_models.keys(), sk_models.values()):
print(name)
# Forecasting the past
model.fit(X=X_train2, y=Y_train2)
Y_test_hat[name + f' (lag={lag})'] = model.forecast(window=test_window)
scores[name + f' (lag={lag})'] = mean_absolute_error(y_pred=Y_test_hat[name + f' (lag={lag})'], y_true=Y_val)
# Forecasting the future
model.fit(X=X_sk['T'][lag], y=Y_sk['T'][lag])
Y_future_hat[name + f' (lag={lag})'] = model.forecast(window=forecast_window)
Y_hat[name + f' (lag={lag})'] = np.concatenate((Y_test_hat[name + f' (lag={lag})'], Y_future_hat[name + f' (lag={lag})']))
1
Linear Regression
knn
2
Linear Regression
knn
3
Linear Regression
knn
7
Linear Regression
knn
10
Linear Regression
knn
20
Linear Regression
knn
30
Linear Regression
knn
40
Linear Regression
knn
For the pmdarima implementation: Here we apply simple validation to the auto SARIMA implementation from the pmdarima library, which performs a sort of stepwise grid search over SARIMA models.
name = 'Auto SARIMA'
auto_sarima = autoSARIMA(seasonal=True, m=7, d=0, D=1, start_p=0, start_q=0, max_p=3, max_q=3,
suppress_warnings=True, stepwise=True, trace=True)
# Train-train - Train-validate split
X_train2, X_val, Y_train2, Y_val = train_test_split_time_series(X_train_st['T'], Y_train_st['T'], test_window=test_window)
# Forecasting the past
auto_sarima.fit(y=Y_train2)
Y_test_hat[name] = auto_sarima.forecast(window=test_window)
scores[name] = mean_absolute_error(y_pred=Y_test_hat[name], y_true=Y_val)
# Forecasting the future
auto_sarima.fit(y=Y['T'])
Y_future_hat[name] = auto_sarima.forecast(window=forecast_window)
Y_hat[name] = np.concatenate((Y_test_hat[name], Y_future_hat[name]))
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=17027.082, Time=0.83 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=17617.318, Time=0.05 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=14221.414, Time=0.66 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=14894.322, Time=1.00 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=17615.365, Time=0.04 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=14936.976, Time=0.12 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=13830.799, Time=1.75 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=7.60 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=2.83 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=17009.228, Time=1.10 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=13723.824, Time=2.40 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=14082.336, Time=0.93 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=9.43 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=4.00 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=13708.734, Time=3.05 sec
ARIMA(3,0,0)(1,1,0)[7] intercept : AIC=14076.852, Time=1.13 sec
ARIMA(3,0,0)(2,1,1)[7] intercept : AIC=inf, Time=10.88 sec
ARIMA(3,0,0)(1,1,1)[7] intercept : AIC=inf, Time=4.96 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=13710.524, Time=5.82 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=13710.062, Time=3.63 sec
ARIMA(3,0,0)(2,1,0)[7] : AIC=13706.776, Time=0.90 sec
ARIMA(3,0,0)(1,1,0)[7] : AIC=14074.882, Time=0.50 sec
ARIMA(3,0,0)(2,1,1)[7] : AIC=inf, Time=5.23 sec
ARIMA(3,0,0)(1,1,1)[7] : AIC=inf, Time=2.30 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=13721.873, Time=0.74 sec
ARIMA(3,0,1)(2,1,0)[7] : AIC=13708.565, Time=1.68 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=13708.107, Time=1.11 sec
Best model: ARIMA(3,0,0)(2,1,0)[7]
Total fit time: 74.689 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=17203.779, Time=0.72 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=17805.774, Time=0.05 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=14366.839, Time=0.63 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=15045.405, Time=1.00 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=17803.852, Time=0.03 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=15095.058, Time=0.12 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=13975.896, Time=1.86 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=6.81 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=3.40 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=17184.371, Time=1.32 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=13862.739, Time=2.47 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=14221.576, Time=0.92 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=9.20 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=3.90 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=13847.801, Time=3.03 sec
ARIMA(3,0,0)(1,1,0)[7] intercept : AIC=14216.183, Time=1.09 sec
ARIMA(3,0,0)(2,1,1)[7] intercept : AIC=inf, Time=10.96 sec
ARIMA(3,0,0)(1,1,1)[7] intercept : AIC=inf, Time=5.55 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=13849.551, Time=5.15 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=13849.231, Time=3.97 sec
ARIMA(3,0,0)(2,1,0)[7] : AIC=13845.827, Time=0.87 sec
ARIMA(3,0,0)(1,1,0)[7] : AIC=14214.208, Time=0.39 sec
ARIMA(3,0,0)(2,1,1)[7] : AIC=inf, Time=4.13 sec
ARIMA(3,0,0)(1,1,1)[7] : AIC=inf, Time=2.20 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=13860.769, Time=0.68 sec
ARIMA(3,0,1)(2,1,0)[7] : AIC=13847.576, Time=1.87 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=13847.258, Time=1.14 sec
Best model: ARIMA(3,0,0)(2,1,0)[7]
Total fit time: 73.484 seconds
We generate variables with the dates associated with the testing and forecasting periods, since we need them for plotting purposes.
dates = df['T']['date'].to_numpy()
# Test dates
test_dates = dates[(len(dates) - test_window):]
test_dates = list(test_dates)
test_dates = [test_dates[i].astype(datetime.datetime) for i in range(len(test_dates))]
# Forecast dates
forecasting_dates = []
last_date = df['T']['date'][len(dates) - 1]
for i in range(1, forecast_window + 1):
forecasting_dates.append(last_date + timedelta(days=i))
# An equivalent way to extract the prediction dates
'''
prediction_dates = []
given_date = temp_day_df['date'][len(temp_day_df) - test_window - 1]
for i in range(1, test_window + forecast_window + 1):
prediction_dates.append(given_date + timedelta(days=i))
'''
"\nprediction_dates = []\ngiven_date = temp_day_df['date'][len(temp_day_df) - test_window - 1]\nfor i in range(1, test_window + forecast_window + 1):\n prediction_dates.append(given_date + timedelta(days=i))\n"
Predictive visualization#
The following plot is a compact visualization of how each tried model predicts the testing set and forecasts the future, using the same test and forecast window (15 days).
It's important to note that our main aim is to find a good model for forecasting the temperature 15 days in advance.
# Plotting test and future forecast
predicted_values = Y_hat
prediction_dates = test_dates + forecasting_dates
predictive_time_series_plot(n_cols=3, figsize=(20,30),
data=df['T'].filter(pl.col('year')==2016, pl.col('month').is_in([10, 11, 12])),
x_name='date', y_name='T', true_color='blue', pred_color='red',
predicted_values=predicted_values, prediction_dates=prediction_dates, test_window=test_window, scores=scores, score_name='MAE',
title=f"Forecasting Daily Avg. Temperature (ºC) - Forecast window of {forecast_window} days",
title_size=13, title_weight='bold', subtitles_size=9,
marker='', markersize=5, ylabel='Avg.Temperature', xlabel='Date',
xticks_size=9, hspace=0.7, wspace=0.13, xticks_rotation=20, xlabel_size=11, ylabel_size=11,
title_height=0.9, shadow_alpha=0.15, shadow_color='green', legend_size=10, bbox_to_anchor=(0.5,0.065))

'''
# Only plotting future forecast
predicted_values = Y_future_hat
prediction_dates = forecasting_dates
test_window = None
predictive_time_series_plot(n_cols=3, figsize=(15,15),
data=temp_day_df.filter(pl.col('year')==2016, pl.col('month').is_in([10, 11, 12])),
x_name='date', y_name='T', true_color='blue', pred_color='red',
predicted_values=predicted_values, prediction_dates=prediction_dates, test_window=test_window, scores=score, score_name='MAE',
title=f"Forecasting Daily Avg.Temperature - Forecast window of {forecast_window} days",
title_size=13, title_weight='bold', subtitles_size=9,
marker='', markersize=5, ylabel='Avg.Temperature', xlabel='Date',
xticks_size=9, hspace=0.7, wspace=0.13, xticks_rotation=20, xlabel_size=11, ylabel_size=11,
title_height=0.93, shadow_alpha=0.15, shadow_color='green', legend_size=10)
'''
'''
# Only plotting test forecast
predicted_values = Y_test_hat
prediction_dates = test_dates
shadow_alpha = 0
predictive_time_series_plot(n_cols=3, figsize=(15,15),
data=temp_day_df.filter(pl.col('year')==2016, pl.col('month').is_in([10, 11, 12])),
x_name='date', y_name='T', true_color='blue', pred_color='red',
predicted_values=predicted_values, prediction_dates=prediction_dates, test_window=test_window, scores=score, score_name='MAE',
title=f"Forecasting Daily Avg.Temperature - Forecast window of {forecast_window} days",
title_size=13, title_weight='bold', subtitles_size=9,
marker='', markersize=5, ylabel='Avg.Temperature', xlabel='Date',
xticks_size=9, hspace=0.7, wspace=0.13, xticks_rotation=20, xlabel_size=11, ylabel_size=11,
title_height=0.93, shadow_alpha=shadow_alpha, shadow_color='green', legend_size=10)
'''
Selecting the best model#
Given the previous results, in this section we select the best model, that is, the one with the lowest inner error.
model_names = list(scores.keys())
inner_scores_values = np.array(list(scores.values()))
best_model_SV = model_names[np.argmin(inner_scores_values)]
plt.figure(figsize=(7, 10))
ax = sns.scatterplot(x=inner_scores_values, y=model_names, color='blue', s=95)
ax = sns.scatterplot(x=np.min(inner_scores_values),
y=[best_model_SV], color='red', s=95)
plt.title(f'Model Selection - Daily Temperature (ºC) \n\n Simple Validation - Test window {test_window} days ', size=14, weight='bold')
ax.set_ylabel('Models', size=13)
ax.set_xlabel('MAE', size=11)
min_score = np.min(inner_scores_values)
max_score = np.max(inner_scores_values)
plt.xticks(np.round(np.linspace(min_score, max_score, 5), 3), fontsize=10)
plt.yticks(fontsize=12)
plt.show()
print('The best model according to simple validation is', best_model_SV)

The best model according to simple validation is knn (lag=30)
According to simple validation with a test window of 15 days, the best model for forecasting the temperature in Jena is the KNN algorithm with 30 lags.
As we can see, the lags have a crucial influence on the forecasting results.
Cross Validation#
We perform the same procedure as before, but using K-Fold cross validation with \(K=10\).
This is a more accurate approach to estimating the (inner) error of a model, especially in time series, where the forecast error can depend heavily on the particular train-validate partition. Cross validation reduces this dependence, since the model is trained and evaluated on different partitions, making the estimate more robust and precise.
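In a time-series setting the folds must respect temporal order: based on the fold sizes printed below ("Fold's size: 290. Train size: 275. Test size: 15"), the series appears to be split into consecutive blocks, with the last `test_window` points of each block used for validation. A simplified sketch of how such fold boundaries could be generated (an assumption about what `KFold_score_time_series` does internally; the PyTS code may differ):

```python
def time_series_folds(n, n_splits, test_window):
    """Split indices 0..n-1 into consecutive blocks; within each block the
    last `test_window` indices are validation and the rest are training."""
    fold_size = n // n_splits
    folds = []
    for k in range(n_splits):
        start = k * fold_size
        cut = start + fold_size - test_window
        train_idx = list(range(start, cut))
        val_idx = list(range(cut, start + fold_size))
        folds.append((train_idx, val_idx))
    return folds

folds = time_series_folds(n=2900, n_splits=10, test_window=15)
train_idx, val_idx = folds[0]
print(len(train_idx), len(val_idx))  # 275 15
```

Either way, the key property is that validation points always come after the training points used to predict them, mirroring the real forecasting task.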
n_splits = 10
scores = {}
series = 'T'
Computing inner score by K-Fold CV for the statsmodels implementations
for name, model in zip(st_models[series].keys(), st_models[series].values()):
print(name)
scores[name] = KFold_score_time_series(estimator=model,
X=X_train_st[series], y=Y_train_st[series],
n_splits=n_splits, test_window=test_window,
scoring=mean_absolute_error)
SARIMA(D=1, P=1, p=2, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=1, P=2, p=2, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=2, P=2, p=2, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=2, P=1, p=2, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=1, P=1, p=2, s=14)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=1, P=1, p=2, s=30)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=0.05)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=0.5)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=0.8)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=1.5)
Fold's size: 290. Train size: 275. Test size: 15
Computing inner score by K-Fold CV for the sklearn implementations
for lag in lags_grid:
print(lag)
for name, model in zip(sk_models.keys(), sk_models.values()):
print(name)
scores[name + f' (lag={lag})'] = KFold_score_time_series(estimator=model,
X=X_train_sk[series][lag], y=Y_train_sk[series][lag],
n_splits=n_splits, test_window=test_window,
scoring=mean_absolute_error)
1
Linear Regression
Fold's size: 290. Train size: 275. Test size: 15
knn
Fold's size: 290. Train size: 275. Test size: 15
2
Linear Regression
Fold's size: 290. Train size: 275. Test size: 15
knn
Fold's size: 290. Train size: 275. Test size: 15
3
Linear Regression
Fold's size: 290. Train size: 275. Test size: 15
knn
Fold's size: 290. Train size: 275. Test size: 15
7
Linear Regression
Fold's size: 289. Train size: 274. Test size: 15
knn
Fold's size: 289. Train size: 274. Test size: 15
10
Linear Regression
Fold's size: 289. Train size: 274. Test size: 15
knn
Fold's size: 289. Train size: 274. Test size: 15
20
Linear Regression
Fold's size: 288. Train size: 273. Test size: 15
knn
Fold's size: 288. Train size: 273. Test size: 15
30
Linear Regression
Fold's size: 287. Train size: 272. Test size: 15
knn
Fold's size: 287. Train size: 272. Test size: 15
40
Linear Regression
Fold's size: 286. Train size: 271. Test size: 15
knn
Fold's size: 286. Train size: 271. Test size: 15
Computing inner score by K-Fold CV for the pmdarima implementation
name = 'Auto SARIMA'
auto_sarima = autoSARIMA(seasonal=True, m=7, d=0, D=1, start_p=0, start_q=0, max_p=3, max_q=3,
suppress_warnings=True, stepwise=True, trace=True)
scores[name] = KFold_score_time_series(estimator=auto_sarima,
X=X_train_st[series], y=Y_train_st[series],
n_splits=n_splits, test_window=test_window,
scoring=mean_absolute_error)
Fold's size: 290. Train size: 275. Test size: 15
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1482.445, Time=0.16 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1525.761, Time=0.01 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1294.899, Time=0.14 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1305.493, Time=0.14 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1527.658, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1339.845, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1273.720, Time=0.29 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=1235.321, Time=0.62 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=1234.282, Time=0.25 sec
ARIMA(1,0,0)(0,1,1)[7] intercept : AIC=1233.493, Time=0.16 sec
ARIMA(1,0,0)(0,1,2)[7] intercept : AIC=1234.110, Time=0.27 sec
ARIMA(1,0,0)(1,1,2)[7] intercept : AIC=inf, Time=0.64 sec
ARIMA(0,0,0)(0,1,1)[7] intercept : AIC=1481.386, Time=0.08 sec
ARIMA(2,0,0)(0,1,1)[7] intercept : AIC=1232.437, Time=0.22 sec
ARIMA(2,0,0)(0,1,0)[7] intercept : AIC=1327.651, Time=0.05 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=1232.780, Time=0.32 sec
ARIMA(2,0,0)(0,1,2)[7] intercept : AIC=1232.608, Time=0.39 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1284.345, Time=0.17 sec
ARIMA(2,0,0)(1,1,2)[7] intercept : AIC=inf, Time=0.72 sec
ARIMA(3,0,0)(0,1,1)[7] intercept : AIC=1221.996, Time=0.25 sec
ARIMA(3,0,0)(0,1,0)[7] intercept : AIC=1325.443, Time=0.08 sec
ARIMA(3,0,0)(1,1,1)[7] intercept : AIC=1222.574, Time=0.54 sec
ARIMA(3,0,0)(0,1,2)[7] intercept : AIC=1222.457, Time=0.50 sec
ARIMA(3,0,0)(1,1,0)[7] intercept : AIC=1279.174, Time=0.19 sec
ARIMA(3,0,0)(1,1,2)[7] intercept : AIC=inf, Time=0.74 sec
ARIMA(3,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.87 sec
ARIMA(2,0,1)(0,1,1)[7] intercept : AIC=1228.654, Time=0.25 sec
ARIMA(3,0,0)(0,1,1)[7] : AIC=1226.271, Time=0.17 sec
Best model: ARIMA(3,0,0)(0,1,1)[7] intercept
Total fit time: 8.215 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1644.332, Time=0.12 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1711.413, Time=0.02 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1362.251, Time=0.09 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1441.573, Time=0.15 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1710.958, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1442.680, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1325.655, Time=0.27 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.13 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.67 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=1633.670, Time=0.31 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1318.056, Time=0.25 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1352.539, Time=0.10 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.65 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.80 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=1318.406, Time=0.51 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1319.040, Time=0.53 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=1317.040, Time=0.40 sec
ARIMA(1,0,1)(1,1,0)[7] intercept : AIC=1351.879, Time=0.15 sec
ARIMA(1,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.32 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.74 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=1436.274, Time=0.30 sec
ARIMA(1,0,2)(2,1,0)[7] intercept : AIC=1319.039, Time=0.45 sec
ARIMA(0,0,2)(2,1,0)[7] intercept : AIC=1365.407, Time=0.38 sec
ARIMA(2,0,2)(2,1,0)[7] intercept : AIC=1320.959, Time=0.89 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=1316.369, Time=0.16 sec
ARIMA(1,0,1)(1,1,0)[7] : AIC=1350.624, Time=0.07 sec
ARIMA(1,0,1)(2,1,1)[7] : AIC=inf, Time=0.59 sec
ARIMA(1,0,1)(1,1,1)[7] : AIC=inf, Time=0.22 sec
ARIMA(0,0,1)(2,1,0)[7] : AIC=1440.136, Time=0.17 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=1324.704, Time=0.13 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=1318.363, Time=0.27 sec
ARIMA(1,0,2)(2,1,0)[7] : AIC=1318.358, Time=0.21 sec
ARIMA(0,0,0)(2,1,0)[7] : AIC=1639.408, Time=0.08 sec
ARIMA(0,0,2)(2,1,0)[7] : AIC=1367.708, Time=0.14 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=1317.465, Time=0.17 sec
ARIMA(2,0,2)(2,1,0)[7] : AIC=1320.311, Time=0.36 sec
Best model: ARIMA(1,0,1)(2,1,0)[7]
Total fit time: 13.802 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1649.139, Time=0.08 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1693.728, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1390.372, Time=0.09 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1433.464, Time=0.24 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1692.046, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1456.508, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1340.008, Time=0.25 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.52 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.66 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=1644.116, Time=0.43 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1326.600, Time=0.41 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1371.540, Time=0.13 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.71 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.79 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=1323.976, Time=0.39 sec
ARIMA(3,0,0)(1,1,0)[7] intercept : AIC=1370.455, Time=0.20 sec
ARIMA(3,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.60 sec
ARIMA(3,0,0)(1,1,1)[7] intercept : AIC=inf, Time=1.32 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=1323.217, Time=0.97 sec
ARIMA(3,0,1)(1,1,0)[7] intercept : AIC=1371.685, Time=0.42 sec
ARIMA(3,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.83 sec
ARIMA(3,0,1)(1,1,1)[7] intercept : AIC=inf, Time=1.09 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1321.266, Time=0.53 sec
ARIMA(2,0,1)(1,1,0)[7] intercept : AIC=1369.884, Time=0.26 sec
ARIMA(2,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.11 sec
ARIMA(2,0,1)(1,1,1)[7] intercept : AIC=inf, Time=1.00 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=1321.651, Time=0.31 sec
ARIMA(2,0,2)(2,1,0)[7] intercept : AIC=1323.220, Time=0.86 sec
ARIMA(1,0,2)(2,1,0)[7] intercept : AIC=1321.686, Time=0.52 sec
ARIMA(3,0,2)(2,1,0)[7] intercept : AIC=1325.217, Time=1.08 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=1319.312, Time=0.21 sec
ARIMA(2,0,1)(1,1,0)[7] : AIC=1367.929, Time=0.16 sec
ARIMA(2,0,1)(2,1,1)[7] : AIC=inf, Time=1.00 sec
ARIMA(2,0,1)(1,1,1)[7] : AIC=inf, Time=0.66 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=1319.710, Time=0.14 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=1324.664, Time=0.15 sec
ARIMA(3,0,1)(2,1,0)[7] : AIC=1321.265, Time=0.23 sec
ARIMA(2,0,2)(2,1,0)[7] : AIC=1321.268, Time=0.32 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=1338.034, Time=0.15 sec
ARIMA(1,0,2)(2,1,0)[7] : AIC=1319.728, Time=0.20 sec
ARIMA(3,0,0)(2,1,0)[7] : AIC=1322.016, Time=0.20 sec
ARIMA(3,0,2)(2,1,0)[7] : AIC=1323.265, Time=0.52 sec
Best model: ARIMA(2,0,1)(2,1,0)[7]
Total fit time: 23.785 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.35 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1630.187, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1347.710, Time=0.13 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1422.393, Time=0.12 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1629.171, Time=0.02 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1391.541, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1320.977, Time=0.30 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.12 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.70 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=1580.670, Time=0.42 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1319.995, Time=0.33 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1345.912, Time=0.16 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.33 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.57 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=1321.708, Time=0.40 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1321.917, Time=0.55 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=1319.941, Time=0.37 sec
ARIMA(1,0,1)(1,1,0)[7] intercept : AIC=1346.067, Time=0.13 sec
ARIMA(1,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.83 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.68 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=1430.854, Time=0.33 sec
ARIMA(1,0,2)(2,1,0)[7] intercept : AIC=1321.807, Time=0.45 sec
ARIMA(0,0,2)(2,1,0)[7] intercept : AIC=1361.337, Time=0.35 sec
ARIMA(2,0,2)(2,1,0)[7] intercept : AIC=1321.506, Time=0.85 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=1318.452, Time=0.14 sec
ARIMA(1,0,1)(1,1,0)[7] : AIC=1344.214, Time=0.08 sec
ARIMA(1,0,1)(2,1,1)[7] : AIC=1287.892, Time=0.45 sec
ARIMA(1,0,1)(1,1,1)[7] : AIC=1286.013, Time=0.22 sec
ARIMA(1,0,1)(0,1,1)[7] : AIC=1286.840, Time=0.16 sec
ARIMA(1,0,1)(1,1,2)[7] : AIC=1287.793, Time=0.58 sec
ARIMA(1,0,1)(0,1,0)[7] : AIC=1388.126, Time=0.02 sec
ARIMA(1,0,1)(0,1,2)[7] : AIC=1285.900, Time=0.37 sec
ARIMA(0,0,1)(0,1,2)[7] : AIC=1428.478, Time=0.20 sec
ARIMA(1,0,0)(0,1,2)[7] : AIC=inf, Time=0.25 sec
ARIMA(2,0,1)(0,1,2)[7] : AIC=1287.680, Time=0.47 sec
ARIMA(1,0,2)(0,1,2)[7] : AIC=inf, Time=0.63 sec
ARIMA(0,0,0)(0,1,2)[7] : AIC=1589.629, Time=0.13 sec
ARIMA(0,0,2)(0,1,2)[7] : AIC=1355.117, Time=0.26 sec
ARIMA(2,0,0)(0,1,2)[7] : AIC=1286.070, Time=0.35 sec
ARIMA(2,0,2)(0,1,2)[7] : AIC=1289.559, Time=1.15 sec
ARIMA(1,0,1)(0,1,2)[7] intercept : AIC=inf, Time=1.08 sec
Best model: ARIMA(1,0,1)(0,1,2)[7]
Total fit time: 18.047 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1594.721, Time=0.11 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1659.649, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1358.406, Time=0.15 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1389.265, Time=0.15 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1658.276, Time=0.02 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1401.854, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1318.779, Time=0.20 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=0.74 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.55 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=1579.998, Time=0.45 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1289.217, Time=0.31 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1321.541, Time=0.13 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=1268.233, Time=0.56 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.62 sec
ARIMA(2,0,0)(2,1,2)[7] intercept : AIC=1269.920, Time=0.82 sec
ARIMA(2,0,0)(1,1,2)[7] intercept : AIC=inf, Time=0.91 sec
ARIMA(3,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.12 sec
ARIMA(2,0,1)(2,1,1)[7] intercept : AIC=1267.697, Time=0.70 sec
ARIMA(2,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.78 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1290.732, Time=0.50 sec
ARIMA(2,0,1)(2,1,2)[7] intercept : AIC=inf, Time=1.75 sec
ARIMA(2,0,1)(1,1,0)[7] intercept : AIC=1323.337, Time=0.20 sec
ARIMA(2,0,1)(1,1,2)[7] intercept : AIC=inf, Time=1.33 sec
ARIMA(1,0,1)(2,1,1)[7] intercept : AIC=1265.809, Time=0.49 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.59 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=1290.833, Time=0.28 sec
ARIMA(1,0,1)(2,1,2)[7] intercept : AIC=inf, Time=1.76 sec
ARIMA(1,0,1)(1,1,0)[7] intercept : AIC=1325.111, Time=0.13 sec
ARIMA(1,0,1)(1,1,2)[7] intercept : AIC=inf, Time=1.09 sec
ARIMA(0,0,1)(2,1,1)[7] intercept : AIC=1382.698, Time=0.40 sec
ARIMA(1,0,2)(2,1,1)[7] intercept : AIC=1267.643, Time=0.63 sec
ARIMA(0,0,0)(2,1,1)[7] intercept : AIC=1581.648, Time=0.31 sec
ARIMA(0,0,2)(2,1,1)[7] intercept : AIC=1307.573, Time=0.51 sec
ARIMA(2,0,2)(2,1,1)[7] intercept : AIC=inf, Time=1.90 sec
ARIMA(1,0,1)(2,1,1)[7] : AIC=1264.293, Time=0.37 sec
ARIMA(1,0,1)(1,1,1)[7] : AIC=inf, Time=0.28 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=1289.176, Time=0.12 sec
ARIMA(1,0,1)(2,1,2)[7] : AIC=1266.283, Time=1.10 sec
ARIMA(1,0,1)(1,1,0)[7] : AIC=1323.293, Time=0.06 sec
ARIMA(1,0,1)(1,1,2)[7] : AIC=inf, Time=0.59 sec
ARIMA(0,0,1)(2,1,1)[7] : AIC=1382.413, Time=0.18 sec
ARIMA(1,0,0)(2,1,1)[7] : AIC=inf, Time=0.52 sec
ARIMA(2,0,1)(2,1,1)[7] : AIC=1266.192, Time=0.51 sec
ARIMA(1,0,2)(2,1,1)[7] : AIC=1266.141, Time=0.50 sec
ARIMA(0,0,0)(2,1,1)[7] : AIC=1581.806, Time=0.14 sec
ARIMA(0,0,2)(2,1,1)[7] : AIC=1306.831, Time=0.26 sec
ARIMA(2,0,0)(2,1,1)[7] : AIC=1266.758, Time=0.42 sec
ARIMA(2,0,2)(2,1,1)[7] : AIC=inf, Time=1.59 sec
Best model: ARIMA(1,0,1)(2,1,1)[7]
Total fit time: 26.874 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1621.519, Time=0.13 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1660.443, Time=0.02 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1272.520, Time=0.12 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1398.595, Time=0.14 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1658.814, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1350.803, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1238.851, Time=0.29 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=1208.123, Time=0.65 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=1206.385, Time=0.18 sec
ARIMA(1,0,0)(0,1,1)[7] intercept : AIC=1204.813, Time=0.14 sec
ARIMA(1,0,0)(0,1,2)[7] intercept : AIC=1206.420, Time=0.32 sec
ARIMA(1,0,0)(1,1,2)[7] intercept : AIC=1208.339, Time=0.65 sec
ARIMA(0,0,0)(0,1,1)[7] intercept : AIC=1622.832, Time=0.07 sec
ARIMA(2,0,0)(0,1,1)[7] intercept : AIC=1200.593, Time=0.22 sec
ARIMA(2,0,0)(0,1,0)[7] intercept : AIC=1331.909, Time=0.03 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=1202.554, Time=0.33 sec
ARIMA(2,0,0)(0,1,2)[7] intercept : AIC=1202.559, Time=0.59 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1259.996, Time=0.16 sec
ARIMA(2,0,0)(1,1,2)[7] intercept : AIC=1204.473, Time=1.33 sec
ARIMA(3,0,0)(0,1,1)[7] intercept : AIC=1201.527, Time=0.26 sec
ARIMA(2,0,1)(0,1,1)[7] intercept : AIC=1201.109, Time=0.41 sec
ARIMA(1,0,1)(0,1,1)[7] intercept : AIC=1199.796, Time=0.25 sec
ARIMA(1,0,1)(0,1,0)[7] intercept : AIC=1334.553, Time=0.04 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=1201.756, Time=0.32 sec
ARIMA(1,0,1)(0,1,2)[7] intercept : AIC=1201.762, Time=0.48 sec
ARIMA(1,0,1)(1,1,0)[7] intercept : AIC=1259.869, Time=0.15 sec
ARIMA(1,0,1)(1,1,2)[7] intercept : AIC=1203.666, Time=0.92 sec
ARIMA(1,0,2)(0,1,1)[7] intercept : AIC=1201.266, Time=0.36 sec
ARIMA(0,0,2)(0,1,1)[7] intercept : AIC=1315.670, Time=0.19 sec
ARIMA(2,0,2)(0,1,1)[7] intercept : AIC=inf, Time=0.88 sec
ARIMA(1,0,1)(0,1,1)[7] : AIC=1201.155, Time=0.15 sec
Best model: ARIMA(1,0,1)(0,1,1)[7] intercept
Total fit time: 9.832 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1562.805, Time=0.18 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1641.069, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1320.581, Time=0.10 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1370.869, Time=0.18 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1639.576, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1412.133, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1293.105, Time=0.30 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.12 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.60 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=1567.538, Time=0.43 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1283.087, Time=0.27 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1310.835, Time=0.15 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.59 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.62 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=1282.986, Time=0.43 sec
ARIMA(3,0,0)(1,1,0)[7] intercept : AIC=1311.989, Time=0.17 sec
ARIMA(3,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.70 sec
ARIMA(3,0,0)(1,1,1)[7] intercept : AIC=inf, Time=1.02 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=1284.521, Time=1.24 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1283.383, Time=0.56 sec
ARIMA(3,0,0)(2,1,0)[7] : AIC=1281.327, Time=0.19 sec
ARIMA(3,0,0)(1,1,0)[7] : AIC=1310.234, Time=0.08 sec
ARIMA(3,0,0)(2,1,1)[7] : AIC=inf, Time=0.55 sec
ARIMA(3,0,0)(1,1,1)[7] : AIC=inf, Time=0.37 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=1281.464, Time=0.23 sec
ARIMA(3,0,1)(2,1,0)[7] : AIC=1282.844, Time=0.38 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=1281.739, Time=0.17 sec
Best model: ARIMA(3,0,0)(2,1,0)[7]
Total fit time: 12.667 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1476.830, Time=0.12 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1536.876, Time=0.02 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1285.233, Time=0.11 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1324.708, Time=0.10 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1535.821, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1359.050, Time=0.03 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1251.765, Time=0.26 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=1213.163, Time=0.51 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=1212.192, Time=0.22 sec
ARIMA(1,0,0)(0,1,1)[7] intercept : AIC=1211.917, Time=0.17 sec
ARIMA(1,0,0)(0,1,2)[7] intercept : AIC=1211.928, Time=0.37 sec
ARIMA(1,0,0)(1,1,2)[7] intercept : AIC=1213.466, Time=0.63 sec
ARIMA(0,0,0)(0,1,1)[7] intercept : AIC=1475.288, Time=0.06 sec
ARIMA(2,0,0)(0,1,1)[7] intercept : AIC=1213.615, Time=0.15 sec
ARIMA(1,0,1)(0,1,1)[7] intercept : AIC=1213.542, Time=0.19 sec
ARIMA(2,0,1)(0,1,1)[7] intercept : AIC=1234.236, Time=0.75 sec
ARIMA(1,0,0)(0,1,1)[7] : AIC=1213.703, Time=0.08 sec
Best model: ARIMA(1,0,0)(0,1,1)[7] intercept
Total fit time: 3.766 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1630.439, Time=0.13 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1722.277, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1380.524, Time=0.10 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.31 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1720.611, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1440.191, Time=0.03 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1330.035, Time=0.31 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.00 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.47 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=1625.204, Time=0.52 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1328.332, Time=0.24 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1375.110, Time=0.10 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.48 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.78 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=1328.447, Time=0.33 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1325.235, Time=0.87 sec
ARIMA(2,0,1)(1,1,0)[7] intercept : AIC=1374.079, Time=0.35 sec
ARIMA(2,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.57 sec
ARIMA(2,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.53 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=1327.136, Time=0.34 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=1327.019, Time=0.63 sec
ARIMA(2,0,2)(2,1,0)[7] intercept : AIC=1327.079, Time=0.76 sec
ARIMA(1,0,2)(2,1,0)[7] intercept : AIC=1326.706, Time=0.42 sec
ARIMA(3,0,2)(2,1,0)[7] intercept : AIC=1320.071, Time=1.84 sec
ARIMA(3,0,2)(1,1,0)[7] intercept : AIC=inf, Time=1.20 sec
ARIMA(3,0,2)(2,1,1)[7] intercept : AIC=inf, Time=1.99 sec
ARIMA(3,0,2)(1,1,1)[7] intercept : AIC=inf, Time=1.19 sec
ARIMA(3,0,3)(2,1,0)[7] intercept : AIC=inf, Time=2.15 sec
ARIMA(2,0,3)(2,1,0)[7] intercept : AIC=1322.710, Time=0.74 sec
ARIMA(3,0,2)(2,1,0)[7] : AIC=inf, Time=1.10 sec
Best model: ARIMA(3,0,2)(2,1,0)[7] intercept
Total fit time: 21.522 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1552.473, Time=0.13 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1605.460, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1262.900, Time=0.14 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1339.163, Time=0.17 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1603.481, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1328.792, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1238.664, Time=0.27 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.23 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.77 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=1551.282, Time=0.26 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1209.429, Time=0.32 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1230.647, Time=0.15 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.28 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.86 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=1208.465, Time=0.35 sec
ARIMA(3,0,0)(1,1,0)[7] intercept : AIC=1231.730, Time=0.20 sec
ARIMA(3,0,0)(2,1,1)[7] intercept : AIC=inf, Time=2.05 sec
ARIMA(3,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.94 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=1210.012, Time=0.59 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1208.012, Time=0.44 sec
ARIMA(2,0,1)(1,1,0)[7] intercept : AIC=1231.683, Time=0.21 sec
ARIMA(2,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.52 sec
ARIMA(2,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.89 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=1206.397, Time=0.37 sec
ARIMA(1,0,1)(1,1,0)[7] intercept : AIC=1232.149, Time=0.18 sec
ARIMA(1,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.59 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.89 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=1334.662, Time=0.33 sec
ARIMA(1,0,2)(2,1,0)[7] intercept : AIC=1208.012, Time=0.35 sec
ARIMA(0,0,2)(2,1,0)[7] intercept : AIC=1248.956, Time=0.38 sec
ARIMA(2,0,2)(2,1,0)[7] intercept : AIC=1208.983, Time=0.95 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=1204.415, Time=0.10 sec
ARIMA(1,0,1)(1,1,0)[7] : AIC=1230.152, Time=0.05 sec
ARIMA(1,0,1)(2,1,1)[7] : AIC=inf, Time=1.06 sec
ARIMA(1,0,1)(1,1,1)[7] : AIC=inf, Time=0.51 sec
ARIMA(0,0,1)(2,1,0)[7] : AIC=1332.689, Time=0.12 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=1236.676, Time=0.12 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=1206.032, Time=0.21 sec
ARIMA(1,0,2)(2,1,0)[7] : AIC=1206.031, Time=0.19 sec
ARIMA(0,0,0)(2,1,0)[7] : AIC=1549.298, Time=0.07 sec
ARIMA(0,0,2)(2,1,0)[7] : AIC=1246.976, Time=0.15 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=1207.455, Time=0.13 sec
ARIMA(2,0,2)(2,1,0)[7] : AIC=1207.000, Time=0.44 sec
Best model: ARIMA(1,0,1)(2,1,0)[7]
Total fit time: 21.030 seconds
Selecting the best model#
Given the previous results, in this section we select the best model, that is, the one with the lowest inner (cross-validation) error.
model_names = np.array(list(scores.keys()))
inner_scores_values = np.array(list(scores.values()))
best_model_CV = model_names[np.argmin(inner_scores_values)]
plt.figure(figsize=(7, 11))
ax = sns.scatterplot(x=inner_scores_values, y=model_names, color='blue', s=95)
ax = sns.scatterplot(x=np.min(inner_scores_values),
y=[best_model_CV], color='red', s=95)
plt.title(f'Model Selection - Daily Temperature (ºC) \n\n {n_splits} Fold CV - Test window {test_window} days', size=15, weight='bold')
ax.set_ylabel('Models', size=13)
ax.set_xlabel('MAE', size=11)
min_score = np.min(inner_scores_values)
max_score = np.max(inner_scores_values)
plt.xticks(np.round(np.linspace(min_score, max_score, 5), 3), fontsize=10)
plt.yticks(fontsize=12)
plt.show()
print('The best model according to cross validation is', best_model_CV)

The best model according to cross validation is Linear Regression (lag=30)
According to cross validation with a test window of 15 days, the best model for forecasting the temperature in Jena is the Linear Regression algorithm with 30 lags.
We can see again that the lags have a crucial influence on the forecasting results.
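To make the role of the lags concrete, the lagged design matrix behind these regression models can be sketched as follows. This is an illustrative equivalent of what the project's MakeLags class does, not its actual implementation: row t contains the previous `lag` values of the series, and the target is the value at time t.

```python
import numpy as np

def make_lags(y, lag):
    """Illustrative lag-matrix builder: X[t] = (y[t-1], ..., y[t-lag])."""
    X = np.column_stack([y[lag - k - 1 : len(y) - k - 1] for k in range(lag)])
    target = y[lag:]
    return X, target

X, target = make_lags(np.arange(10.0), lag=3)
# X[0] is (y[2], y[1], y[0]) and target[0] is y[3]
```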
Predictive visualization#
In this section we plot the K-Fold cross-validation process applied above for some of the models, to gain more insight into how it works and into how dependent the forecasts are on the train-validation partitions (namely, on the folds).
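The folds used here are time-ordered: each validation window comes strictly after its training data, so no future information leaks into the fit. A minimal sketch of such a split generator (a hypothetical stand-in illustrating the idea, not the project's actual KFold utilities):

```python
import numpy as np

def time_series_folds(n_obs, n_splits, test_window):
    """Expanding-window folds: each fold trains on everything before its
    test window; the last test window ends at the final observation."""
    folds = []
    for k in range(n_splits):
        test_end = n_obs - (n_splits - 1 - k) * test_window
        test_start = test_end - test_window
        train_idx = np.arange(0, test_start)
        test_idx = np.arange(test_start, test_end)
        folds.append((train_idx, test_idx))
    return folds

folds = time_series_folds(n_obs=305, n_splits=10, test_window=15)
```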
\(\text{SARIMA}(p=1, d=0, q=0)\times(P=1, D=1, Q=0)_{s=7}\)
estimator = sarima[series][2]
model_name = str(sarima[series][2])
KFold_time_series_plot(n_cols=3, figsize=(20,15), estimator=estimator,
X=X_train_st[series], y=Y_train_st[series], n_splits=n_splits, test_window=test_window, score=mean_absolute_error, dates=dates,
true_color='blue', pred_color='red', marker='', markersize=4,
title=f"Forecasting Daily Avg. Temperature (ºC) - Forecast window of {forecast_window} days - {n_splits} Fold CV - MAE = {np.round(scores[model_name],3)}\n\n{model_name}",
hspace=0.35, wspace=0.1, subtitles_size=12, title_size=15, xticks_rotation=25,
bbox_to_anchor=(0.5,0.05))
Fold's size: 290. Train size: 275. Test size: 15

Linear regression with 30 lags
estimator = linear_regression
lag = 30
model_name = f'Linear Regression (lag={lag})'
KFold_time_series_plot(n_cols=3, figsize=(20,15), estimator=estimator, score=mean_absolute_error,
X=X_train_sk[series][lag], y=Y_train_sk[series][lag], n_splits=n_splits, test_window=test_window, dates=dates,
true_color='blue', pred_color='red', marker='', markersize=4,
title=f"Forecasting Daily Avg. Temperature (ºC) - Forecast window of {forecast_window} days - {n_splits} Fold CV - MAE = {np.round(scores[model_name],3)}\n\n{model_name}",
hspace=0.35, wspace=0.1, subtitles_size=12, title_size=15, xticks_rotation=25,
bbox_to_anchor=(0.5,0.05))
Fold's size: 287. Train size: 272. Test size: 15

Estimation of future performance#
We estimate the future performance of the best model according to K-Fold cross validation.
This estimates the performance the best model will have when forecasting the future, but it has to be interpreted carefully, since it is very dependent on the slot of testing data used, especially in time series.
So the future performance of a model could be very good for some kinds of test data but much worse for others.
For example, if the test set contains a shock, the performance of the model will probably be worse than it was in the validation part; that is, the estimated future performance will be worse than the cross-validation error, simply because we are forecasting a genuinely unpredictable period (a shock).
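This sensitivity is easy to demonstrate numerically. The following sketch (synthetic data and a naive last-value forecast, not any of the models fitted above) compares the MAE of two 15-day test slots, one of which contains an injected shock:

```python
import numpy as np

# Synthetic yearly-seasonal series with a shock injected into one slot.
rng = np.random.default_rng(0)
t = np.arange(400)
y = 10 + 8 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 1.5, t.size)
y[350:365] += 6  # shock inside the second test slot

def naive_mae(y, test_start, window=15):
    """MAE of a naive forecast that repeats the last observed value."""
    y_true = y[test_start:test_start + window]
    y_pred = np.full(window, y[test_start - 1])
    return np.mean(np.abs(y_true - y_pred))

mae_calm = naive_mae(y, 300)   # an ordinary test slot
mae_shock = naive_mae(y, 350)  # the slot containing the shock
# the shocked slot yields a clearly larger error estimate
```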
best_model_CV
'Linear Regression (lag=30)'
lag = 30
linear_regression.fit(X=X_train_sk[series][lag], y=Y_train_sk[series][lag])
Y_test_hat = linear_regression.forecast(window=test_window)
estimation_future_performance = mean_absolute_error(y_pred=Y_test_hat, y_true=Y_test_sk[series][lag])
estimation_future_performance
3.0046182746521652
Now we are going to repeat the previous steps (the ones associated with the cross-validation part) for the remaining two series, humidity and wind speed.
Humidity (rh)#
Cross Validation#
n_splits = 10
scores = {}
series = 'rh'
Computing inner score by K-Fold CV for the statsmodels implementation
for name, model in zip(st_models[series].keys(), st_models[series].values()):
print(name)
scores[name] = KFold_score_time_series(estimator=model,
X=X_train_st[series], y=Y_train_st[series],
n_splits=n_splits, test_window=test_window,
scoring=mean_absolute_error)
SARIMA(D=1, P=1, p=1, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=1, P=1, p=2, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=1, P=2, p=1, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=2, P=2, p=1, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=1, P=2, p=2, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=2, P=2, p=2, s=7)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=1, P=1, p=1, s=14)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(D=1, P=1, p=2, s=14)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=0.05)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=0.5)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=0.8)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=1.5)
Fold's size: 290. Train size: 275. Test size: 15
Computing inner score by K-Fold CV for the sklearn implementation
for lag in lags_grid:
print(lag)
for name, model in zip(sk_models.keys(), sk_models.values()):
print(name)
scores[name + f' (lag={lag})'] = KFold_score_time_series(estimator=model,
X=X_train_sk[series][lag], y=Y_train_sk[series][lag],
n_splits=n_splits, test_window=test_window,
scoring=mean_absolute_error)
1
Linear Regression
Fold's size: 290. Train size: 275. Test size: 15
knn
Fold's size: 290. Train size: 275. Test size: 15
2
Linear Regression
Fold's size: 290. Train size: 275. Test size: 15
knn
Fold's size: 290. Train size: 275. Test size: 15
3
Linear Regression
Fold's size: 290. Train size: 275. Test size: 15
knn
Fold's size: 290. Train size: 275. Test size: 15
7
Linear Regression
Fold's size: 289. Train size: 274. Test size: 15
knn
Fold's size: 289. Train size: 274. Test size: 15
10
Linear Regression
Fold's size: 289. Train size: 274. Test size: 15
knn
Fold's size: 289. Train size: 274. Test size: 15
20
Linear Regression
Fold's size: 288. Train size: 273. Test size: 15
knn
Fold's size: 288. Train size: 273. Test size: 15
30
Linear Regression
Fold's size: 287. Train size: 272. Test size: 15
knn
Fold's size: 287. Train size: 272. Test size: 15
40
Linear Regression
Fold's size: 286. Train size: 271. Test size: 15
knn
Fold's size: 286. Train size: 271. Test size: 15
Computing inner score by K-Fold CV for the pmdarima implementation
name = 'Auto SARIMA'
auto_sarima = autoSARIMA(seasonal=True, m=7, d=0, D=1, start_p=0, start_q=0, max_p=3, max_q=3,
suppress_warnings=True, stepwise=True, trace=True)
scores[name] = KFold_score_time_series(estimator=auto_sarima,
X=X_train_st[series], y=Y_train_st[series],
n_splits=n_splits, test_window=test_window,
scoring=mean_absolute_error)
Fold's size: 290. Train size: 275. Test size: 15
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.45 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2119.118, Time=0.01 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1998.953, Time=0.16 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.33 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2117.162, Time=0.02 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=2054.646, Time=0.04 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1972.934, Time=0.34 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.14 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.70 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=2045.514, Time=0.51 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1953.354, Time=0.39 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=1972.645, Time=0.17 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.03 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.72 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=1951.979, Time=0.59 sec
ARIMA(3,0,0)(1,1,0)[7] intercept : AIC=1973.270, Time=0.32 sec
ARIMA(3,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.45 sec
ARIMA(3,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.77 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=1950.476, Time=1.00 sec
ARIMA(3,0,1)(1,1,0)[7] intercept : AIC=1974.784, Time=0.70 sec
ARIMA(3,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.84 sec
ARIMA(3,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.96 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1952.876, Time=0.52 sec
ARIMA(3,0,2)(2,1,0)[7] intercept : AIC=1952.177, Time=1.25 sec
ARIMA(2,0,2)(2,1,0)[7] intercept : AIC=1954.654, Time=1.26 sec
ARIMA(3,0,1)(2,1,0)[7] : AIC=1948.481, Time=0.60 sec
ARIMA(3,0,1)(1,1,0)[7] : AIC=1972.802, Time=0.35 sec
ARIMA(3,0,1)(2,1,1)[7] : AIC=inf, Time=1.85 sec
ARIMA(3,0,1)(1,1,1)[7] : AIC=inf, Time=0.88 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=1950.898, Time=0.20 sec
ARIMA(3,0,0)(2,1,0)[7] : AIC=1949.999, Time=0.16 sec
ARIMA(3,0,2)(2,1,0)[7] : AIC=1950.193, Time=0.57 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=1951.378, Time=0.15 sec
ARIMA(2,0,2)(2,1,0)[7] : AIC=1952.675, Time=0.35 sec
Best model: ARIMA(3,0,1)(2,1,0)[7]
Total fit time: 21.817 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.38 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2105.488, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1969.628, Time=0.15 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.35 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2104.536, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=2016.840, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1922.823, Time=0.36 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=0.92 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.55 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=2027.524, Time=0.43 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1924.300, Time=0.56 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=1924.282, Time=0.54 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=1944.496, Time=0.32 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1926.282, Time=0.68 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=1922.243, Time=0.12 sec
ARIMA(1,0,0)(1,1,0)[7] : AIC=1968.358, Time=0.05 sec
ARIMA(1,0,0)(2,1,1)[7] : AIC=1894.552, Time=0.36 sec
ARIMA(1,0,0)(1,1,1)[7] : AIC=1895.224, Time=0.20 sec
ARIMA(1,0,0)(2,1,2)[7] : AIC=1893.290, Time=0.37 sec
ARIMA(1,0,0)(1,1,2)[7] : AIC=1892.326, Time=0.35 sec
ARIMA(1,0,0)(0,1,2)[7] : AIC=1894.972, Time=0.33 sec
ARIMA(1,0,0)(0,1,1)[7] : AIC=1894.050, Time=0.14 sec
ARIMA(0,0,0)(1,1,2)[7] : AIC=inf, Time=0.36 sec
ARIMA(2,0,0)(1,1,2)[7] : AIC=1894.317, Time=0.57 sec
ARIMA(1,0,1)(1,1,2)[7] : AIC=1894.314, Time=0.61 sec
ARIMA(0,0,1)(1,1,2)[7] : AIC=1929.368, Time=0.40 sec
ARIMA(2,0,1)(1,1,2)[7] : AIC=1895.913, Time=1.40 sec
ARIMA(1,0,0)(1,1,2)[7] intercept : AIC=inf, Time=0.93 sec
Best model: ARIMA(1,0,0)(1,1,2)[7]
Total fit time: 11.460 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=2013.824, Time=0.14 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2149.287, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1941.907, Time=0.16 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1916.089, Time=0.13 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2147.659, Time=0.00 sec
ARIMA(0,0,1)(0,1,0)[7] intercept : AIC=2052.097, Time=0.05 sec
ARIMA(0,0,1)(1,1,1)[7] intercept : AIC=1913.282, Time=0.20 sec
ARIMA(0,0,1)(1,1,0)[7] intercept : AIC=1962.701, Time=0.15 sec
ARIMA(0,0,1)(2,1,1)[7] intercept : AIC=1913.607, Time=0.39 sec
ARIMA(0,0,1)(1,1,2)[7] intercept : AIC=1910.667, Time=0.70 sec
ARIMA(0,0,1)(0,1,2)[7] intercept : AIC=1912.229, Time=0.26 sec
ARIMA(0,0,1)(2,1,2)[7] intercept : AIC=1912.659, Time=1.55 sec
ARIMA(0,0,0)(1,1,2)[7] intercept : AIC=2010.285, Time=0.60 sec
ARIMA(1,0,1)(1,1,2)[7] intercept : AIC=1884.458, Time=1.19 sec
ARIMA(1,0,1)(0,1,2)[7] intercept : AIC=1882.698, Time=0.45 sec
ARIMA(1,0,1)(0,1,1)[7] intercept : AIC=1884.370, Time=0.26 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=1883.136, Time=0.39 sec
ARIMA(1,0,0)(0,1,2)[7] intercept : AIC=1882.634, Time=0.27 sec
ARIMA(1,0,0)(0,1,1)[7] intercept : AIC=1884.240, Time=0.15 sec
ARIMA(1,0,0)(1,1,2)[7] intercept : AIC=1884.544, Time=0.68 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=1882.979, Time=0.25 sec
ARIMA(0,0,0)(0,1,2)[7] intercept : AIC=2012.987, Time=0.23 sec
ARIMA(2,0,0)(0,1,2)[7] intercept : AIC=1882.983, Time=0.43 sec
ARIMA(2,0,1)(0,1,2)[7] intercept : AIC=1884.586, Time=0.67 sec
ARIMA(1,0,0)(0,1,2)[7] : AIC=1882.877, Time=0.25 sec
Best model: ARIMA(1,0,0)(0,1,2)[7] intercept
Total fit time: 9.523 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1920.671, Time=0.18 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2008.499, Time=0.02 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1899.941, Time=0.22 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1862.729, Time=0.18 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2007.322, Time=0.00 sec
ARIMA(0,0,1)(0,1,0)[7] intercept : AIC=1969.020, Time=0.07 sec
ARIMA(0,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.46 sec
ARIMA(0,0,1)(0,1,2)[7] intercept : AIC=inf, Time=0.62 sec
ARIMA(0,0,1)(1,1,0)[7] intercept : AIC=1912.076, Time=0.09 sec
ARIMA(0,0,1)(1,1,2)[7] intercept : AIC=inf, Time=0.96 sec
ARIMA(0,0,0)(0,1,1)[7] intercept : AIC=1921.295, Time=0.05 sec
ARIMA(1,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.43 sec
ARIMA(0,0,2)(0,1,1)[7] intercept : AIC=inf, Time=0.22 sec
ARIMA(1,0,0)(0,1,1)[7] intercept : AIC=inf, Time=0.30 sec
ARIMA(1,0,2)(0,1,1)[7] intercept : AIC=inf, Time=0.78 sec
ARIMA(0,0,1)(0,1,1)[7] : AIC=1868.276, Time=0.09 sec
Best model: ARIMA(0,0,1)(0,1,1)[7] intercept
Total fit time: 4.672 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.36 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2128.867, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1988.248, Time=0.15 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1943.306, Time=0.23 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2126.939, Time=0.02 sec
ARIMA(0,0,1)(0,1,0)[7] intercept : AIC=2042.381, Time=0.05 sec
ARIMA(0,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.43 sec
ARIMA(0,0,1)(0,1,2)[7] intercept : AIC=inf, Time=0.83 sec
ARIMA(0,0,1)(1,1,0)[7] intercept : AIC=1999.536, Time=0.16 sec
ARIMA(0,0,1)(1,1,2)[7] intercept : AIC=inf, Time=1.05 sec
ARIMA(0,0,0)(0,1,1)[7] intercept : AIC=2044.086, Time=0.07 sec
ARIMA(1,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.45 sec
ARIMA(0,0,2)(0,1,1)[7] intercept : AIC=1933.229, Time=0.32 sec
ARIMA(0,0,2)(0,1,0)[7] intercept : AIC=2034.956, Time=0.10 sec
ARIMA(0,0,2)(1,1,1)[7] intercept : AIC=inf, Time=0.66 sec
ARIMA(0,0,2)(0,1,2)[7] intercept : AIC=inf, Time=0.76 sec
ARIMA(0,0,2)(1,1,0)[7] intercept : AIC=1993.651, Time=0.23 sec
ARIMA(0,0,2)(1,1,2)[7] intercept : AIC=inf, Time=1.70 sec
ARIMA(1,0,2)(0,1,1)[7] intercept : AIC=inf, Time=0.82 sec
ARIMA(0,0,3)(0,1,1)[7] intercept : AIC=inf, Time=0.67 sec
ARIMA(1,0,3)(0,1,1)[7] intercept : AIC=inf, Time=0.72 sec
ARIMA(0,0,2)(0,1,1)[7] : AIC=1937.187, Time=0.17 sec
Best model: ARIMA(0,0,2)(0,1,1)[7] intercept
Total fit time: 9.946 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.33 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2149.771, Time=0.02 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1994.208, Time=0.10 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.35 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2147.786, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=2058.426, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1956.254, Time=0.32 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=0.83 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.44 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=2059.583, Time=0.53 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1958.173, Time=0.42 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=1958.178, Time=0.43 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=1980.582, Time=0.39 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=inf, Time=1.49 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=1954.255, Time=0.13 sec
ARIMA(1,0,0)(1,1,0)[7] : AIC=1992.209, Time=0.05 sec
ARIMA(1,0,0)(2,1,1)[7] : AIC=1913.177, Time=0.37 sec
ARIMA(1,0,0)(1,1,1)[7] : AIC=inf, Time=0.18 sec
ARIMA(1,0,0)(2,1,2)[7] : AIC=1911.559, Time=0.88 sec
ARIMA(1,0,0)(1,1,2)[7] : AIC=inf, Time=0.67 sec
ARIMA(0,0,0)(2,1,2)[7] : AIC=2040.824, Time=0.36 sec
ARIMA(2,0,0)(2,1,2)[7] : AIC=1913.544, Time=1.09 sec
ARIMA(1,0,1)(2,1,2)[7] : AIC=1913.543, Time=1.32 sec
ARIMA(0,0,1)(2,1,2)[7] : AIC=1950.004, Time=0.75 sec
ARIMA(2,0,1)(2,1,2)[7] : AIC=inf, Time=1.03 sec
ARIMA(1,0,0)(2,1,2)[7] intercept : AIC=inf, Time=1.24 sec
Best model: ARIMA(1,0,0)(2,1,2)[7]
Total fit time: 13.735 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.39 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2163.372, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1994.617, Time=0.16 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.33 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2161.378, Time=0.01 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=2080.776, Time=0.03 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=1974.911, Time=0.40 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.01 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.84 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=2088.179, Time=0.61 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=1976.646, Time=0.42 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=1976.289, Time=0.40 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=1988.368, Time=0.29 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=1972.927, Time=0.85 sec
ARIMA(2,0,1)(1,1,0)[7] intercept : AIC=1992.715, Time=0.32 sec
ARIMA(2,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.40 sec
ARIMA(2,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.79 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=1970.683, Time=1.00 sec
ARIMA(3,0,1)(1,1,0)[7] intercept : AIC=1991.054, Time=0.48 sec
ARIMA(3,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.55 sec
ARIMA(3,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.88 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=1970.810, Time=0.58 sec
ARIMA(3,0,2)(2,1,0)[7] intercept : AIC=1972.494, Time=1.25 sec
ARIMA(2,0,2)(2,1,0)[7] intercept : AIC=1970.570, Time=0.77 sec
ARIMA(2,0,2)(1,1,0)[7] intercept : AIC=1991.521, Time=0.26 sec
ARIMA(2,0,2)(2,1,1)[7] intercept : AIC=inf, Time=1.83 sec
ARIMA(2,0,2)(1,1,1)[7] intercept : AIC=inf, Time=0.89 sec
ARIMA(1,0,2)(2,1,0)[7] intercept : AIC=1970.487, Time=0.53 sec
ARIMA(1,0,2)(1,1,0)[7] intercept : AIC=1991.308, Time=0.28 sec
ARIMA(1,0,2)(2,1,1)[7] intercept : AIC=inf, Time=1.47 sec
ARIMA(1,0,2)(1,1,1)[7] intercept : AIC=inf, Time=0.88 sec
ARIMA(0,0,2)(2,1,0)[7] intercept : AIC=1984.433, Time=0.45 sec
ARIMA(1,0,3)(2,1,0)[7] intercept : AIC=1970.592, Time=0.70 sec
ARIMA(0,0,3)(2,1,0)[7] intercept : AIC=1975.491, Time=0.50 sec
ARIMA(2,0,3)(2,1,0)[7] intercept : AIC=1973.266, Time=1.92 sec
ARIMA(1,0,2)(2,1,0)[7] : AIC=1968.561, Time=0.24 sec
ARIMA(1,0,2)(1,1,0)[7] : AIC=1989.374, Time=0.13 sec
ARIMA(1,0,2)(2,1,1)[7] : AIC=inf, Time=1.11 sec
ARIMA(1,0,2)(1,1,1)[7] : AIC=inf, Time=0.75 sec
ARIMA(0,0,2)(2,1,0)[7] : AIC=1982.615, Time=0.14 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=1974.414, Time=0.15 sec
ARIMA(2,0,2)(2,1,0)[7] : AIC=1968.651, Time=0.32 sec
ARIMA(1,0,3)(2,1,0)[7] : AIC=1968.676, Time=0.28 sec
ARIMA(0,0,1)(2,1,0)[7] : AIC=1986.576, Time=0.12 sec
ARIMA(0,0,3)(2,1,0)[7] : AIC=1973.647, Time=0.19 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=1971.050, Time=0.29 sec
ARIMA(2,0,3)(2,1,0)[7] : AIC=1970.628, Time=1.05 sec
Best model: ARIMA(1,0,2)(2,1,0)[7]
Total fit time: 29.268 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=1977.134, Time=0.14 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2060.548, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1929.768, Time=0.18 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1893.918, Time=0.13 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2058.724, Time=0.00 sec
ARIMA(0,0,1)(0,1,0)[7] intercept : AIC=2004.933, Time=0.03 sec
ARIMA(0,0,1)(1,1,1)[7] intercept : AIC=1895.915, Time=0.19 sec
ARIMA(0,0,1)(0,1,2)[7] intercept : AIC=1895.914, Time=0.32 sec
ARIMA(0,0,1)(1,1,0)[7] intercept : AIC=1943.975, Time=0.14 sec
ARIMA(0,0,1)(1,1,2)[7] intercept : AIC=1897.917, Time=0.40 sec
ARIMA(0,0,0)(0,1,1)[7] intercept : AIC=1975.839, Time=0.09 sec
ARIMA(1,0,1)(0,1,1)[7] intercept : AIC=1862.088, Time=0.32 sec
ARIMA(1,0,1)(0,1,0)[7] intercept : AIC=1997.192, Time=0.05 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=1863.818, Time=0.32 sec
ARIMA(1,0,1)(0,1,2)[7] intercept : AIC=1863.792, Time=0.38 sec
ARIMA(1,0,1)(1,1,0)[7] intercept : AIC=1931.687, Time=0.22 sec
ARIMA(1,0,1)(1,1,2)[7] intercept : AIC=1865.916, Time=0.79 sec
ARIMA(1,0,0)(0,1,1)[7] intercept : AIC=1861.200, Time=0.10 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=1995.237, Time=0.03 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=1862.985, Time=0.30 sec
ARIMA(1,0,0)(0,1,2)[7] intercept : AIC=1862.968, Time=0.32 sec
ARIMA(1,0,0)(1,1,2)[7] intercept : AIC=1865.013, Time=0.66 sec
ARIMA(2,0,0)(0,1,1)[7] intercept : AIC=1862.482, Time=0.28 sec
ARIMA(2,0,1)(0,1,1)[7] intercept : AIC=1865.176, Time=0.42 sec
ARIMA(1,0,0)(0,1,1)[7] : AIC=1862.362, Time=0.12 sec
Best model: ARIMA(1,0,0)(0,1,1)[7] intercept
Total fit time: 5.914 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.34 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2086.345, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1950.806, Time=0.12 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1889.841, Time=0.20 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2084.629, Time=0.00 sec
ARIMA(0,0,1)(0,1,0)[7] intercept : AIC=2030.793, Time=0.03 sec
ARIMA(0,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.56 sec
ARIMA(0,0,1)(0,1,2)[7] intercept : AIC=inf, Time=0.87 sec
ARIMA(0,0,1)(1,1,0)[7] intercept : AIC=1963.565, Time=0.14 sec
ARIMA(0,0,1)(1,1,2)[7] intercept : AIC=inf, Time=1.46 sec
ARIMA(0,0,0)(0,1,1)[7] intercept : AIC=1976.562, Time=0.10 sec
ARIMA(1,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.57 sec
ARIMA(0,0,2)(0,1,1)[7] intercept : AIC=inf, Time=0.54 sec
ARIMA(1,0,0)(0,1,1)[7] intercept : AIC=inf, Time=0.42 sec
ARIMA(1,0,2)(0,1,1)[7] intercept : AIC=inf, Time=0.83 sec
ARIMA(0,0,1)(0,1,1)[7] : AIC=1900.083, Time=0.08 sec
Best model: ARIMA(0,0,1)(0,1,1)[7] intercept
Total fit time: 6.250 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=2038.415, Time=0.17 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=2118.185, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=1963.384, Time=0.15 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=1951.652, Time=0.13 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=2116.231, Time=0.00 sec
ARIMA(0,0,1)(0,1,0)[7] intercept : AIC=2056.456, Time=0.03 sec
ARIMA(0,0,1)(1,1,1)[7] intercept : AIC=1953.624, Time=0.22 sec
ARIMA(0,0,1)(0,1,2)[7] intercept : AIC=1953.634, Time=0.30 sec
ARIMA(0,0,1)(1,1,0)[7] intercept : AIC=1979.875, Time=0.15 sec
ARIMA(0,0,1)(1,1,2)[7] intercept : AIC=1955.233, Time=0.50 sec
ARIMA(0,0,0)(0,1,1)[7] intercept : AIC=2036.766, Time=0.09 sec
ARIMA(1,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.44 sec
ARIMA(0,0,2)(0,1,1)[7] intercept : AIC=1925.834, Time=0.21 sec
ARIMA(0,0,2)(0,1,0)[7] intercept : AIC=2038.309, Time=0.10 sec
ARIMA(0,0,2)(1,1,1)[7] intercept : AIC=inf, Time=0.63 sec
ARIMA(0,0,2)(0,1,2)[7] intercept : AIC=1927.111, Time=0.67 sec
ARIMA(0,0,2)(1,1,0)[7] intercept : AIC=1968.398, Time=0.15 sec
ARIMA(0,0,2)(1,1,2)[7] intercept : AIC=inf, Time=1.60 sec
ARIMA(1,0,2)(0,1,1)[7] intercept : AIC=inf, Time=0.87 sec
ARIMA(0,0,3)(0,1,1)[7] intercept : AIC=1922.457, Time=0.40 sec
ARIMA(0,0,3)(0,1,0)[7] intercept : AIC=2039.618, Time=0.14 sec
ARIMA(0,0,3)(1,1,1)[7] intercept : AIC=inf, Time=0.87 sec
ARIMA(0,0,3)(0,1,2)[7] intercept : AIC=inf, Time=1.13 sec
ARIMA(0,0,3)(1,1,0)[7] intercept : AIC=1969.310, Time=0.25 sec
ARIMA(0,0,3)(1,1,2)[7] intercept : AIC=1924.801, Time=2.02 sec
ARIMA(1,0,3)(0,1,1)[7] intercept : AIC=inf, Time=1.07 sec
ARIMA(0,0,3)(0,1,1)[7] : AIC=1921.397, Time=0.44 sec
ARIMA(0,0,3)(0,1,0)[7] : AIC=2037.628, Time=0.08 sec
ARIMA(0,0,3)(1,1,1)[7] : AIC=1922.965, Time=0.37 sec
ARIMA(0,0,3)(0,1,2)[7] : AIC=1923.037, Time=0.60 sec
ARIMA(0,0,3)(1,1,0)[7] : AIC=1967.312, Time=0.12 sec
ARIMA(0,0,3)(1,1,2)[7] : AIC=inf, Time=1.60 sec
ARIMA(0,0,2)(0,1,1)[7] : AIC=1924.592, Time=0.18 sec
ARIMA(1,0,3)(0,1,1)[7] : AIC=inf, Time=0.73 sec
ARIMA(1,0,2)(0,1,1)[7] : AIC=inf, Time=0.64 sec
Best model: ARIMA(0,0,3)(0,1,1)[7]
Total fit time: 17.101 seconds
Selecting the best model#
Given the previous results, in this section we select the best model, that is, the one with the lowest inner (cross-validation) error.
model_names = np.array(list(scores.keys()))
inner_scores_values = np.array(list(scores.values()))
best_model_CV = model_names[np.argmin(inner_scores_values)]
plt.figure(figsize=(7, 11))
ax = sns.scatterplot(x=inner_scores_values, y=model_names, color='blue', s=95)
ax = sns.scatterplot(x=np.min(inner_scores_values),
y=[best_model_CV], color='red', s=95)
plt.title(f'Model Selection - Daily Humidity (%) \n\n {n_splits} Fold CV - Test window {test_window} days', size=15, weight='bold')
ax.set_ylabel('Models', size=13)
ax.set_xlabel('MAE', size=11)
xmin = np.min(inner_scores_values)
xmax = np.max(inner_scores_values)
plt.xticks(np.round(np.linspace(xmin, xmax, 5), 3), fontsize=10)
plt.yticks(fontsize=12)
plt.show()
print('The best model according to cross validation is', best_model_CV)

The best model according to cross validation is Linear Regression (lag=40)
According to cross validation with a test window of 15 days, the best model for forecasting the humidity in Jena is the Linear Regression algorithm with 40 lags.
We can see again that the number of lags has a crucial influence on the forecasting results.
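The influence of the lag order comes from how the design matrix is built. A minimal sketch of a `MakeLags`-style transformation (a standalone illustration; the actual class in `PyTS.py` may differ in details) is:

```python
import numpy as np

def make_lags(y, lag):
    """Build a design matrix whose row t holds the `lag` previous values of y."""
    X = np.column_stack([y[i:len(y) - lag + i] for i in range(lag)])
    target = y[lag:]
    return X, target

y = np.arange(10, dtype=float)   # toy series: 0, 1, ..., 9
X, target = make_lags(y, lag=3)
# X[0] = [0, 1, 2] predicts target[0] = 3. A larger lag gives the model a
# longer memory, at the cost of fewer usable training rows (len(y) - lag).
```

This is why the fold sizes printed above shrink slightly as the lag grows: each extra lag consumes one observation at the start of the series.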
Predictive visualization#
In this section we plot the K-Fold cross-validation process applied above for some of the models, to gain more insight into how it works and how dependent the forecasts are on the train-validation partitions (namely, on the folds).
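The general idea behind the cross-validation procedure can be sketched with scikit-learn's `TimeSeriesSplit` using a fixed 15-day test window (a minimal standalone illustration; the exact splitting done by `KFold_score_time_series` in `PyTS.py` may differ):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
t = np.arange(300)
y = 0.1 * t + rng.normal(scale=2.0, size=300)   # toy trending series
X = t[:, None].astype(float)                    # time index as the only feature

# Each fold trains only on observations preceding a fixed 15-day test window,
# so no future information leaks into training.
tscv = TimeSeriesSplit(n_splits=10, test_size=15)
fold_maes = []
for train_idx, test_idx in tscv.split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    fold_maes.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))

inner_score = float(np.mean(fold_maes))         # inner CV error for this model
```

Because each fold ends at a different point of the series, the per-fold errors reveal how sensitive the forecasts are to the particular train-validation partition.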
SARIMA \((p=1, d=0, q=0)\times(P=1, D=1, Q=0)_{s=7}\)
estimator = sarima[series][1]
model_name = str(sarima[series][1])
KFold_time_series_plot(n_cols=3, figsize=(20,15), estimator=estimator,
X=X_train_st[series], y=Y_train_st[series], n_splits=n_splits, test_window=test_window, score=mean_absolute_error, dates=dates,
true_color='blue', pred_color='red', marker='', markersize=4,
title=f"Forecasting Daily Avg. Humidity (%) - Forecast window of {forecast_window} days - {n_splits} Fold CV - MAE = {np.round(scores[model_name],3)}\n\n{model_name}",
hspace=0.35, wspace=0.1, subtitles_size=12, title_size=15, xticks_rotation=25,
bbox_to_anchor=(0.5,0.05))
Fold's size: 290. Train size: 275. Test size: 15

Linear regression with 40 lags
estimator = linear_regression
lag = 40
model_name = f'Linear Regression (lag={lag})'
KFold_time_series_plot(n_cols=3, figsize=(20,15), estimator=estimator, score=mean_absolute_error,
X=X_train_sk[series][lag], y=Y_train_sk[series][lag], n_splits=n_splits, test_window=test_window, dates=dates,
true_color='blue', pred_color='red', marker='', markersize=4,
title=f"Forecasting Daily Avg. Humidity (%) - Forecast window of {forecast_window} days - {n_splits} Fold CV - MAE = {np.round(scores[model_name],3)}\n\n{model_name}",
hspace=0.35, wspace=0.1, subtitles_size=12, title_size=15, xticks_rotation=25,
bbox_to_anchor=(0.5,0.05))
Fold's size: 286. Train size: 271. Test size: 15

Estimation of future performance#
We estimate the future performance of the best model according to KFold Cross Validation.
best_model_CV
'Linear Regression (lag=40)'
lag = 40
linear_regression.fit(X=X_train_sk[series][lag], y=Y_train_sk[series][lag])
Y_test_hat = linear_regression.forecast(window=test_window)
estimation_future_performance = mean_absolute_error(y_pred=Y_test_hat, y_true=Y_test_sk[series][lag])
estimation_future_performance
6.438584620069256
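The `forecast(window=...)` call above produces a multi-step prediction from a lag-based model. One common way to do this, and a plausible reading of how such a wrapper could behave (a sketch under that assumption, not the project's actual `PyTS.py` implementation), is to forecast recursively, feeding each prediction back in as a new lag:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
y = rng.normal(size=200).cumsum()   # toy daily series
lag = 5

# Fit on lagged features: row t holds y[t..t+lag-1], target is y[t+lag].
X = np.column_stack([y[i:len(y) - lag + i] for i in range(lag)])
model = LinearRegression().fit(X, y[lag:])

history = list(y[-lag:])            # the last `lag` observed values
forecasts = []
for _ in range(15):                 # 15-day forecast window
    next_val = model.predict(np.array(history[-lag:])[None, :])[0]
    forecasts.append(next_val)
    history.append(next_val)        # recursion: the prediction becomes an input
```

Note that errors compound along the window, which is one reason the estimated future MAE can exceed the inner cross-validation score.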
Wind Speed (wv)#
Cross Validation#
n_splits = 10
scores = {}
series = 'wv'
Computing inner score by K-Fold CV for the statsmodels implementation
for name, model in zip(st_models[series].keys(), st_models[series].values()):
print(name)
scores[name] = KFold_score_time_series(estimator=model,
X=X_train_st[series], y=Y_train_st[series],
n_splits=n_splits, test_window=test_window,
scoring=mean_absolute_error)
SARIMA(p=1)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(p=2)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(p=1, q=1)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(p=1, q=2)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(p=2, q=1)
Fold's size: 290. Train size: 275. Test size: 15
SARIMA(p=2, q=2)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=0.05)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=0.5)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=0.8)
Fold's size: 290. Train size: 275. Test size: 15
SimpleExpSmooth(smoothing_level=1.5)
Fold's size: 290. Train size: 275. Test size: 15
Computing inner score by K-Fold CV for the sklearn implementation
for lag in lags_grid:
print(lag)
for name, model in zip(sk_models.keys(), sk_models.values()):
print(name)
scores[name + f' (lag={lag})'] = KFold_score_time_series(estimator=model,
X=X_train_sk[series][lag], y=Y_train_sk[series][lag],
n_splits=n_splits, test_window=test_window,
scoring=mean_absolute_error)
1
Linear Regression
Fold's size: 290. Train size: 275. Test size: 15
knn
Fold's size: 290. Train size: 275. Test size: 15
2
Linear Regression
Fold's size: 290. Train size: 275. Test size: 15
knn
Fold's size: 290. Train size: 275. Test size: 15
3
Linear Regression
Fold's size: 290. Train size: 275. Test size: 15
knn
Fold's size: 290. Train size: 275. Test size: 15
7
Linear Regression
Fold's size: 289. Train size: 274. Test size: 15
knn
Fold's size: 289. Train size: 274. Test size: 15
10
Linear Regression
Fold's size: 289. Train size: 274. Test size: 15
knn
Fold's size: 289. Train size: 274. Test size: 15
20
Linear Regression
Fold's size: 288. Train size: 273. Test size: 15
knn
Fold's size: 288. Train size: 273. Test size: 15
30
Linear Regression
Fold's size: 287. Train size: 272. Test size: 15
knn
Fold's size: 287. Train size: 272. Test size: 15
40
Linear Regression
Fold's size: 286. Train size: 271. Test size: 15
knn
Fold's size: 286. Train size: 271. Test size: 15
Computing inner score by K-Fold CV for the pmdarima implementation
name = 'Auto SARIMA'
auto_sarima = autoSARIMA(seasonal=True, m=7, d=0, D=1, start_p=0, start_q=0, max_p=3, max_q=3,
suppress_warnings=True, stepwise=True, trace=True)
scores[name] = KFold_score_time_series(estimator=auto_sarima,
X=X_train_st[series], y=Y_train_st[series],
n_splits=n_splits, test_window=test_window,
scoring=mean_absolute_error)
Fold's size: 290. Train size: 275. Test size: 15
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.28 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=815.877, Time=0.01 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=697.372, Time=0.13 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.21 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=813.894, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=763.906, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=678.813, Time=0.31 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=0.72 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.27 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=726.068, Time=0.28 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=676.668, Time=0.29 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=693.946, Time=0.15 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=0.73 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.43 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=677.502, Time=0.34 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=677.559, Time=0.52 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=675.713, Time=0.32 sec
ARIMA(1,0,1)(1,1,0)[7] intercept : AIC=691.406, Time=0.14 sec
ARIMA(1,0,1)(2,1,1)[7] intercept : AIC=inf, Time=0.81 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.46 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=674.638, Time=0.18 sec
ARIMA(0,0,1)(1,1,0)[7] intercept : AIC=689.527, Time=0.10 sec
ARIMA(0,0,1)(2,1,1)[7] intercept : AIC=inf, Time=0.50 sec
ARIMA(0,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.33 sec
ARIMA(0,0,2)(2,1,0)[7] intercept : AIC=675.807, Time=0.23 sec
ARIMA(1,0,2)(2,1,0)[7] intercept : AIC=677.535, Time=0.87 sec
ARIMA(0,0,1)(2,1,0)[7] : AIC=672.662, Time=0.11 sec
ARIMA(0,0,1)(1,1,0)[7] : AIC=687.543, Time=0.05 sec
ARIMA(0,0,1)(2,1,1)[7] : AIC=inf, Time=0.37 sec
ARIMA(0,0,1)(1,1,1)[7] : AIC=inf, Time=0.20 sec
ARIMA(0,0,0)(2,1,0)[7] : AIC=724.081, Time=0.05 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=673.740, Time=0.13 sec
ARIMA(0,0,2)(2,1,0)[7] : AIC=673.833, Time=0.10 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=676.846, Time=0.11 sec
ARIMA(1,0,2)(2,1,0)[7] : AIC=675.564, Time=0.45 sec
Best model: ARIMA(0,0,1)(2,1,0)[7]
Total fit time: 10.222 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.43 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=936.839, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=819.712, Time=0.08 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.22 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=934.863, Time=0.02 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=904.989, Time=0.00 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=807.182, Time=0.22 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.28 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.48 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=847.684, Time=0.15 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=807.577, Time=0.23 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=807.620, Time=0.24 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=807.660, Time=0.15 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=809.569, Time=0.52 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=805.186, Time=0.08 sec
ARIMA(1,0,0)(1,1,0)[7] : AIC=817.727, Time=0.04 sec
ARIMA(1,0,0)(2,1,1)[7] : AIC=inf, Time=0.66 sec
ARIMA(1,0,0)(1,1,1)[7] : AIC=inf, Time=0.17 sec
ARIMA(0,0,0)(2,1,0)[7] : AIC=845.685, Time=0.05 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=805.580, Time=0.10 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=805.624, Time=0.11 sec
ARIMA(0,0,1)(2,1,0)[7] : AIC=805.663, Time=0.10 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=807.572, Time=0.23 sec
Best model: ARIMA(1,0,0)(2,1,0)[7]
Total fit time: 5.566 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.45 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=933.805, Time=0.02 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=828.911, Time=0.08 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.44 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=931.865, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=879.426, Time=0.03 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=795.084, Time=0.17 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=0.88 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.57 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=855.098, Time=0.28 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=790.186, Time=0.22 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=829.584, Time=0.11 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.66 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.58 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=790.395, Time=0.27 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=inf, Time=1.75 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=792.312, Time=0.31 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=792.111, Time=0.83 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=788.262, Time=0.14 sec
ARIMA(2,0,0)(1,1,0)[7] : AIC=827.622, Time=0.05 sec
ARIMA(2,0,0)(2,1,1)[7] : AIC=inf, Time=1.12 sec
ARIMA(2,0,0)(1,1,1)[7] : AIC=inf, Time=0.46 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=793.133, Time=0.08 sec
ARIMA(3,0,0)(2,1,0)[7] : AIC=788.486, Time=0.11 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=inf, Time=1.01 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=790.375, Time=0.09 sec
ARIMA(3,0,1)(2,1,0)[7] : AIC=790.211, Time=0.47 sec
Best model: ARIMA(2,0,0)(2,1,0)[7]
Total fit time: 12.213 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.30 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=929.920, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=808.106, Time=0.08 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.37 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=927.924, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=868.516, Time=0.03 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=774.260, Time=0.24 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.20 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.69 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=850.232, Time=0.30 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=775.736, Time=0.30 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=775.724, Time=0.29 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=787.038, Time=0.21 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=777.722, Time=0.47 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=772.260, Time=0.07 sec
ARIMA(1,0,0)(1,1,0)[7] : AIC=806.107, Time=0.05 sec
ARIMA(1,0,0)(2,1,1)[7] : AIC=inf, Time=0.32 sec
ARIMA(1,0,0)(1,1,1)[7] : AIC=inf, Time=0.20 sec
ARIMA(0,0,0)(2,1,0)[7] : AIC=848.232, Time=0.05 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=773.736, Time=0.09 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=773.724, Time=0.12 sec
ARIMA(0,0,1)(2,1,0)[7] : AIC=785.039, Time=0.07 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=775.722, Time=0.20 sec
Best model: ARIMA(1,0,0)(2,1,0)[7]
Total fit time: 5.650 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.46 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=796.942, Time=0.02 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=686.577, Time=0.11 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.48 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=794.986, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=757.072, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=662.798, Time=0.22 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.04 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.60 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=696.745, Time=0.25 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=661.960, Time=0.33 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=687.191, Time=0.13 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.59 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.80 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=663.850, Time=0.33 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=663.911, Time=0.50 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=662.127, Time=0.31 sec
ARIMA(3,0,1)(2,1,0)[7] intercept : AIC=665.723, Time=0.75 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=660.123, Time=0.12 sec
ARIMA(2,0,0)(1,1,0)[7] : AIC=685.272, Time=0.06 sec
ARIMA(2,0,0)(2,1,1)[7] : AIC=inf, Time=0.75 sec
ARIMA(2,0,0)(1,1,1)[7] : AIC=inf, Time=0.66 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=660.948, Time=0.08 sec
ARIMA(3,0,0)(2,1,0)[7] : AIC=662.008, Time=0.15 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=662.071, Time=0.22 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=660.283, Time=0.16 sec
ARIMA(3,0,1)(2,1,0)[7] : AIC=663.875, Time=0.37 sec
Best model: ARIMA(2,0,0)(2,1,0)[7]
Total fit time: 10.516 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.39 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=885.632, Time=0.01 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=749.853, Time=0.10 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.65 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=884.211, Time=0.02 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=793.589, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=716.133, Time=0.22 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.11 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.64 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=810.719, Time=0.33 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=716.517, Time=0.30 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=716.817, Time=0.32 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=732.024, Time=0.20 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=inf, Time=1.19 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=714.371, Time=0.12 sec
ARIMA(1,0,0)(1,1,0)[7] : AIC=748.119, Time=0.05 sec
ARIMA(1,0,0)(2,1,1)[7] : AIC=663.125, Time=0.35 sec
ARIMA(1,0,0)(1,1,1)[7] : AIC=inf, Time=0.18 sec
ARIMA(1,0,0)(2,1,2)[7] : AIC=665.125, Time=0.53 sec
ARIMA(1,0,0)(1,1,2)[7] : AIC=inf, Time=0.46 sec
ARIMA(0,0,0)(2,1,1)[7] : AIC=753.947, Time=0.18 sec
ARIMA(2,0,0)(2,1,1)[7] : AIC=665.058, Time=0.39 sec
ARIMA(1,0,1)(2,1,1)[7] : AIC=665.061, Time=0.42 sec
ARIMA(0,0,1)(2,1,1)[7] : AIC=679.937, Time=0.28 sec
ARIMA(2,0,1)(2,1,1)[7] : AIC=666.693, Time=0.92 sec
Best model: ARIMA(1,0,0)(2,1,1)[7]
Total fit time: 9.370 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.43 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=895.395, Time=0.01 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=790.881, Time=0.08 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.36 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=893.396, Time=0.00 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=840.900, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=778.172, Time=0.20 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.11 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.73 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=852.297, Time=0.26 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=780.088, Time=0.30 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=780.068, Time=0.36 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=791.831, Time=0.21 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=781.833, Time=0.95 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=776.178, Time=0.10 sec
ARIMA(1,0,0)(1,1,0)[7] : AIC=788.881, Time=0.07 sec
ARIMA(1,0,0)(2,1,1)[7] : AIC=inf, Time=0.61 sec
ARIMA(1,0,0)(1,1,1)[7] : AIC=inf, Time=0.40 sec
ARIMA(0,0,0)(2,1,0)[7] : AIC=850.331, Time=0.05 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=778.094, Time=0.12 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=778.073, Time=0.13 sec
ARIMA(0,0,1)(2,1,0)[7] : AIC=789.847, Time=0.08 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=779.839, Time=0.37 sec
Best model: ARIMA(1,0,0)(2,1,0)[7]
Total fit time: 6.947 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.37 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=1002.354, Time=0.02 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=864.975, Time=0.09 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.38 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=1000.398, Time=0.02 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=923.868, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=822.141, Time=0.17 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.20 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.68 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=881.561, Time=0.20 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=822.543, Time=0.25 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=823.155, Time=0.28 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=839.758, Time=0.22 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=823.080, Time=0.52 sec
ARIMA(1,0,0)(2,1,0)[7] : AIC=820.169, Time=0.08 sec
ARIMA(1,0,0)(1,1,0)[7] : AIC=863.006, Time=0.05 sec
ARIMA(1,0,0)(2,1,1)[7] : AIC=inf, Time=0.67 sec
ARIMA(1,0,0)(1,1,1)[7] : AIC=inf, Time=0.40 sec
ARIMA(0,0,0)(2,1,0)[7] : AIC=879.619, Time=0.07 sec
ARIMA(2,0,0)(2,1,0)[7] : AIC=820.569, Time=0.13 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=821.181, Time=0.12 sec
ARIMA(0,0,1)(2,1,0)[7] : AIC=837.799, Time=0.11 sec
ARIMA(2,0,1)(2,1,0)[7] : AIC=821.105, Time=0.22 sec
Best model: ARIMA(1,0,0)(2,1,0)[7]
Total fit time: 6.252 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.43 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=977.869, Time=0.00 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=860.897, Time=0.10 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=inf, Time=0.49 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=976.092, Time=0.01 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=917.567, Time=0.02 sec
ARIMA(1,0,0)(2,1,0)[7] intercept : AIC=839.906, Time=0.15 sec
ARIMA(1,0,0)(2,1,1)[7] intercept : AIC=inf, Time=0.87 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.55 sec
ARIMA(0,0,0)(2,1,0)[7] intercept : AIC=899.401, Time=0.23 sec
ARIMA(2,0,0)(2,1,0)[7] intercept : AIC=837.229, Time=0.23 sec
ARIMA(2,0,0)(1,1,0)[7] intercept : AIC=856.487, Time=0.10 sec
ARIMA(2,0,0)(2,1,1)[7] intercept : AIC=inf, Time=1.37 sec
ARIMA(2,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.65 sec
ARIMA(3,0,0)(2,1,0)[7] intercept : AIC=838.816, Time=0.32 sec
ARIMA(2,0,1)(2,1,0)[7] intercept : AIC=839.007, Time=0.48 sec
ARIMA(1,0,1)(2,1,0)[7] intercept : AIC=837.164, Time=0.30 sec
ARIMA(1,0,1)(1,1,0)[7] intercept : AIC=857.148, Time=0.17 sec
ARIMA(1,0,1)(2,1,1)[7] intercept : AIC=inf, Time=1.67 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.64 sec
ARIMA(0,0,1)(2,1,0)[7] intercept : AIC=839.294, Time=0.20 sec
ARIMA(1,0,2)(2,1,0)[7] intercept : AIC=838.385, Time=0.72 sec
ARIMA(0,0,2)(2,1,0)[7] intercept : AIC=836.696, Time=0.23 sec
ARIMA(0,0,2)(1,1,0)[7] intercept : AIC=856.097, Time=0.10 sec
ARIMA(0,0,2)(2,1,1)[7] intercept : AIC=inf, Time=0.92 sec
ARIMA(0,0,2)(1,1,1)[7] intercept : AIC=inf, Time=0.77 sec
ARIMA(0,0,3)(2,1,0)[7] intercept : AIC=838.598, Time=0.25 sec
ARIMA(1,0,3)(2,1,0)[7] intercept : AIC=839.745, Time=1.07 sec
ARIMA(0,0,2)(2,1,0)[7] : AIC=835.006, Time=0.11 sec
ARIMA(0,0,2)(1,1,0)[7] : AIC=854.291, Time=0.07 sec
ARIMA(0,0,2)(2,1,1)[7] : AIC=inf, Time=0.62 sec
ARIMA(0,0,2)(1,1,1)[7] : AIC=inf, Time=0.33 sec
ARIMA(0,0,1)(2,1,0)[7] : AIC=837.717, Time=0.09 sec
ARIMA(1,0,2)(2,1,0)[7] : AIC=836.743, Time=0.34 sec
ARIMA(0,0,3)(2,1,0)[7] : AIC=836.921, Time=0.09 sec
ARIMA(1,0,1)(2,1,0)[7] : AIC=835.458, Time=0.13 sec
ARIMA(1,0,3)(2,1,0)[7] : AIC=838.068, Time=0.45 sec
Best model: ARIMA(0,0,2)(2,1,0)[7]
Total fit time: 15.317 seconds
Performing stepwise search to minimize aic
ARIMA(0,0,0)(1,1,1)[7] intercept : AIC=693.362, Time=0.22 sec
ARIMA(0,0,0)(0,1,0)[7] intercept : AIC=838.120, Time=0.02 sec
ARIMA(1,0,0)(1,1,0)[7] intercept : AIC=697.102, Time=0.12 sec
ARIMA(0,0,1)(0,1,1)[7] intercept : AIC=659.254, Time=0.17 sec
ARIMA(0,0,0)(0,1,0)[7] : AIC=836.266, Time=0.01 sec
ARIMA(0,0,1)(0,1,0)[7] intercept : AIC=786.790, Time=0.03 sec
ARIMA(0,0,1)(1,1,1)[7] intercept : AIC=inf, Time=0.32 sec
ARIMA(0,0,1)(0,1,2)[7] intercept : AIC=660.652, Time=0.40 sec
ARIMA(0,0,1)(1,1,0)[7] intercept : AIC=703.535, Time=0.12 sec
ARIMA(0,0,1)(1,1,2)[7] intercept : AIC=661.087, Time=0.62 sec
ARIMA(0,0,0)(0,1,1)[7] intercept : AIC=691.365, Time=0.13 sec
ARIMA(1,0,1)(0,1,1)[7] intercept : AIC=651.772, Time=0.59 sec
ARIMA(1,0,1)(0,1,0)[7] intercept : AIC=776.733, Time=0.05 sec
ARIMA(1,0,1)(1,1,1)[7] intercept : AIC=653.021, Time=0.45 sec
ARIMA(1,0,1)(0,1,2)[7] intercept : AIC=653.193, Time=0.65 sec
ARIMA(1,0,1)(1,1,0)[7] intercept : AIC=699.023, Time=0.17 sec
ARIMA(1,0,1)(1,1,2)[7] intercept : AIC=654.398, Time=1.05 sec
ARIMA(1,0,0)(0,1,1)[7] intercept : AIC=650.405, Time=0.25 sec
ARIMA(1,0,0)(0,1,0)[7] intercept : AIC=774.850, Time=0.02 sec
ARIMA(1,0,0)(1,1,1)[7] intercept : AIC=inf, Time=0.28 sec
ARIMA(1,0,0)(0,1,2)[7] intercept : AIC=651.635, Time=0.44 sec
ARIMA(1,0,0)(1,1,2)[7] intercept : AIC=653.122, Time=0.71 sec
ARIMA(2,0,0)(0,1,1)[7] intercept : AIC=651.698, Time=0.23 sec
ARIMA(2,0,1)(0,1,1)[7] intercept : AIC=653.118, Time=0.48 sec
ARIMA(1,0,0)(0,1,1)[7] : AIC=649.900, Time=0.10 sec
ARIMA(1,0,0)(0,1,0)[7] : AIC=772.959, Time=0.00 sec
ARIMA(1,0,0)(1,1,1)[7] : AIC=651.492, Time=0.15 sec
ARIMA(1,0,0)(0,1,2)[7] : AIC=651.574, Time=0.22 sec
ARIMA(1,0,0)(1,1,0)[7] : AIC=695.239, Time=0.05 sec
ARIMA(1,0,0)(1,1,2)[7] : AIC=652.415, Time=0.47 sec
ARIMA(0,0,0)(0,1,1)[7] : AIC=692.853, Time=0.06 sec
ARIMA(2,0,0)(0,1,1)[7] : AIC=651.069, Time=0.17 sec
ARIMA(1,0,1)(0,1,1)[7] : AIC=651.142, Time=0.17 sec
ARIMA(0,0,1)(0,1,1)[7] : AIC=659.485, Time=0.11 sec
ARIMA(2,0,1)(0,1,1)[7] : AIC=652.497, Time=0.28 sec
Best model: ARIMA(1,0,0)(0,1,1)[7]
Total fit time: 9.315 seconds
Selecting the best model#
Given the previous results, in this section we select the best model, that is, the one with the lowest inner (cross-validation) error.
model_names = np.array(list(scores.keys()))
inner_scores_values = np.array(list(scores.values()))
best_model_CV = model_names[np.argmin(inner_scores_values)]
plt.figure(figsize=(7, 11))
ax = sns.scatterplot(x=inner_scores_values, y=model_names, color='blue', s=95)
ax = sns.scatterplot(x=np.min(inner_scores_values),
y=[best_model_CV], color='red', s=95)
plt.title(f'Model Selection - Daily Wind Speed (m/s) \n\n {n_splits} Fold CV - Test window {test_window} days', size=15, weight='bold')
ax.set_ylabel('Models', size=13)
ax.set_xlabel('MAE', size=11)
xmin = np.min(inner_scores_values)
xmax = np.max(inner_scores_values)
plt.xticks(np.round(np.linspace(xmin, xmax, 5), 3), fontsize=10)
plt.yticks(fontsize=12)
plt.show()
print('The best model according to cross validation is', best_model_CV)

The best model according to cross validation is SARIMA(p=2, q=1)
According to cross validation with a test window of 15 days, the best model for forecasting the wind speed in Jena is the SARIMA \(\small{(p=2, d=0, q=1)\times(P=0, D=0, Q=0)}_{s=0}\) model.
We can see again that the model specification has an important influence on the forecasting results.
Predictive visualization#
SARIMA \((p=2, d=0, q=1)\times(P=0, D=0, Q=0)_{s=0}\)
estimator = sarima[series][5]
model_name = str(sarima[series][5])
KFold_time_series_plot(n_cols=3, figsize=(20,15), estimator=estimator,
X=X_train_st[series], y=Y_train_st[series], n_splits=n_splits, test_window=test_window, score=mean_absolute_error, dates=dates,
true_color='blue', pred_color='red', marker='', markersize=4,
title=f"Forecasting Daily Avg. Wind Speed (m/s) - Forecast window of {forecast_window} days - {n_splits} Fold CV - MAE = {np.round(scores[model_name],3)}\n\n{model_name}",
hspace=0.35, wspace=0.1, subtitles_size=12, title_size=15, xticks_rotation=25,
bbox_to_anchor=(0.5,0.05))
Fold's size: 290. Train size: 275. Test size: 15

Linear regression with 3 lags
estimator = linear_regression
lag = 3
model_name = f'Linear Regression (lag={lag})'
KFold_time_series_plot(n_cols=3, figsize=(20,15), estimator=estimator, score=mean_absolute_error,
X=X_train_sk[series][lag], y=Y_train_sk[series][lag], n_splits=n_splits, test_window=test_window, dates=dates,
true_color='blue', pred_color='red', marker='', markersize=4,
title=f"Forecasting Daily Avg. Wind Speed (m/s) - Forecast window of {forecast_window} days - {n_splits} Fold CV - MAE = {np.round(scores[model_name],3)}\n\n{model_name}",
hspace=0.35, wspace=0.1, subtitles_size=12, title_size=15, xticks_rotation=25,
bbox_to_anchor=(0.5,0.05))
Fold's size: 290. Train size: 275. Test size: 15

Estimation of future performance#
We estimate the future performance of the best model according to KFold Cross Validation.
best_model_CV
'SARIMA(p=2, q=1)'
lag = 3
linear_regression.fit(X=X_train_sk[series][lag], y=Y_train_sk[series][lag])
Y_test_hat = linear_regression.forecast(window=test_window)
estimation_future_performance = mean_absolute_error(y_pred=Y_test_hat, y_true=Y_test_sk[series][lag])
estimation_future_performance
0.8715102370962747
Conclusions#
In this project we have looked for a good model for forecasting the temperature, humidity and wind speed of a specific region of Germany. We have done so focusing on statistical techniques, such as the SARIMA family of models, exponential smoothing, KNN and linear regression.
We have focused on manual SARIMA, that is, on specifying the parameters of this model by hand, exploring and analyzing the autocorrelations of the series as well as their trend and seasonality.
We have also implemented some code to apply the popular and well-developed machine learning library sklearn to time series. How powerful this can be will be shown in more detail in the next project. These developments are contained in PyTS.py and will be improved for the next project.
In addition, we have analyzed the two classic ways of comparing models in time series, simple validation and cross validation, and we have reached the conclusion that the latter is a much more reliable option.
In the end we have obtained three models for forecasting our variables. They are not bad, but they can be improved with a more exhaustive search based on ML models and techniques, as we will see in the next project. So, we consider this project a first step towards a better upcoming project (in terms of expected results).