Hierarchical Forecasting in Python

In complex datasets, forecasts at detailed levels (e.g., regions, products) should align with higher-level forecasts (e.g., countries, categories). Inconsistent forecasts can lead to poor decisions.

Hierarchical forecasting reconciles forecasts so that they are consistent across all levels: the forecasts at lower levels add up to the forecasts at the levels above them.
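To make "consistent" concrete, here is a minimal sketch with made-up numbers (the series names and values are purely illustrative, not taken from any dataset). It measures how far a parent forecast is from the sum of its children:

```python
# Made-up base forecasts for one period: a total and its two children.
base_forecasts = {"Total": 100.0, "Total/A": 60.0, "Total/B": 45.0}


def coherence_gap(forecasts, parent, children):
    """Return the parent forecast minus the sum of its children (0 means coherent)."""
    return forecasts[parent] - sum(forecasts[child] for child in children)


gap = coherence_gap(base_forecasts, "Total", ["Total/A", "Total/B"])
print(gap)  # -5.0: the children add up to 105, not 100
```

A non-zero gap like this is what reconciliation removes.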

HierarchicalForecast from Nixtla is an open-source library that provides tools and methods for creating and reconciling hierarchical forecasts.

For illustrative purposes, consider a quarterly tourism dataset with the following columns:

  • Country: The country where the trips occurred.
  • State: The state within the country.
  • Region: The region within the state.
  • Purpose: The purpose of the trip (e.g., Business, Holiday).
  • ds: The quarter, stored as a date.
  • y: The number of trips.

import numpy as np
import pandas as pd

Y_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/tourism.csv')
Y_df = Y_df.rename({'Trips': 'y', 'Quarter': 'ds'}, axis=1)
Y_df.insert(0, 'Country', 'Australia')
Y_df = Y_df[['Country', 'State', 'Region', 'Purpose', 'ds', 'y']]
Y_df['ds'] = Y_df['ds'].str.replace(r'(\d+) (Q\d)', r'\1-\2', regex=True)
Y_df['ds'] = pd.to_datetime(Y_df['ds'])
Y_df.head()
Country    State            Region    Purpose   ds          y
Australia  South Australia  Adelaide  Business  1998-01-01  135.077690
Australia  South Australia  Adelaide  Business  1998-04-01  109.987316
Australia  South Australia  Adelaide  Business  1998-07-01  166.034687
Australia  South Australia  Adelaide  Business  1998-10-01  127.160464
Australia  South Australia  Adelaide  Business  1999-01-01  137.448533

The dataset can be grouped in the following non-strictly hierarchical structure:

  • Country
  • Country, State
  • Country, Purpose
  • Country, State, Region
  • Country, State, Purpose
  • Country, State, Region, Purpose
spec = [
    ['Country'],
    ['Country', 'State'], 
    ['Country', 'Purpose'], 
    ['Country', 'State', 'Region'], 
    ['Country', 'State', 'Purpose'], 
    ['Country', 'State', 'Region', 'Purpose']
]

Using the aggregate function from HierarchicalForecast, we can build the time series for every level in spec.

from hierarchicalforecast.utils import aggregate

Y_df, S_df, tags = aggregate(Y_df, spec)
Y_df = Y_df.reset_index()
Y_df.sample(10)
       unique_id                                              ds          y
12251  Australia/New South Wales/Outback NSW/Business         2000-10-01
33131  Australia/Western Australia/Australia’s North          2000-10-01
22034  Australia/South Australia/Fleurieu Peninsula/Other     2006-07-01
31119  Australia/Victoria/Phillip Island/Visiting             2017-10-01
 7671  Australia/New South Wales/Other                        2015-10-01
18339  Australia/Queensland/Mackay/Business                   2002-10-01
23043  Australia/South Australia/Limestone Coast/Visiting     1998-10-01
22129  Australia/South Australia/Fleurieu Peninsula/Visiting  2010-04-01
11349  Australia/New South Wales/Hunter/Business              2015-04-01
16599  Australia/Queensland/Brisbane/Other                    2007-10-01
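To build intuition for what the aggregation step produces, here is a rough sketch on made-up data using plain pandas. The helper aggregate_sketch and the toy values are hypothetical; this is not the library's implementation, and it skips the summing matrix and tags that aggregate also returns.

```python
import pandas as pd

# Toy bottom-level data (made-up values), one row per series and date.
toy = pd.DataFrame({
    "Country": ["Australia"] * 4,
    "State": ["ACT", "ACT", "Victoria", "Victoria"],
    "Purpose": ["Business", "Holiday", "Business", "Holiday"],
    "ds": pd.to_datetime(["1998-01-01"] * 4),
    "y": [10.0, 20.0, 30.0, 40.0],
})

toy_spec = [["Country"], ["Country", "State"], ["Country", "State", "Purpose"]]


def aggregate_sketch(df, spec):
    """Sum y for every grouping in spec and label each series with a '/'-joined id."""
    frames = []
    for level in spec:
        grouped = df.groupby(level + ["ds"], as_index=False)["y"].sum()
        # Join the grouping columns into ids such as 'Australia/ACT/Business'.
        grouped["unique_id"] = grouped[level].apply(lambda row: "/".join(row), axis=1)
        frames.append(grouped[["unique_id", "ds", "y"]])
    return pd.concat(frames, ignore_index=True)


toy_long = aggregate_sketch(toy, toy_spec)
totals = toy_long.set_index("unique_id")["y"]
print(totals)
```

Each grouping in the spec contributes its own set of series, so the long table holds one series per node of the hierarchy (seven in this toy case).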

Get all the distinct ‘Country/Purpose’ combinations present in the dataset:

tags['Country/Purpose']
array(['Australia/Business', 'Australia/Holiday', 'Australia/Other',
       'Australia/Visiting'], dtype=object)

We use the final two years (8 quarters) of each series as the test set.

Y_test_df = Y_df.groupby('unique_id').tail(8)
Y_train_df = Y_df.drop(Y_test_df.index)

Y_test_df = Y_test_df.set_index('unique_id')
Y_train_df = Y_train_df.set_index('unique_id')

Y_train_df.groupby('unique_id').size()
unique_id                                              count
Australia                                              72
Australia/ACT                                          72
Australia/ACT/Business                                 72
Australia/ACT/Canberra                                 72
Australia/ACT/Canberra/Business                        72
...
Australia/Western Australia/Experience Perth/Other     72
Australia/Western Australia/Experience Perth/Visiting  72
Australia/Western Australia/Holiday                    72
Australia/Western Australia/Other                      72
Australia/Western Australia/Visiting                   72
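The split above relies on groupby('unique_id').tail(...) taking the last rows of each series, and on drop removing those rows by index label. A minimal sketch on made-up data (two hypothetical series with five quarters each, holding out the last two) shows the effect:

```python
import pandas as pd

# Two made-up series, already sorted by date within each unique_id.
toy = pd.DataFrame({
    "unique_id": ["A"] * 5 + ["B"] * 5,
    "ds": list(pd.date_range("2020-01-01", periods=5, freq="QS")) * 2,
    "y": range(10),
})

# Hold out the last 2 observations of every series.
test = toy.groupby("unique_id").tail(2)
# Dropping by index label only works because the index is unique here.
train = toy.drop(test.index)

print(train.groupby("unique_id").size())  # 3 rows per series
print(test.groupby("unique_id").size())   # 2 rows per series
```

Because tail takes rows by position, the data has to be sorted by ds within each series for the held-out rows to be the most recent ones.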

The following code generates base forecasts for each time series in Y_train_df using the ETS model. The forecasts and fitted values are stored in Y_hat_df and Y_fitted_df, respectively.

%%capture
from statsforecast.models import ETS
from statsforecast.core import StatsForecast

fcst = StatsForecast(
    df=Y_train_df,
    models=[ETS(season_length=4, model='ZZA')],
    freq='QS',
    n_jobs=-1,
)
Y_hat_df = fcst.forecast(h=8, fitted=True)
Y_fitted_df = fcst.forecast_fitted_values()

The forecasts in Y_hat_df are not coherent: forecasts at detailed levels (e.g., by State, Region, Purpose) may not add up to those at higher levels (e.g., by Country, State, Purpose). To make them coherent, we use the HierarchicalReconciliation class with the BottomUp approach.

from hierarchicalforecast.methods import BottomUp
from hierarchicalforecast.core import HierarchicalReconciliation

reconcilers = [BottomUp()]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df, Y_df=Y_fitted_df, S=S_df, tags=tags)

The dataframe Y_rec_df contains the reconciled forecasts.

Y_rec_df.head()
unique_id  ds          ETS           ETS/BottomUp
Australia  2016-01-01  25990.068359  24380.257812
Australia  2016-04-01  24458.490234  22902.765625
Australia  2016-07-01  23974.056641  22412.982422
Australia  2016-10-01  24563.455078  23127.439453
Australia  2017-01-01  25990.068359  24516.759766

What is the Bottom-Up Approach?

The bottom-up approach is a method where forecasts are initially created at the most granular level of a hierarchy and then aggregated up to higher levels. This approach ensures that detailed trends at lower levels are captured and accurately reflected in higher-level forecasts. It contrasts with top-down methods, which start with aggregate forecasts and distribute them downwards.
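In matrix form, bottom-up reconciliation can be sketched as multiplying the bottom-level forecasts by a summing matrix. The example below is a minimal illustration with NumPy; the three-series hierarchy and the numbers (including the base forecast of 700 for the total) are assumptions made up for this sketch, not output from the library.

```python
import numpy as np

# Hypothetical hierarchy: Country -> {Business, Leisure}.
# Rows are [Country, Business, Leisure]; columns are the bottom series [Business, Leisure].
S = np.array([
    [1, 1],  # Country = Business + Leisure
    [1, 0],  # Business
    [0, 1],  # Leisure
])

# Base forecasts for every series, in the same row order as S (made-up numbers).
y_hat = np.array([700.0, 390.0, 330.0])

# Bottom-up keeps only the bottom-level base forecasts (the last two rows here)...
n_bottom = S.shape[1]
y_bottom = y_hat[-n_bottom:]

# ...and rebuilds every level by summing them through S.
y_rec = S @ y_bottom
print(y_rec)  # [720. 390. 330.]
```

The base forecast for the total (700) is discarded and replaced by the sum of the bottom-level forecasts (720), which is why a reconciled top-level value can differ from its base forecast.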

Steps in the Bottom-Up Approach

Forecast at the Lowest Level

First, forecasts are created at the most detailed level: Country, State, Region, Purpose. For example, the forecast for the next date might look like this:

Country  State  Region  Purpose   ds          y_forecast
USA      NY     East    Business  2023-01-02  105
USA      NY     East    Leisure   2023-01-02  85
USA      NJ     East    Business  2023-01-02  95
USA      NJ     East    Leisure   2023-01-02  75
USA      CA     West    Business  2023-01-02  125
USA      CA     West    Leisure   2023-01-02  115
USA      NV     West    Business  2023-01-02  65
USA      NV     West    Leisure   2023-01-02  55

Country, State, Purpose

Sum the forecasts for each Country, State, Purpose combination.

Country  State  Purpose   ds          y_forecast
USA      NY     Business  2023-01-02  105
USA      NY     Leisure   2023-01-02  85
USA      NJ     Business  2023-01-02  95
USA      NJ     Leisure   2023-01-02  75
USA      CA     Business  2023-01-02  125
USA      CA     Leisure   2023-01-02  115
USA      NV     Business  2023-01-02  65
USA      NV     Leisure   2023-01-02  55

Country, State, Region

Sum the forecasts for each Country, State, Region combination.

Country  State  Region  ds          y_forecast
USA      NY     East    2023-01-02  190
USA      NJ     East    2023-01-02  170
USA      CA     West    2023-01-02  240
USA      NV     West    2023-01-02  120

Country, Purpose

Sum the forecasts for each Country, Purpose combination.

Country  Purpose   ds          y_forecast
USA      Business  2023-01-02  390
USA      Leisure   2023-01-02  330

Country

Sum the forecasts for the entire Country.

Country  ds          y_forecast
USA      2023-01-02  720
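
The roll-up above can be reproduced with a few pandas groupby calls. This is a small sketch that uses the illustrative bottom-level numbers from the first table of this walkthrough; the variable names are chosen for this example only.

```python
import pandas as pd

# Bottom-level forecasts for 2023-01-02, copied from the walkthrough's first table.
bottom = pd.DataFrame(
    [
        ("USA", "NY", "East", "Business", 105),
        ("USA", "NY", "East", "Leisure", 85),
        ("USA", "NJ", "East", "Business", 95),
        ("USA", "NJ", "East", "Leisure", 75),
        ("USA", "CA", "West", "Business", 125),
        ("USA", "CA", "West", "Leisure", 115),
        ("USA", "NV", "West", "Business", 65),
        ("USA", "NV", "West", "Leisure", 55),
    ],
    columns=["Country", "State", "Region", "Purpose", "y_forecast"],
)

# Each higher level is the sum of the bottom-level forecasts it contains.
levels = {
    "Country/State/Purpose": ["Country", "State", "Purpose"],
    "Country/State/Region": ["Country", "State", "Region"],
    "Country/Purpose": ["Country", "Purpose"],
    "Country": ["Country"],
}
rolled_up = {
    name: bottom.groupby(cols)["y_forecast"].sum() for name, cols in levels.items()
}

print(rolled_up["Country/Purpose"])
print(rolled_up["Country"])
```

Since every level is computed from the same bottom-level numbers, the results are coherent by construction.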