From Forecast to Fable: Advanced Time Series Analysis in R
I watched Mitchell O’Hara-Wild’s talk on Fable, which provided a comprehensive examination of the evolution from the forecast package to Fable in R, highlighting the design decisions that were made to improve time series analysis.
Quantitative Focus on Design Decisions
- Package Evolution: The ts package, dating from the turn of the century, provided basic functions such as ARIMA and seasonal filters. The forecast package, introduced in 2003 by Rob Hyndman, built on this by adding the Exponential Smoothing State Space Model (ETS) and other methods.
- Transition to Fable: The Fable package, started in 2018, is the successor to forecast and is designed to work with the Tidyverse. The transition is driven by the complexity of modern time series data, which often involves many series observed at high frequency (e.g., every 30 minutes for electricity demand).
- Object Types: The forecast package works with ts objects, which are geared toward a single time series and require a conversion step that can introduce inefficiencies. Fable uses tsibbles, a time-aware extension of tibbles, so users can work with many series at once. For instance, consider the number of trips to Australia broken down by purpose (e.g., business, holiday, family). With forecast, representing these related series takes considerable manual work.
- Ease of Use and Accessibility: Unlike forecast, which handles one time series at a time, Fable operates directly on these tibble-like objects. Because they behave like data frames, familiar operations (filter, mutate, etc.) work as expected and require no additional learning.
- Distribution Handling: forecast traditionally provides point forecasts and prediction intervals. Fable goes further and produces an entire probability distribution for each forecast. This shows the potential variability in predicted outcomes, which is vital for businesses evaluating the risks tied to a forecast.
- Modeling Interface: The two packages differ in how models are specified and applied. Model functions in forecast, such as ets() or tslm(), have differing interfaces, which makes switching from one model type to another jarring. Fable takes a consolidated approach, so users specify and fit every model through one uniform interface.
- Actions on Outputs: Working with forecast outputs requires knowing the internal structure of the returned objects, and some components are confusing (e.g., terminology such as “bias adjust” obscuring what the stored mean forecast actually represents). Fable's outputs are designed to be more intuitive, so forecast distributions can be modified directly without re-running the models.
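To make the object-type and ease-of-use points concrete, here is a minimal sketch of one table holding several series. It assumes the tsibble and dplyr packages are installed; the trip counts and the names trips_ts, holiday, and holiday_ts are invented for illustration and are not from the talk.

```r
library(tsibble)
library(dplyr)

# Hypothetical quarterly trip counts for three travel purposes (made-up numbers).
trips_ts <- tsibble(
  quarter = rep(yearquarter("2015 Q1") + 0:7, times = 3),
  purpose = rep(c("Business", "Holiday", "Family"), each = 8),
  trips   = c(110, 120, 115, 130, 118, 128, 121, 137,
              300, 260, 240, 310, 315, 270, 250, 325,
              200, 190, 185, 210, 205, 196, 190, 216),
  key   = purpose,   # identifies each individual series
  index = quarter    # the time column
)

# Ordinary dplyr verbs work directly on the tsibble.
holiday <- trips_ts %>%
  filter(purpose == "Holiday") %>%
  mutate(trips_thousands = trips / 1000)

# For comparison, a base ts object built from one of the series:
# it carries the values and the time attributes, but no key column.
holiday_ts <- ts(holiday$trips, start = c(2015, 1), frequency = 4)
```

The key column is what lets a single object carry all three purposes, whereas the ts version needs a separate object (or a matrix layout) for each.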
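The uniform modeling interface and the distribution-valued forecasts described in the list might look roughly like this. It is a sketch on simulated data, assuming the fable, tsibble, and dplyr packages are installed; the data set, the column names, and the choice of models are illustrative assumptions rather than the examples used in the talk.

```r
library(fable)
library(tsibble)
library(dplyr)

set.seed(1)
# Simulated quarterly data for two series (illustrative only).
trips_ts <- tsibble(
  quarter = rep(yearquarter("2010 Q1") + 0:23, times = 2),
  purpose = rep(c("Business", "Holiday"), each = 24),
  trips   = c(100 + 1:24 + rnorm(24, sd = 3),
              300 + 20 * rep(c(1, -1, -2, 2), 6) + rnorm(24, sd = 5)),
  key = purpose, index = quarter
)

# One call fits every model to every series through the same interface.
fits <- trips_ts %>%
  model(
    ets = ETS(trips),
    lm  = TSLM(trips ~ trend() + season())
  )

# The forecast table stores a full distribution per row (column `trips`)
# alongside the point forecast (column `.mean`).
fc <- fits %>% forecast(h = 4)

# Intervals at any level are derived from the stored distributions,
# without refitting the models.
intervals <- fc %>% hilo(level = c(80, 95))
```

Because the distributions are kept in the forecast table, asking for a different interval level later is just another call to hilo() on the same object.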
Comparison in Numbers:
- If the trips to Australia are stored in ts objects, each of which holds a single series, you have to loop over the series manually. This becomes an operational burden, especially if you need to visualize three types of trips with three different models. In Fable, processing several series at once is a routine operation that follows tidy data principles, and results can be aggregated with simple Tidyverse commands.
- Accuracy Assessment: The accuracy function in forecast reports metrics one model at a time, so comparing models means repeating the call for each one. With Fable, measures for multiple models can be computed in a single call and returned in a familiar data structure that can be plotted with ggplot, which makes it easier to base decisions on comparative metrics.
- Extensibility: An important design principle highlighted in the talk was the concept of building blocks. Fable allows custom accuracy measures to be plugged into the existing framework, accommodating specific business needs or unique KPIs without re-engineering the package architecture.
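As a rough illustration of the accuracy and extensibility points, the sketch below scores two models in one call and adds a custom measure. It assumes the fable, tsibble, and dplyr packages are installed; the data are simulated, and max_abs_error is a hypothetical measure written for this example, relying on accuracy() passing the forecast errors to measure functions as an argument named .resid.

```r
library(fable)
library(tsibble)
library(dplyr)

set.seed(1)
# Simulated quarterly series with trend and seasonality (illustrative only).
sales <- tsibble(
  quarter = yearquarter("2010 Q1") + 0:31,
  value   = 200 + 2 * (1:32) + 15 * rep(c(1, -1, -2, 2), 8) + rnorm(32, sd = 4),
  index   = quarter
)

# Hold out the final four quarters for evaluation.
train <- sales %>% filter_index(~ "2016 Q4")

fc <- train %>%
  model(
    ets = ETS(value),
    lm  = TSLM(value ~ trend() + season())
  ) %>%
  forecast(h = 4)

# Hypothetical custom measure: the largest absolute forecast error.
max_abs_error <- function(.resid, ...) max(abs(.resid), na.rm = TRUE)

# One call returns one row per model, with built-in and custom measures
# side by side.
acc <- accuracy(
  fc, sales,
  measures = list(RMSE = RMSE, MAE = MAE, max_ae = max_abs_error)
)
```

The result is an ordinary tibble, so it can be sorted, filtered, or handed to ggplot2 like any other data frame.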
Key Design Takeaways:
- Leverage user familiarity with Tidyverse: Build on existing knowledge.
- Ensure consistent interfaces that facilitate easy comparisons.
- Use descriptive terminology to avoid confusion—metrics should clearly represent their purpose.
- Build with extensibility in mind; enable users to create custom functions.
Such considerations reshape statistical modeling in R, making Fable not merely a package but a significant evolution in how practitioners can engage in sophisticated time series analysis more intuitively and effectively.