Curated Data Science by Rahul

From Forecast to Fable: Advanced Time Series Analysis in R

I watched Mitchell O’Hara-Wild’s talk on Fable, which traces the evolution from the forecast package to Fable in R and highlights the design decisions made to improve time series analysis.

Quantitative Focus on Design Decisions

  1. Package Evolution: Around the turn of the century, the ts package provided basic functionality such as ARIMA models and seasonal filters. The forecast package, introduced in 2003 by Rob Hyndman, built on this by adding exponential smoothing state space models (ETS) and other methods.

  2. Transition to Fable: The Fable package, initiated in 2018, is a successor to forecast designed to work well with the Tidyverse. The transition is driven by the complexity of modern time series data, which often involves many series observed at high frequency (e.g., electricity demand recorded every 30 minutes).

  3. Object Types: The forecast package works with ts objects, which are geared toward a single time series and require a conversion step that can introduce inefficiencies. Fable works with tsibbles (time-series-aware tibbles from the tsibble package), which let users handle many series in one table. For instance, consider the number of trips within Australia categorized by purpose (e.g., business, holiday, family). With forecast, representing this collection of series takes considerable manual work.

  4. Ease of Use and Accessibility: Unlike forecast, which handles only one time series at a time, Fable operates directly on tibble-like tables. Because a tibble is structured like a data frame, it supports familiar operations (filter, mutate, etc.) with no additional learning curve.

  5. Distribution Handling: forecast traditionally returns point forecasts and prediction intervals. Fable goes further and produces an entire probability distribution for each forecast. This makes it possible to quantify the variability in predicted outcomes, which is vital for businesses evaluating the risks tied to a forecast.

  6. Modeling Interface: The two packages differ in how models are specified and applied. Models in the forecast package, such as ets or tslm, each have their own interface, which makes switching from one model type to another jarring. Fable takes a consolidated approach: users specify and fit all models through one uniform interface.

  7. Actions on Outputs: Working with forecast output requires knowing the internal structure of its objects, and some components can be confusing (e.g., what the “mean” forecast actually represents can depend on a “bias adjust” option). Fable’s outputs, by contrast, are designed to be intuitive and allow forecast distributions to be modified directly without re-running the models.
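To make the points about object types, the uniform modeling interface, and forecast distributions more concrete, here is a minimal sketch in R. It is an illustration under stated assumptions rather than code from the talk: it uses the `tourism` example dataset shipped with the tsibble package as a stand-in for the trips-by-purpose example, and the two models and the 8-quarter horizon are arbitrary choices. The native pipe `|>` needs R 4.1 or later.

```r
library(fable)    # model definitions such as ETS() and TSLM(); attaches fabletools
library(tsibble)  # the tsibble data structure and the `tourism` example data
library(dplyr)    # group_by() / summarise()

# Assumption: tsibble::tourism stands in for the trips-by-purpose example.
# Aggregate to one quarterly series per travel purpose; the result is still
# a tsibble, with Purpose as the key and Quarter as the time index.
trips <- tourism |>
  group_by(Purpose) |>
  summarise(Trips = sum(Trips))

# One call fits every model to every series through the same interface.
fits <- trips |>
  model(
    ets  = ETS(Trips),
    tslm = TSLM(Trips ~ trend() + season())
  )

# Forecast 8 quarters ahead. The Trips column of the result holds a full
# forecast distribution per row; .mean holds the point forecast.
fc <- fits |> forecast(h = 8)

# Intervals are derived from the stored distributions, so the models do
# not need to be re-fitted to get different coverage levels.
intervals <- fc |> hilo(level = c(80, 95))
print(intervals)
```

Because `trips`, `fits`, and `fc` are all table-like objects, the usual dplyr verbs (for example `filter(Purpose == "Holiday")`) can be applied at any stage.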

Comparison in Numbers:

  1. Accuracy Assessment: The accuracy function in forecast reports metrics one model at a time, so comparing models means repeating the calculation for each of them. With Fable, one can compute measures for multiple models in a single call and get the results back in a familiar data structure that can be plotted with ggplot2, which makes comparing models easier.

  2. Extensibility: An important design principle highlighted in the talk is the idea of building blocks. Fable allows custom accuracy measures to be plugged into the existing framework, accommodating specific business needs or unique KPIs without re-engineering the package architecture.
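The accuracy and extensibility points can be sketched in R as follows. This is a hedged illustration, not code from the talk: the dataset (`tsibble::tourism`), the two models, and the custom measure `max_abs_error` are all assumptions made for the example. The custom function follows the same shape as the built-in point accuracy measures, taking the residuals as `.resid` and ignoring any other arguments through `...`.

```r
library(fable)    # attaches fabletools, which provides accuracy(), RMSE, MAE
library(tsibble)  # the `tourism` example data
library(dplyr)

# Assumption: same illustrative setup as before, one series per purpose.
trips <- tourism |>
  group_by(Purpose) |>
  summarise(Trips = sum(Trips))

fits <- trips |>
  model(
    ets  = ETS(Trips),
    tslm = TSLM(Trips ~ trend() + season())
  )

# Hypothetical custom KPI: the largest absolute error. It accepts `...`
# because accuracy() passes several other inputs to every measure.
max_abs_error <- function(.resid, na.rm = TRUE, ...) {
  max(abs(.resid), na.rm = na.rm)
}

# Training-set accuracy for every model and every series in one call;
# built-in and custom measures are supplied together as a named list.
acc <- fits |>
  accuracy(measures = list(RMSE = RMSE, MAE = MAE, MaxAE = max_abs_error))

print(acc)
```

The result is an ordinary tibble with one row per series and model, so it can be sorted, filtered, or passed to a plotting function like any other data frame.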

Key Design Takeaways:

Together, these design decisions make Fable more than just another package: it is a significant step forward in how practitioners can carry out sophisticated time series analysis in R, more intuitively and effectively.