Real-time nowcasting and forecasting give decision makers the best available information to guide strategy, whether that strategy informs investment portfolios or public policy. The following example of a GDP nowcast (a prediction of what GDP will be this quarter) uses the most recent data published by the U.S. Federal Reserve and is updated at a weekly frequency. In addition to predictions, you can find the contribution of each data type to forecast revisions. Notably, all of the software we use, from statistical modeling in R to our web interface built with Chart.js, is open source under the MIT, GPL, or a similar license.

Quarterly GDP predictions including fitted historical values for evaluating model performance. GDP is measured here in seasonally adjusted annualized growth rates.

Update contribution by data type to the current quarter nowcast; see methodology for more detail. Note that we only display updates for data that have already been realized this quarter; updates for data types not yet realized are zero.

Update contribution by data type to the previous quarter nowcast; see methodology for more detail.

Our nowcasts and forecasts are derived from Bayesian statistical models estimated by simulation. These models begin with a forecast for next-period values, for example GDP growth next quarter. Then, as more data are released throughout the quarter, the forecast is updated. These updates are what we show in the Current Quarter Analysis and Previous Quarter Analysis tabs above. In the case of GDP, updates are processed weekly when key weekly statistics such as initial unemployment claims are released. This updating process uses the fact that, for a multivariate normal distribution, the distribution of one set of variables (our forecast) conditional on another observed set of variables (the data used for updating) is also multivariate normal. This is the result used to derive the Kalman filter (one of the key forecasting tools we use), though it can be applied more generally. You can find all the gritty technical details in Practical Implementation of Factor Models.
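For reference, the conditioning result mentioned above can be written out as follows. The symbols are notation chosen here for illustration (x for the quantities being forecast, y for the newly observed data); they are not taken from the methodology document.

$$
\begin{pmatrix} x \\ y \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix}, \begin{pmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{pmatrix} \right)
\quad\Longrightarrow\quad
x \mid y \sim N\!\left( \mu_x + \Sigma_{xy}\Sigma_{yy}^{-1}(y - \mu_y),\; \Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx} \right)
$$

The term $\Sigma_{xy}\Sigma_{yy}^{-1}(y - \mu_y)$ is the forecast update: the surprise in the new data, $y - \mu_y$, scaled by how strongly the data co-move with the forecast target. The conditional covariance is never larger than $\Sigma_{xx}$, so observing data cannot increase forecast uncertainty in this setting.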

This conditionally linear-Gaussian framework has both advantages and drawbacks compared to alternative methods such as artificial neural networks. The main drawback is that, conditional on the state of the system, the model is linear-Gaussian. Note, however, that the assumptions of linearity and of normally distributed disturbances are both conditional: the data themselves need not be normally distributed. Even in the simplest state space framework (using the predict-then-update process described above), the (linear) contribution of each series used to update forecasts depends on the order in which we introduce the data. That is, the model is only linear given the current state of the economy. The advantages of this conditionally linear-Gaussian framework include the following: restricting the functional space of the model requires less training data and reduces computation time, and the framework is particularly adept at handling multiple time series; we always know what is driving results; we can access the full posterior distribution for predictions (even when mixing models); and models can be rigorously backtested to give a good idea of how they may perform going forward.
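The order dependence of update contributions can be seen in a small numerical sketch. This is not our production code: the three variables, the prior mean and covariance, and the observed values are all made up for illustration, and the helper names condition_on and sequential_contributions are hypothetical. It conditions a joint Gaussian on one observed series at a time and records how much each observation moves the GDP nowcast.

```python
# Illustrative sketch only: all numbers and names below are made up.

def condition_on(mean, cov, j, value):
    """Condition a joint Gaussian on variable j taking the observed value.

    Returns the updated mean and covariance for all variables; row and
    column j of the covariance become zero because j is now known.
    """
    n = len(mean)
    surprise = value - mean[j]
    gain = [cov[i][j] / cov[j][j] for i in range(n)]
    new_mean = [mean[i] + gain[i] * surprise for i in range(n)]
    new_cov = [[cov[i][k] - gain[i] * cov[j][k] for k in range(n)]
               for i in range(n)]
    return new_mean, new_cov


def sequential_contributions(mean, cov, observations, target=0):
    """Apply observations in order; record each one's shift of the target mean."""
    contributions = {}
    for name, j, value in observations:
        before = mean[target]
        mean, cov = condition_on(mean, cov, j, value)
        contributions[name] = mean[target] - before
    return mean[target], contributions


# Variable order: [GDP growth, claims, production] (hypothetical, standardized).
prior_mean = [2.0, 0.0, 0.0]
prior_cov = [[1.0, -0.5, 0.6],
             [-0.5, 1.0, -0.4],
             [0.6, -0.4, 1.0]]
claims = ("claims", 1, 1.0)
production = ("production", 2, -0.5)

nowcast_a, contrib_a = sequential_contributions(
    prior_mean, prior_cov, [claims, production])
nowcast_b, contrib_b = sequential_contributions(
    prior_mean, prior_cov, [production, claims])

print("claims first:    ", round(nowcast_a, 4), contrib_a)
print("production first:", round(nowcast_b, 4), contrib_b)
```

Both orderings arrive at the same final nowcast, but the split between the two series differs: whichever series arrives first also captures the part of the news the two series share.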

Backtesting allows us to assess how our model might perform in the future by looking at how it would have performed in the past, given only the information that was available at the time. This is the most computationally intensive part of our modeling process, as it requires re-estimating the model at every testing date, but it is also one of the most important. Getting a good in-sample fit is easy: depending on the model, one can simply add parameters until the fit is perfect. What is important from a decision-making point of view is how the model will perform in the future.
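The re-estimation loop can be sketched as below. To keep the example short, a least-squares AR(1) model stands in for our Bayesian models, and the growth figures are invented; the function names fit_ar1 and backtest are hypothetical. The point is only the structure: at each test date the model sees nothing later than that date.

```python
# Illustrative sketch: an AR(1) stand-in model and made-up data.

def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1]; returns (a, b)."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    a = mean_y - b * mean_x
    return a, b


def backtest(series, first_test_index):
    """Expanding-window backtest of one-step-ahead forecasts.

    At each test date t the model is re-estimated on series[:t] only, so
    the forecast of series[t] never uses data that was not yet available.
    """
    errors = []
    for t in range(first_test_index, len(series)):
        history = series[:t]
        a, b = fit_ar1(history)
        forecast = a + b * history[-1]
        errors.append(series[t] - forecast)
    rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
    return errors, rmse


# Hypothetical quarterly growth rates (percent, annualized).
growth = [2.1, 2.5, 1.8, 3.0, 2.2, 1.5, 2.8, 3.1, 2.4, 1.9, 2.6, 2.3]
errors, rmse = backtest(growth, first_test_index=6)
print("out-of-sample errors:", [round(e, 2) for e in errors])
print("out-of-sample RMSE:", round(rmse, 3))
```

A faithful backtest of macroeconomic data would also use the data vintages that were published at each date, since many series are revised after their first release; the sketch ignores revisions.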

Because all of our models are Bayesian, we can calculate the full posterior distribution for predictive statistics. Thus, not only do we get a forecast, we also get a clear picture of the uncertainty around predicted values.
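As a rough illustration of what simulation output makes possible, the sketch below summarizes a set of predictive draws. The draws here are generated from a normal distribution purely as a placeholder for real sampler output, and the function name summarize_draws is hypothetical.

```python
import random
import statistics


def summarize_draws(draws):
    """Mean, 90% equal-tailed interval, and share of draws below zero."""
    cuts = statistics.quantiles(draws, n=20)  # cut points at 5%, 10%, ..., 95%
    lower, upper = cuts[0], cuts[-1]
    prob_negative = sum(1 for d in draws if d < 0) / len(draws)
    return statistics.fmean(draws), lower, upper, prob_negative


# Placeholder for posterior predictive draws of GDP growth (made-up numbers).
rng = random.Random(0)
draws = [rng.gauss(2.0, 0.8) for _ in range(5000)]

mean, lower, upper, prob_negative = summarize_draws(draws)
print(f"nowcast: {mean:.2f}, 90% interval: [{lower:.2f}, {upper:.2f}]")
print(f"share of draws with negative growth: {prob_negative:.3f}")
```

Any other statistic of interest, such as the probability that growth falls below a given threshold, can be read off the same draws without further modeling.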

Even if we are estimating unobserved factors, our modeling assumptions mean we always know what is driving the results. This allows for greater insight into the model: Is a GDP prediction driven by real or nominal factors? Is production down or are inventories stacking up? Additionally, if a prediction looks off, we will know why.

Restricting the functional space of the model greatly reduces the time it takes to get results. Additionally, while we use R to manage data, we use C++ for actual computation. This means that the time needed to get results is measured in seconds, not days.

These examples are run on Amazon Web Services EC2 and stored in an S3 bucket; AWS is the only subscription service in the whole process from accessing data (via the St. Louis Fed’s FRED API) to plotting nowcasts (using Chart.js). The software used in this example includes:

R is the premier statistical programming language. Using R means you are never tied to proprietary software, and there is a large and competitive market for programmers to help maintain your framework. R has a huge library of packages, and many of these are particularly well suited to data management (we use data.table). For estimation routines, however, we like to write our own scripts for accountability, flexibility, and to minimize the number of dependencies our routines need.

C++ is a compiled language: writing code is more time consuming, but once compiled it is very fast and efficient. This speed and efficiency is the reason our statistical routines are all written in C++; we are able to handle bigger models faster than we could in more user-friendly programming languages. Plus, thanks to the Rcpp R package, it interfaces beautifully with R.

Armadillo is a linear algebra library for C++. Its syntax is user friendly and easy to follow without compromising computing power. It is also R friendly thanks to the package RcppArmadillo. Eigen is an alternative linear algebra library for C++ which also has attractive features; either is a good choice.

Perhaps surprisingly, JavaScript charting software is the one place where things can get expensive quickly for commercial use. Chart.js offers an MIT License solution. While not as flexible as some of the commercial software, Chart.js is simple, elegant, and user friendly, with plenty of examples on GitHub. You can judge the results here for yourself!

seth@srlanalytics.com

+1 (603) 229-2095