
Wednesday, July 10, 2019

Recession Probability Forecasting Models

This article discusses a wide class of models: models that attempt to offer a recession probability estimate based on variables that are not just aggregate activity variables. Models whose inputs are aggregate economic variables were the subject of a previous article and can be viewed as offering an alternative recession definition. They are therefore essentially coincident indicators of recessions, although they might offer a recession diagnosis earlier than "official" recession determinations are made. Instead, the models of interest here are those that use variables that are believed to have some leading information, and so can offer a recession forecast ahead of the actual recession start.

(Note: This is an unedited excerpt of a section of a manuscript of the first volume of a book on recessions. It is an expanded version of a previously published article on my website.)

The comments here are relatively generic. The reason is that the yield curve – technically, slopes between different tenors within the overall yield curve – is typically a dominant input to these models. As a result, any discussion of these models in practice entails a deep focus on the details of yield curve behaviour. As an ex-fixed income analyst, I find that subject of intense interest. The discussion of the yield curve will take an entire chapter in the second volume of this text. By deferring the discussion of the yield curve in this fashion, it is possible to give an overview of the basic structure of such models without being bogged down in a lengthy technical discussion.

Model Inputs

If we build a model that is solely based on aggregate economic activity variables, it is using the exact same series that are normally used for recession dating. It might have some advantages over the "official" recession-dating procedure, but it is not adding new sources of information. The models discussed here also include variables that are believed to carry leading information. Some main examples include the following.
  • Surveys, like those of purchasing managers.
  • Financial market data, particularly the previously mentioned yield curves. Other popular choices include stock markets and credit spreads. (The stock market’s mixed predictive power for recessions was best captured by the famous quote attributed to Paul Samuelson: “The stock market has predicted nine of the past five recessions.” [i])
  • Credit aggregate growth. (I used to work at a firm called BCA Research, where BCA originally stood for “Bank Credit Analyst.” This name was chosen in the early post-war period as the analysis of credit growth was a key concern.)
  • Data from particular sectors that are believed to lead the economy, such as residential construction. This could possibly include economic aggregates for selected regions.
  • Aggregated indicators that are based on a wide selection of series, such as a principal component analysis (PCA) output, or traditional leading indicators. These are constructed from a mix of activity/leading variables.

Putting Variables Together

The first thing to note is that we are interested in giving an assessment of recession risk. This could potentially be done in two ways: create some form of a probability model, or have an indicator that offers a forecast for aggregate real GDP growth. The second implicitly gives a recession risk forecast, if we assume that declining real GDP translates into a recession (e.g., the two-consecutive-declining-quarters definition of a recession). However, accurately forecasting a recession event seems in principle to be simpler than forecasting real GDP growth, so we will assume that we are working with recession probabilities.

The next technical issue is the notion of "recession probability" estimates. One could have a model that generates explicit recession probabilities, like a probit model. However, if we look at financial market commentary, we see that practitioners use a looser definition. A typical assessment one encounters in commentary will often resemble the following: "The last {N} times {a condition} happened, a recession followed within six months."

This is obviously less academic sounding than a probit model, and looks less mathematically robust (what are the odds that this time is different?). However, I see little value in being too much of an academic purist about this point: we just have some non-quantifiable uncertainty to deal with, which is a factor in real world decision-making. (Keynes wrote A Treatise on Probability, and post-Keynesians have long discussed this issue. This controversy is a considerable detour, so I will not pursue the matter here.)
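
To make this concrete, the looser style of assessment can at least be tabulated against the data. The sketch below (Python, using pandas) is a minimal illustration only; the inputs `condition` and `in_recession` are hypothetical boolean monthly series that the reader would need to construct.

```python
# Minimal sketch: tabulate "the last N times {condition} happened, a recession
# followed within six months." Both inputs are assumed boolean Series on the
# same monthly DatetimeIndex. Consecutive trigger months are counted
# separately, which is crude but keeps the sketch short.
import pandas as pd

def recessions_following_triggers(condition: pd.Series, in_recession: pd.Series,
                                  window: int = 6):
    """Return (number of trigger dates, number followed by a recession
    within `window` months)."""
    trigger_dates = condition[condition].index
    hits = 0
    for date in trigger_dates:
        # Look at the `window` observations after the trigger date.
        future = in_recession.loc[date:].iloc[1:window + 1]
        if future.any():
            hits += 1
    return len(trigger_dates), hits

# Hypothetical usage:
# n, k = recessions_following_triggers(curve_inverted, nber_recession, window=6)
# print(f"{k} of the last {n} triggers were followed by a recession within 6 months")
```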

We can generate these implicit recession probabilities by either:
  • directly relating the raw series to a recession indicator variable (e.g., a probit model; a minimal sketch follows this list), or
  • creating an aggregate indicator, and relating some events associated with that indicator to recession events. For example, if the indicator drops below a threshold, a recession is predicted.
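
As a concrete illustration of the first option, the sketch below fits a probit model in Python using statsmodels. The variable names (`slope` for the 2-/10-year slope, `recession` for a 0/1 NBER indicator, both monthly) and the 12-month lead are assumptions for illustration, not a recommended specification.

```python
# Minimal probit sketch: relate the current yield curve slope to whether the
# economy is in recession `lead` months later. Inputs are assumed monthly
# Series on the same index; `recession` takes values 0 or 1.
import pandas as pd
import statsmodels.api as sm

def probit_recession_probability(slope: pd.Series, recession: pd.Series,
                                 lead: int = 12) -> pd.Series:
    X = sm.add_constant(slope.rename("slope"))
    target = recession.shift(-lead).rename("target")  # recession status `lead` months ahead
    data = pd.concat([X, target], axis=1).dropna()
    result = sm.Probit(data["target"], data[["const", "slope"]]).fit(disp=0)
    # Fitted probabilities of being in recession `lead` months ahead.
    return result.predict(data[["const", "slope"]])
```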
As for aggregation, there are a few methods. Standard choices include a principal component analysis, or the more ad hoc method of creating a weighted average of “normalised” variables. I will outline these in turn.

Principal Component Analysis

Figure: Chicago Fed National Activity Indicator

The figure above shows one of the more well-known PCA models, the Chicago Fed National Activity Index (CFNAI), from the Federal Reserve Bank of Chicago. The 3-month average is shown, which is the usual method of presentation. As can be seen, it dips around the time of recessions. One could attempt to infer a "recession threshold" based on these historical episodes.

The mathematics behind principal component analysis is somewhat advanced. Since it is a key part of yield curve analysis, I will defer a technical description until that chapter. However, I would caution against confusing this mathematical sophistication with model accuracy. My opinion is that graduate schools spend too much time on the quantifiable statistical properties of PCA, and not enough on the more fundamental problem of model risk.[ii] Any statistical procedure is based on an implicit mathematical model of the process generating the data, and that model can be outright incorrect. Instead, I would argue that users only need a rough handle on the properties of PCA solutions, and should then ask themselves: what can go wrong? (The answer is: a lot.)

That said, the more rigorous PCA approaches have a practical advantage when we consider the need to transform variables. As will be discussed below, variables need to be transformed before they are input into the PCA algorithm. Statisticians have found standard procedures to apply such transformations, which eliminates the temptation to overfit historical data by tweaking the transformation technique.
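
As a rough illustration of the mechanics (and not the Chicago Fed's actual procedure), the sketch below extracts the first principal component from a panel of already-transformed monthly series using scikit-learn. The DataFrame `panel` is an assumption: every column is stationary, with no missing values.

```python
# Minimal PCA sketch: collapse a panel of transformed (stationary) series into
# a single factor. `panel` is an assumed DataFrame of monthly observations
# with no missing values; each column is one input series.
import pandas as pd
from sklearn.decomposition import PCA

def first_principal_component(panel: pd.DataFrame) -> pd.Series:
    # Standardise columns so that high-variance series do not dominate.
    z = (panel - panel.mean()) / panel.std()
    pca = PCA(n_components=1)
    factor = pca.fit_transform(z.values)[:, 0]
    return pd.Series(factor, index=panel.index, name="first_pc")
```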

“Normalisation”

One common way to aggregate multiple series is to “normalise” them (or “standardise”), which is a shorthand used in market analysis that describes a two-step process.
  1. We calculate the mean and the standard deviation of the time series. (The term “normalisation” appears to refer to the normal – or Gaussian – probability distribution, which is entirely described by its mean and standard deviation. However, the procedure is not really based on the assumption that probability distributions are normal. “Standardisation” is a better term as a result.)
  2. For any given point in time, the value of the “normalised” series is equal to the deviation of the series from its mean, divided by its standard deviation. For example, if a series is one standard deviation above its mean, the “normalised” series has a value of 1.
We create the aggregate by taking a weighted average of the normalised series. Since each normalised series is expressed in units of standard deviations, this step appears plausible. (A minimal code sketch of the procedure follows.)
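
The sketch below pins down the arithmetic of the two steps plus the weighted aggregation. It uses the full-sample mean and standard deviation, which embeds the look-ahead problem discussed later in this section; it is an illustration, not a recommended indicator.

```python
# Minimal sketch of "standardisation" and weighted aggregation. Full-sample
# statistics are used for simplicity, which means the result is not usable
# in real time (see the discussion of non-causal transformations below).
import pandas as pd

def standardise(series: pd.Series) -> pd.Series:
    """Deviation from the sample mean, in units of sample standard deviations."""
    return (series - series.mean()) / series.std()

def weighted_aggregate(series_list, weights) -> pd.Series:
    """Weighted average of the standardised series."""
    z = pd.concat([standardise(s) for s in series_list], axis=1)
    return (z * weights).sum(axis=1) / sum(weights)

# Hypothetical usage with equal weights:
# indicator = weighted_aggregate([slope, employment_deviation], [1.0, 1.0])
```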

I will now run through an example indicator for the United States, based on two variables. The variables are the number of employed persons (nonfarm payroll employment) and the 2-/10-year Treasury slope. I have chosen these two variables for illustrative purposes only, for reasons that will become clear later. I want to underline that the resulting indicator has various defects – which will be discussed later – and is not being held forth as a useful recession prediction tool.

I will fix my sample period to be 1977-2018, using monthly data.
Figure: 2-/10-year slope

The chart above shows the results for the 2-/10-year Treasury slope (based on data from the Federal Reserve H.15 report). The top panel shows the raw data, which is the slope in basis points. The bottom panel is the standardised version: the slope minus its period mean (95.4 basis points), divided by its standard deviation (91.8 basis points). As can be seen, the slope near the end of the in-sample period converged to zero, which is roughly one standard deviation below its average.
Figure: Nonfarm Employment

The Total Nonfarm Employment series (above) from the Bureau of Labor Statistics (BLS) gives a good example of the need for transformation of series (which is why the series was chosen). Unlike the 2-/10-year slope, the total number of people employed rises in a trending fashion over time. We see dips around recessions, but the standardised series has a trend rise like the underlying series. The information conveyed around recessions is the reversal of the rise.

We can capture the reversal in a number of ways. The most familiar would be to express the growth rate as a percentage change. The problem is that if we take a short period to express the change, for example the percentage change each month, the resulting series is typically noisy (not shown). If we take the change versus the year before, we are susceptible to "base effects": if there was a large change a year ago, the dramatic change in the base causes jumps even when the current month shows little change. A more robust version is the deviation from the 12-month moving average (often referred to as the "deviation from trend").
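
These transformations are one-liners in pandas. In the sketch below, `emp` is an assumed monthly employment level series; the deviation from trend is expressed as a ratio, matching the chart that follows.

```python
# Minimal sketch of the transformations discussed above, applied to an assumed
# monthly level series `emp` (e.g., total nonfarm payrolls).
import pandas as pd

def monthly_change(emp: pd.Series) -> pd.Series:
    return emp.pct_change()          # month-over-month change: typically noisy

def annual_change(emp: pd.Series) -> pd.Series:
    return emp.pct_change(12)        # year-over-year change: subject to base effects

def deviation_from_trend(emp: pd.Series) -> pd.Series:
    # Level relative to its trailing 12-month moving average; 1.01 means the
    # level is 1% above "trend".
    return emp / emp.rolling(12).mean()
```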
Figure: Employment Deviation From Trend

As shown above, this transformed series now behaves in a similar fashion to the yield curve or the Chicago Fed National Activity Index: there is no time trend. A value of 1.01 means that the current level is 1% above its 12-month moving average. As can be seen, the deviation from trend is smooth (like the underlying series), unlike the jerkier monthly percentage change series.
Figure: Combined Indicator

The chart above shows the combined indicator. The top panel shows the two standardised series separately, while the bottom shows the average of the two. As can be seen, the yield curve tends to lead the employment series, and so combining them in this fashion is perhaps not the most sensible option. However, the two series illustrate the issues around series transformation: we can take the yield curve slope directly, while we had to transform the employment series.

 One issue that I skipped over was the problem with using the entire period to calculate the mean and standard deviation. Although this was simple to explain, it is effectively cheating: we use future information to calculate the standardised variable.
Figure: Standardisation of Yield

The figure above shows what happens when we "standardise" the 10-year Treasury yield over the period 1970-1982. As is well known, bonds suffered a secular bear market during this period (yields rising). The standardised variable shows the yield trading about one standard deviation below its mean in the early 1970s, which would be interpreted as the bond market being quite expensive. However, this expensive valuation is dependent upon us knowing the average yield for all of 1970-1982: which means we would have needed to have forecast the future path of interest rates to know that they were below average. Using systems engineering terminology, this is a non-causal series transformation: the value at a certain time depends upon future values. In order to create an indicator that could be used in real time, we would need to use only historical information to "standardise" variables.
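
A causal version of the standardisation uses only data available at each point in time. The sketch below uses an expanding window (a fixed trailing window is a common alternative); the 36-month minimum is an arbitrary choice to avoid unstable early estimates.

```python
# Minimal sketch of a causal ("real time") standardisation: the mean and
# standard deviation at each date are computed from data up to that date only.
import pandas as pd

def causal_standardise(series: pd.Series, min_periods: int = 36) -> pd.Series:
    mean = series.expanding(min_periods=min_periods).mean()
    std = series.expanding(min_periods=min_periods).std()
    return (series - mean) / std
```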

Inherent Limitations

The rest of this section will focus on the limitations of such models. The reason for discussing limitations rather than the advantages of such models is based on my assessment of how users perceive them. People are attracted to mathematical models, and the more complex, the better. The problem is that the mathematical complexity distracts us from the underlying structure of the model.

Training Against Isolated Data Points

One structural issue we face with such models is that they are attempting to match a binary variable: is the economy in recession? In the early post-World War II decades, recessions were more frequent, and so there are a lot of episodes to fit our indicators against. However, since 1990, recessions have hit the United States roughly once per decade. (The situation is worse for Australia, which has largely managed to avoid recession over that period.) If we accept that there may be structural changes in the economy that lead to the choice of indicator variables changing, we are searching data sets for events that happened only a few times. The obvious risk is that we end up searching for variables that acted exactly as they did in the handful of post-1990 recessions; that is, we end up looking for an exact repeat of history.

The next issue is that we are sensitive to the methodology used for recession dating. In the United States, I would argue that one theme of the literature is that the NBER recession dates are fairly robust to changes to the methodology of the analysis of activity variables (so far). When we start to look at other countries, we may find that recession dating is more controversial, which obviously casts questions on the validity of models trained on any particular data set.

For the post-1990 period, we need to be concerned about the nature of recessions. For the United States at least, recessions coincided with some form of a financial crisis. Although I believe that there are good theoretical reasons for financial crises to cause recessions, we only need to look at the early post-World War II period to see examples of recessions that happened independent of financial crises. As was observed by Minsky at the time, the post-war financial system was exceedingly robust, as a result of a strong regulatory regime and a cautious private sector (which was highly scarred by the Great Depression). One could argue that regulators and credit investors did learn something from the Financial Crisis, and so it is entirely possible that we can have a recession without large financial entities blowing themselves up.

Finally, there is the issue of small dips in activity: technical recessions. I discussed technical recessions in an earlier article. If there is such a dip, is a signal provided by an indicator really a false positive, even if the NBER does not call a recession? This is an inherent problem with binary signals; a model that predicts growth rates, by contrast, should be able to distinguish a dip from a full-fledged recession.

Economic Structural Changes

If we attempt to train our model against a long run of data, we will be covering differing economic regimes. This is not an issue for activity-based models, since recessions are defined by declines in the same set of economic activity variables (employment, etc.). For the United States, the secular decline in the manufacturing sector has meant that previously important manufacturing variables are less useful indicators for aggregate activity. Even if manufacturing employment dips by 10%, that is now a drop in the bucket compared to employment in the service sector.

Regional Disparity

Differing regions of a country, as well as differing sectors, can have quite different economic outcomes. For example, we can track provincial GDP in Canada. It is entirely possible that some provinces are in recession while others are still expanding. Whether aggregate Canadian activity declines is then just a question of the weights of the provinces in the aggregate.

This means that indicators related to a particular industry may correctly call a regional recession, but miss on aggregate activity. Once again, we need to ask whether this is truly a false positive.

Financial Market Shenanigans

Many of these forecast models are highly reliant on inputs that are financial market variables. The correct way to interpret them is that the model is giving a probability of recession that is priced into markets -- under the strong assumption that market behaviour matches historical patterns.

If one is involved in financial markets, one needs to be mindful of circular logic. So long as we accept the assumption that behaviour matches previous norms, we should trade the markets involved using the indicator only if our personal view on recession odds differs from what is implied by the markets. The alternative is that we end up trend following: we put on curve flatteners because the model says that there is a high probability of recession, but that probability is based on the previous flattening of the curve.

Since I will be discussing the yield curve at length in Volume II, I will briefly comment on some other financial market indicators that are used.

Credit Markets. Various credit spreads -- particularly bank-based spreads -- are popular indicators. The question is whether we are extrapolating the experience of recent recessions forward (as noted earlier). It is entirely possible that the investment grade credit markets (which these indicators are usually based on) can sidestep a recession. For example, large banks have a great many mechanisms to manage the credit losses that they will expect from their small customers. (The United States with its branch banking model does have the property that any downturn will wipe out some small regional banks.) It takes a real team of chowderheads to put a bank into bankruptcy.

Equity Markets. Equity markets are supposed to be discounting an infinite stream of cash flows, not the next few months of activity. Theoretically, the effect of a recession on stock prices should be small. That said, equity markets do tend to swoon ahead of recessions. This makes sense if we believe that equity market pricing is closer to extrapolating the last few data points out to infinity. I am not the person to resolve that debate.

However, equity markets are volatile, and so periods of falling prices are expected to happen periodically. As a result, we should expect some false positive recession signals.

Finally, large equity indices are dominated by multinationals; it is unclear how much their prospects are tied to the domestic economy. Even if a country avoids a downturn while the major economies are in recession, it is likely that its domestic equity indices will still fall in tandem with their overseas peers.

Commodities. One of the interesting regularities of U.S. recessions is that they tended to be preceded by oil price spikes. (Discussed in Section 5.4 of Interest Rate Cycles: An Introduction.) There might be other examples one could find.

The basic problem with using commodity prices as an indicator is that commodity markets are global. It is possible that a country's cycle will be independent of the global cycle. Furthermore, China has been a major source of commodity demand, while its domestic economy is somewhat isolated from the rest of the world (if we disaggregate its export industries from the rest of the domestic economy).

However, for some commodity producers, commodity prices may be all we need to forecast recessions. The nominal income loss from a price fall may be enough to swamp any other factors in the economy.

Footnotes:

[i] According to an online source, “John C. Bluedorn et al.” attributed this quote to Paul Samuelson. Since I have seen that attribution multiple times, I have not validated this myself. URL: https://en.wikiquote.org/wiki/Paul_Samuelson

[ii] This impression has been created by condescending remarks given by youthful products of these graduate programmes.

(c) Brian Romanchuk 2019
