Wednesday, March 20, 2019

Inherent Limitations Of Linear Economic Models

Linear models used to be a popular methodology in economics, such as the log-linearisations of dynamic stochastic general equilibrium (DSGE) models. Rather than look at particular models, it is simpler to examine the properties of linear models themselves to see why they are either inherently unable to capture key features of recessions, or else rely on “unforecastable shocks” that represent an absence of theory about recessions. Since the author is unaware of anyone putting forth linear models as being useful in this context, this discussion is kept brief, and is perhaps only of historical interest. For example, if we want to examine why the Financial Crisis acted as a theoretical shock, we need to understand how it conflicted with the popular linear models of the time.

Note: This is an unedited draft of a section that will go into a chapter describing neoclassical theory in a book on recessions. The text refers to chapters and discussions that are not yet published. 

UPDATE: Parts of this article will need a severe re-write. My belief when I wrote this article was that the Jordan canonical form was a standard part of every undergraduate linear algebra text. I engaged in hand-waving, under the assumption that anyone who knew about matrices would have once studied the Jordan canonical form. However, having consulted some introductory linear algebra texts in the local library, it has come to my attention that this is not the case. As such, the discussion of stability analysis needs to be taken out behind the barn and shot. To be clear, the assertions about the mathematics are correct; it is just that my attempt to explain them probably makes no sense to anyone who actually needs to read it to learn about the topic.

The only non-web reference I have for the Jordan canonical form is found in Chapter Six of L'Algèbre Linéaire Deuxième Édition, by Jacques Bouteloup, Presses Universitaires de France, 1971. 


Although my approach here is generic, I will refer to a pre-crisis benchmark DSGE model, which is described in the European Central Bank working paper “An estimated stochastic dynamic general equilibrium model of the euro area,” by Frank Smets and Raf Wouters (URL:, published in August 2002). When one opens the paper, one is most likely mesmerised by the highly complex equations of sections 2.1 to 2.3, and the mathematical jargon. However, the actual model that is analysed in the empirical work is the much simpler linearised model of section 2.4. With a small amount of algebra, we can convert that linearised model into the generic form described below.

Once we realise that this is the actual model of interest, one wonders what all the fuss about DSGE macro is about. There are no optimisations, there is no mathematical entity corresponding to a representative household, equilibrium is nowhere to be seen, etc. All those concepts appear in the back story in sections 2.1 to 2.3, and disappear once we jump to the linearised model. In other words, to the extent that models similar to the Smets-Wouters model failed, it had nothing to do with those problems. Instead, the failure lies in the nature of linear models.

One problem with the formalism used by Smets-Wouters (which appeared to be common across the linearised DSGE model literature) is that matrix notation was avoided. Rather than write out equations with dozens of symbols, every other field of applied mathematics would just compress the equations into matrix algebra. This is far more compact, and it makes it easy to apply long-existing mathematical results to characterise the solution properties.

The downside with matrix algebra is the assumption that the reader is familiar with it. That said, I would guess that a reader that is unfamiliar with matrix algebra would most likely have difficulties deciphering the Smets-Wouters article as well. If the reader of this text is unfamiliar with matrix algebra, I apologise. Certain things I write may be unintelligible, but it should be possible to get a rough understanding of my arguments. Unfortunately, I no longer own any textbooks that justify my description herein; my characterisations of the stability properties of linear systems are stated without proof in every textbook that remains in my possession. However, this theory should be covered in most textbooks on linear algebra, or introductory control systems texts that cover state space techniques. (Historically, introductory texts focused on frequency domain approaches.)

Linear State Space Models

If we take the Smets-Wouters linear model, we see that we have equations that define the values of economic variables. At time t, they are given in terms of the values of the economic variables as well as external disturbances at times t+1, t, and t-1. We stack all of these economic variables in a vector denoted v. If n is the number of variables, v(t) is an element of an n-dimensional space of real numbers.

The dependence upon three points in time puts the set of equations outside the usual definitions of discrete time linear systems one would encounter in a linear systems textbook. However, we get around this by defining the state vector x(t) as being a vector of size 2n, created by stacking the original vector v(t), and its previous (lagged) value v(t-1).

Using some algebra (and shifting some equations by one time period, and assuming the inflation target is zero), we can convert equations (31)-(39) into the form of the canonical 2n-dimensional time invariant discrete time linear system:

x(t+1) = A x(t) + B d(t),

with x defined as above, and d being the vector of disturbances (with some disturbance series being time shifted to align to the canonical format)*, and A, B being appropriately sized matrices. In particular, A is a 2n×2n matrix. If the inflation target is non-zero, we need to convert to using a canonical discrete time feedback control system representation, with the interest rate being the feedback variable, and the inflation target an external reference value. The shift to the feedback control configuration has no real effect on the stability analysis, other than first having to embed the central bank feedback law into the closed loop system, and adjusting the B matrix to account for the reference variable (the inflation target).
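As a rough illustration of the stacking step, the following Python sketch (using NumPy) takes a hypothetical second-order system v(t+1) = F1 v(t) + F2 v(t-1) and builds the 2n×2n matrix A for the stacked state x(t) = [v(t); v(t-1)]. The matrices F1 and F2 are made-up numbers, not the Smets-Wouters coefficients, and the sketch leaves out the disturbances and the forward-looking terms that the algebra on the actual model has to deal with.

```python
import numpy as np


def stack_second_order(F1, F2):
    """Build A so that x(t+1) = A x(t), where x(t) = [v(t); v(t-1)].

    Assumes the (hypothetical) dynamics v(t+1) = F1 v(t) + F2 v(t-1).
    """
    n = F1.shape[0]
    top = np.hstack([F1, F2])                          # row block giving v(t+1)
    bottom = np.hstack([np.eye(n), np.zeros((n, n))])  # row block copying v(t)
    return np.vstack([top, bottom])


# Illustrative coefficients only (n = 2 variables).
F1 = np.array([[0.5, 0.1],
               [0.0, 0.3]])
F2 = np.array([[0.2, 0.0],
               [0.1, 0.1]])
A = stack_second_order(F1, F2)

v_prev = np.array([1.0, -1.0])        # v(0)
v_curr = np.array([0.5, 2.0])         # v(1)
x = np.concatenate([v_curr, v_prev])  # stacked state x(1)

v_next_direct = F1 @ v_curr + F2 @ v_prev  # v(2) from the original equations
x_next = A @ x                             # x(2) = [v(2); v(1)]
```

The top half of x_next reproduces v(2), and the bottom half is just v(1) carried forward, which is all the stacking trick does.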

Stability Analysis

We are interested in the stability properties of this system. From the perspective of control systems, the stability of the system is the most important property, trumping the effects of disturbances, for reasons that will be clear later. The system stability is entirely determined by the properties of matrix A; we can drop from consideration the matrix B. That is, we just look at the system:

x(t+1) = Ax(t), x(0) = x_0.

In English, the state variable starts at time zero at an initial condition x_0 and then evolves by successively multiplying the vector by the matrix A. That is, x(1) = A x_0, x(2) = A^2 x_0, etc.
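A minimal Python sketch of this iteration is given below. The matrix A and the initial condition x_0 are arbitrary illustrative values; the point is only that stepping the system forward k times gives the same result as multiplying x_0 by the k-th power of A.

```python
import numpy as np

# Arbitrary illustrative system (not taken from any economic model).
A = np.array([[0.9, 0.2],
              [0.0, 0.5]])
x0 = np.array([1.0, 1.0])


def simulate(A, x0, steps):
    """Return the trajectory x(0), x(1), ..., x(steps) of x(t+1) = A x(t)."""
    trajectory = [x0]
    for _ in range(steps):
        trajectory.append(A @ trajectory[-1])
    return np.array(trajectory)


trajectory = simulate(A, x0, 10)

# The same state computed in one shot as A^10 x_0.
x10_from_power = np.linalg.matrix_power(A, 10) @ x0
```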

We can then appeal to one of the crowning achievements of practically every introductory textbook on linear algebra: we can express every vector in the state space (which in this case is 2n-dimensional) as a combination of eigenvectors and (unfortunately) generalised eigenvectors. For simplicity, I will skip the generalised eigenvectors, and hedge my statements about stability slightly to compensate.

The (non-zero) vector y is an eigenvector of the matrix A, with an associated eigenvalue λ if:
Ay = λy.

Why do we care? For the vector y, we can replace the matrix multiplication by a scalar multiplication. That is, if the initial state is y, the next state is λy, the next is λ^2y, etc. As can be seen, if λ is a real number greater than 1, the solution will march off to infinity (since y is assumed to be non-zero).
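The following sketch checks this numerically with NumPy. The matrix is an arbitrary example chosen to have one eigenvalue above 1 (namely 1.1) and one below (0.5); starting from the eigenvector of the larger eigenvalue, five steps of the system match multiplying by λ five times.

```python
import numpy as np

# Arbitrary illustrative matrix with eigenvalues 1.1 and 0.5.
A = np.array([[1.1, 0.0],
              [0.3, 0.5]])

eigenvalues, eigenvectors = np.linalg.eig(A)

# Pick the eigenvalue with the largest magnitude and its eigenvector.
idx = np.argmax(np.abs(eigenvalues))
lam = eigenvalues[idx]
y = eigenvectors[:, idx]  # columns of `eigenvectors` are the eigenvectors

# Step the system forward five times starting from y.
state = y.copy()
for _ in range(5):
    state = A @ state

# Matrix multiplication has been replaced by scalar multiplication.
scalar_version = lam**5 * y
```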

Since we know we can decompose any vector in the state space to be a linear combination of eigenvectors (and generalised eigenvectors) courtesy of the results in said linear algebra textbook, we can express the trajectory of any initial condition to be a linear combination of the trajectories generated by starting at the eigenvectors/generalised eigenvectors.

One complication to note is that the eigenvalues of a real matrix can contain pairs of complex numbers (complex conjugates). (A complex number is a number that has both a real part, and an imaginary part, where the imaginary part is a real number times the square root of -1. The square root of -1 is normally denoted i, but the systems engineering literature often follows the electrical engineering convention of using j for that concept. The explanation is that i is reserved for variables representing currents in electrical circuit analysis.) We can define stability concepts as follows.
  • The matrix A is strictly stable if the modulus of every eigenvalue is strictly less than 1.
  • The matrix A is strictly unstable if any eigenvalue has a modulus strictly greater than 1.
  • (If the maximum of the moduli equals 1 exactly, one needs to start worrying about those darned generalised eigenvectors in the state space decomposition.)
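These definitions translate directly into a check on the largest eigenvalue modulus (the spectral radius). The sketch below is one way to write it; the function name, the tolerance, and the example matrices are all illustrative assumptions. The numerical tolerance is needed because computed eigenvalues are only approximate.

```python
import numpy as np


def classify_stability(A, tol=1e-6):
    """Classify A using the largest eigenvalue modulus (spectral radius)."""
    spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))
    if spectral_radius < 1 - tol:
        return "strictly stable"
    if spectral_radius > 1 + tol:
        return "strictly unstable"
    return "borderline"  # modulus of 1: generalised eigenvectors matter


stable = np.array([[0.5, 0.2],
                   [0.0, 0.8]])     # eigenvalues 0.5 and 0.8
unstable = np.array([[0.0, -1.2],
                     [1.2, 0.0]])   # complex pair with modulus 1.2
borderline = np.array([[1.0, 1.0],
                       [0.0, 1.0]])  # repeated eigenvalue 1 (Jordan block)

labels = [classify_stability(M) for M in (stable, unstable, borderline)]
```

The third matrix shows why the borderline case is awkward: its eigenvalues have modulus exactly 1, yet trajectories can still grow (linearly rather than exponentially), which is the work of a generalised eigenvector.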
What happens if the A matrix is strictly unstable? We will be able to find subspaces of the state space for which:
  1. If the eigenvalue is real, all initial conditions that start in that subspace will grow in an exponential fashion.
  2. If the eigenvalue is complex, there will be a subspace of dimension 2 in which real-valued vectors follow a trajectory defined by a sinusoid multiplied by an exponentially growing constant. This means that each component of the vector can be written in the form a^k sin(ωk + α), with a > 1. (A complex eigenvector would grow by the complex eigenvalue, but a pair of such vectors will define a real sinusoid.)
In plain English, it either blows up in a fashion like compounding interest, or is a sinusoid with a fixed frequency that is growing exponentially.
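The complex case can be sketched with a 2×2 matrix that rotates by an angle ω and scales by a factor a at each step; its eigenvalues are the complex conjugate pair a·exp(±iω). The values a = 1.05 and ω = 0.3 are arbitrary illustrative choices. Starting from the vector (1, 0), the first component of the state follows a^k sin(ωk + α) with α = π/2.

```python
import numpy as np

a, omega = 1.05, 0.3  # illustrative growth factor and frequency

# Rotation by omega, scaled by a: eigenvalues are a * exp(+/- i * omega).
A = a * np.array([[np.cos(omega), -np.sin(omega)],
                  [np.sin(omega),  np.cos(omega)]])
eigenvalues = np.linalg.eigvals(A)

# Simulate x(k+1) = A x(k) and record the first component of the state.
steps = 50
x = np.array([1.0, 0.0])
first_component = np.empty(steps)
for k in range(steps):
    first_component[k] = x[0]
    x = A @ x

# Closed form: a^k sin(omega * k + alpha), with alpha = pi / 2.
k = np.arange(steps)
closed_form = a**k * np.sin(omega * k + np.pi / 2)
```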

Once again, we can appeal to the argument that any vector in the state space can be expressed as a linear combination of eigenvectors (and generalised eigenvectors): the system solution will have a component that is growing in an exponential fashion if the component corresponding to any unstable eigenvalue is non-zero. (There may be sub-components of the solution that are decaying exponentially as well.)

If we assume that the initial state is randomly distributed in a uniform fashion in the original state space, the probability of it having no component in the “unstable sub-space” is zero. That is, we will almost certainly see the trajectory eventually growing in an exponential fashion (“blowing up”).
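A small numerical experiment illustrates the "almost certainly" claim. The sketch below draws 1,000 initial conditions uniformly from the square [-1, 1] × [-1, 1], decomposes each one in the eigenvector basis of an arbitrary matrix with eigenvalues 1.1 and 0.5, and checks the coefficient on the unstable eigenvector. The matrix, sample size, seed, and horizon are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative matrix: one unstable (1.1) and one stable (0.5) mode.
A = np.array([[1.1, 0.0],
              [0.3, 0.5]])
eigenvalues, V = np.linalg.eig(A)  # columns of V are eigenvectors
unstable_idx = np.argmax(np.abs(eigenvalues))

# 1,000 initial conditions, uniform on the square; one per column.
x0 = rng.uniform(-1.0, 1.0, size=(2, 1000))

# Coefficients c solving V c = x0, i.e. x0 written in the eigenvector basis.
coefficients = np.linalg.solve(V, x0)
unstable_coefficients = coefficients[unstable_idx]

# Step every initial condition forward 200 periods at once.
x_final = np.linalg.matrix_power(A, 200) @ x0

has_unstable_component = unstable_coefficients != 0.0
grew = np.linalg.norm(x_final, axis=0) > np.linalg.norm(x0, axis=0)
```

Every draw has a non-zero coefficient on the unstable eigenvector, so every trajectory ends up larger than it started, even those that began very close to the stable direction.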

We can now go back to our economic model. One thing to note is that these linearised models are not in terms of levels; they describe rates of change. If the system were strictly unstable, we would almost certainly see trajectories of variables like inflation marching off to infinity. (Note that I am referring to the closed model, in which the central bank reaction function is specified.) Since we do not see that behaviour in the real world historical data for the euro area, we have to assume that the estimated model cannot be strictly unstable. (If we were trying to fit data that contains a hyperinflation, this would not apply.) If the system stability were on the knife edge (eigenvalues with a modulus of 1), the state variables would tend to follow a random walk: they would drift in one direction or another in response to disturbances, without a tendency to revert to any particular level. If we look at the euro area data, that does not seem to be a plausible description either, as inflation has tended to remain near target. By implication, just a cursory glance at the data tells us that the observed behaviour is consistent with a strictly stable system.

For the system with disturbances, systems theory tells us that a strictly stable system will do a good job of rejecting those disturbances. There is a great deal of engineering intuition behind that argument; engineering systems are designed to be stable for a reason. Obviously, sufficiently large disturbances can knock any system around: no amount of control systems wizardry can overcome the forces generated by flying an aircraft into a cliff. However, if we only have moderate disturbances, there will be a temporary movement in the state variable from its steady state that will die down as the magnitude of the disturbance wanes. If the disturbance were random, we would expect it to wax and wane in this fashion.

This concept is typically formalised in systems theory by looking at the gain from the disturbance signal to the state variables, where the magnitude of signals is measured by their 2-norm: the root-mean-square of the time series. For a strictly stable system, the gain from disturbance to state variable is finite, but (roughly speaking) the gain will tend to infinity as the A matrix tends to the limit of being unstable.**
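To make the "gain tends to infinity" statement concrete, consider the simplest possible case: the scalar system x(t+1) = a x(t) + d(t) with 0 < a < 1, which is a stand-in chosen for illustration and not the economic model. For a constant unit disturbance, the state settles at 1/(1 - a), so the ratio of root-mean-square magnitudes is 1/(1 - a), which grows without bound as a approaches the stability limit of 1. The sketch below confirms this by simulation; the function name and parameter values are assumptions.

```python
import numpy as np


def rms_gain_constant_disturbance(a, steps=20000):
    """RMS of the state divided by RMS of a constant unit disturbance.

    Simulates x(t+1) = a x(t) + d(t) with d(t) = 1 and measures the second
    half of the run, after the initial transient has died out.
    """
    x = 0.0
    states = np.empty(steps)
    for t in range(steps):
        x = a * x + 1.0
        states[t] = x
    tail = states[steps // 2:]
    return np.sqrt(np.mean(tail**2))  # RMS of the disturbance is exactly 1


parameters = [0.5, 0.9, 0.99, 0.999]
gains = [rms_gain_constant_disturbance(a) for a in parameters]
predicted = [1.0 / (1.0 - a) for a in parameters]  # 2, 10, 100, 1000
```

The gain is finite for every stable value of a, but moving a from 0.9 to 0.999 raises it from about 10 to about 1000.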

As such, it is difficult to generate the patterns of deviations seen in economic data with the usual probability distributions for disturbances: We see small, erratic movements during lengthy expansions (in the modern era, developed country expansions have often lasted around ten years), with short-lived massive deviations. If disturbances were normally distributed (for example), we should see a wider range of deviations than we see in the data – “half recessions” or “mini-boomlets” – every few years.
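The point about normally distributed disturbances can be illustrated with a scalar stand-in for the model: x(t+1) = a x(t) + d(t), with a = 0.9 and standard normal shocks (both arbitrary choices, as is the seed). The sketch counts how often the state is more than two, and more than four, standard deviations away from its steady state.

```python
import numpy as np

rng = np.random.default_rng(0)

a = 0.9            # illustrative stable parameter
n_steps = 200_000
shocks = rng.standard_normal(n_steps)  # normally distributed disturbances

# Simulate x(t+1) = a x(t) + d(t), starting from the steady state of zero.
x = np.empty(n_steps)
state = 0.0
for t in range(n_steps):
    state = a * state + shocks[t]
    x[t] = state

sigma = x.std()
frac_beyond_2_sigma = np.mean(np.abs(x) > 2 * sigma)  # moderate deviations
frac_beyond_4_sigma = np.mean(np.abs(x) > 4 * sigma)  # extreme deviations
```

Moderate deviations (beyond two standard deviations) turn up in a few percent of periods, while extreme ones are vanishingly rare. That is the reverse of the pattern described above, where long quiet expansions are punctuated by occasional massive moves.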

How Can We Generate Recessions?

There are two work-arounds to generate a trajectory that resembles a recession.
  1. Have some disturbances have a probability of being very large at widely separated points in time.
  2. Make the system matrices vary with time.
The first is the standard interpretation. Under it, the economy during the Financial Crisis was just hit by an “m-standard deviation” “unforecastable shock”, which resembles the terminology used by some dumbfounded commentators at the time.
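The first work-around can be sketched as follows, again with a scalar stand-in x(t+1) = 0.9 x(t) + d(t). The disturbance is usually standard normal, but with a 1% probability its standard deviation is ten times larger. These numbers are purely hypothetical and are not estimated from any data. The kurtosis (which is 3 for a normal distribution) is used as a crude measure of how much of the variation comes from rare, large moves.

```python
import numpy as np


def simulate_ar1(shocks, a=0.9):
    """Simulate x(t+1) = a x(t) + d(t) from a zero initial state."""
    x = np.empty(len(shocks))
    state = 0.0
    for t, d in enumerate(shocks):
        state = a * state + d
        x[t] = state
    return x


def kurtosis(x):
    """Fourth central moment over squared variance (3 for a Gaussian)."""
    z = x - x.mean()
    return np.mean(z**4) / np.mean(z**2) ** 2


rng = np.random.default_rng(1)
n = 200_000

gaussian_shocks = rng.standard_normal(n)

# Hypothetical mixture: 1% of shocks have ten times the standard deviation.
is_rare = rng.random(n) < 0.01
mixture_shocks = np.where(is_rare, 10.0, 1.0) * rng.standard_normal(n)

kurtosis_gaussian = kurtosis(simulate_ar1(gaussian_shocks))
kurtosis_mixture = kurtosis(simulate_ar1(mixture_shocks))
```

The mixture does produce a state series dominated by occasional large deviations, but only because that behaviour was built into the shock distribution by hand.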

The second variant – making the system parameters vary with time – has some similarities to the “unforecastable shocks” in that the deviations have to generate large changes in behaviour at widely-separated points in time. There does not appear to be a way to characterise the change in parameters, since the in-recession time sample is so small. It just turns into an arbitrary shift to the model that allows it to fit any change in trajectory, which is a non-falsifiable methodology.

It is hard to argue strongly against appealing to “random shocks” to explain wiggles in the data during an expansion. The issue is when we look at recessions. As noted above, we need to use questionable probability distributions to generate recessions; they just appear because we forced the model to generate similar behaviour. The wide separation of recessions means that we cannot really hope to fit the tail of the probability distribution with any degree of confidence. However, that statistical argument is secondary (which explains why I have not spent time analysing it in any detail). As seen in my analysis of post-Keynesian theories (note: this will appear in earlier chapters of the book, which are only partly completed at the time of writing), there are predicted empirical regularities for recessions that show up in the data, such as debt buildup or fixed investment trends. If random shocks were truly generating recessions, they should happen at any time, and have no related empirical regularities (other than the footprint of the shock itself in the modelled time series). These models have literally nothing to say about recessions, other than non-forecastability (admittedly, my own view).

These pre-Financial Crisis models have been heavily lambasted in heterodox academic publications, as well as in popular accounts. Meanwhile, the neoclassicals do not defend them particularly loudly (although actual admissions of the heterodox scholars being correct are rather thin on the ground). Therefore, I see no reason to explain the model weaknesses any further, at least in the context of recessions (which is what I am interested in). These models are perhaps defensible in the context of thinking about decision rules for inflation-targeting central banks, but applications beyond that are less evident. Furthermore, the burying of a simple linear model under DSGE mumbo-jumbo is inexcusable. The only real usefulness of the optimising mathematics is to distract the readers from the actual models. This distraction may have made matters worse, since people may have attributed properties to the models that simply did not exist.

Concluding Remarks

The take-away from my arguments is straightforward: any linear model is going to be inadequate for explaining recessions, no matter what backstory is used to generate the linear model. If we want to extract any useful theory from neoclassical research, we need to roll up our sleeves, and work with the nonlinear models.


* Since they are exogenous variables, time shifts of disturbances have no effect on the solution beyond labelling issues.

** That statement is not easily “proven,” since it is hard to define the concept of the matrix “tending to” instability. What is easy to demonstrate is that the gain is unbounded if the system is no longer stable. Also, we can fix particular systems with a free parameter, and show that the gain tends to infinity as the parameter value hits the limit of stability. This is much easier to do in the frequency domain, which is out of scope of this discussion.

(c) Brian Romanchuk 2019
