WEBVTT - autoGenerated
00:00:28.000 --> 00:00:50.000
Good morning and welcome to the last lecture on time series econometrics.
00:00:50.000 --> 00:01:01.000
May I ask whether there are any questions; if so, please raise your hand.
00:01:01.000 --> 00:01:11.000
There are apparently no questions, as far as I can see. Or, there is a question, yes.
00:01:11.000 --> 00:01:15.000
What would be the suggested topics for the exam?
00:01:15.000 --> 00:01:22.000
Anything which I covered in the lecture is a potential topic for the exam, so I'm unable
00:01:22.000 --> 00:01:26.000
to give you any more clues on that.
00:01:26.000 --> 00:01:39.000
Yeah, as I say, I mean, anything that I covered in the lecture is potentially an exam question.
00:01:39.000 --> 00:01:42.000
Any other questions?
00:01:42.000 --> 00:01:55.000
Okay, as I said, I project here again the last slide of the last lecture, namely the definition
00:01:55.000 --> 00:01:57.000
of cointegration.
00:01:57.000 --> 00:02:06.000
You recall that we discussed some examples of cointegration and non-cointegration, basically
00:02:06.000 --> 00:02:14.000
examples which were set up with a vector yt, which had dimension two, so essentially
00:02:14.000 --> 00:02:21.000
in the yt vector there were two variables, and either these two variables were cointegrated
00:02:21.000 --> 00:02:25.000
with each other, or they were not.
00:02:25.000 --> 00:02:30.000
And we discussed these examples, but clearly they were just examples; in general yt is
00:02:30.000 --> 00:02:41.000
a vector of k variables, and as we will see now, this higher dimensionality of the dependent
00:02:41.000 --> 00:02:49.000
variable gives rise to the possibility that there is more than just one cointegrating vector.
00:02:49.000 --> 00:02:55.000
But here, as I say, I first repeat what we did at the very end of the last lecture, namely
00:02:55.000 --> 00:03:06.000
to define cointegration by saying that such a vector yt, assumed to be integrated
00:03:06.000 --> 00:03:14.000
of order one, is called cointegrated if a linear combination of the components of the
00:03:14.000 --> 00:03:23.000
vector exists, and the weights for the linear combination are embodied in the vector beta,
00:03:23.000 --> 00:03:31.000
which I use to pre-multiply yt with, so that I get a linear combination zt as beta
00:03:31.000 --> 00:03:40.000
prime yt, so the process is cointegrated if such a beta vector exists, with the property
00:03:40.000 --> 00:03:47.000
that the linear combination is I0, so it is something which is stationary.
00:03:47.000 --> 00:03:57.000
So beta basically takes an I1 vector, a non-stationary vector, and transforms it
00:03:57.000 --> 00:03:59.000
into a stationary vector.
00:03:59.000 --> 00:04:07.000
And I hope you understood in the last lecture that the way by which this happens is that
00:04:07.000 --> 00:04:16.000
the I1 components, which we find in the components of the yt vector, just cancel
00:04:16.000 --> 00:04:24.000
when the components of the yt vector are combined in a linear way, so that the
00:04:24.000 --> 00:04:33.000
stochastic trend in the series drops out, that one stochastic trend cancels against
00:04:33.000 --> 00:04:42.000
another stochastic trend, and then only the stationary components remain, so that the
00:04:42.000 --> 00:04:48.000
linear combination zt is I0.
00:04:48.000 --> 00:04:55.000
In this case, we call beta a cointegrating vector.
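As a minimal numerical sketch of this definition (an illustration added here, with made-up series, not from the lecture slides): two I1 series can be built from one shared random-walk trend, and the vector beta = (1, -1)' cancels that trend exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
trend = np.cumsum(rng.normal(size=T))  # shared stochastic trend, an I1 random walk
eps1 = rng.normal(size=T)              # stationary component of the first series
eps2 = rng.normal(size=T)              # stationary component of the second series
y1 = trend + eps1                      # both observed series are I1 ...
y2 = trend + eps2                      # ... because they carry the same trend

beta = np.array([1.0, -1.0])           # candidate cointegrating vector
z = beta[0] * y1 + beta[1] * y2        # beta' yt: the trend cancels by construction
assert np.allclose(z, eps1 - eps2)     # z is a difference of stationary parts, so I0
```

Any nonzero multiple of beta would cancel the trend equally well, which is exactly the non-uniqueness issue taken up next.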
00:04:55.000 --> 00:05:03.000
Now beta is usually meant to be a column vector, and this is why I denote it here with transposition.
00:05:03.000 --> 00:05:09.000
And it turns out, however, that beta is not a unique column vector, rather if we have
00:05:09.000 --> 00:05:17.000
one cointegrating vector, then it is always very easy to construct many other cointegrating
00:05:17.000 --> 00:05:23.000
vectors, because any vector lambda beta would also be a cointegrating vector for a nonzero
00:05:23.000 --> 00:05:25.000
lambda.
00:05:25.000 --> 00:05:32.000
Very clearly, if we multiply the stationary process z by something which is different
00:05:32.000 --> 00:05:37.000
from zero, then it remains something which is stationary.
00:05:37.000 --> 00:05:43.000
So that's a very easy and actually trivial way to find more cointegrating vectors if
00:05:43.000 --> 00:05:46.000
we have just a single cointegrating vector.
00:05:46.000 --> 00:05:56.000
But these are not very interesting vectors, because the multiplication with a scalar creates
00:05:56.000 --> 00:06:02.000
a whole space of cointegrating vectors, which are all linearly dependent.
00:06:02.000 --> 00:06:08.000
And clearly, this is not interesting, because this is just a rescaling.
00:06:08.000 --> 00:06:13.000
It is possible to normalize the beta vector by setting its first coefficient, for instance,
00:06:13.000 --> 00:06:14.000
equal to 1.
00:06:14.000 --> 00:06:21.000
If I do that, define that, let's say, in the first component of the beta vector, there
00:06:21.000 --> 00:06:23.000
ought to be a 1.
00:06:23.000 --> 00:06:31.000
Then clearly, all those other scalar multiples of the cointegrating
00:06:31.000 --> 00:06:42.000
vector would not satisfy this normalization, so we would just not consider them.
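A one-line check of this normalization (illustrative values of my own, not from the slides): dividing any nonzero scalar multiple of a cointegrating vector by its first coefficient recovers the same normalized vector.

```python
import numpy as np

beta = np.array([2.0, -2.0])   # a hypothetical cointegrating vector
lam = 3.0                      # any nonzero scalar
beta_scaled = lam * beta       # lambda * beta is also a cointegrating vector
# normalizing the first coefficient to 1 removes the scale freedom:
beta_norm = beta_scaled / beta_scaled[0]
assert np.allclose(beta_norm, beta / beta[0])  # same vector, whatever lambda was
```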
00:06:42.000 --> 00:06:46.000
And now there comes something which is wrong on this slide, as I detected when I prepared
00:06:46.000 --> 00:06:49.000
for today's lecture.
00:06:49.000 --> 00:06:56.000
Even with this normalization, it should say here, beta is not unique.
00:06:56.000 --> 00:07:01.000
So what I wrote here initially was: with this normalization, beta is unique.
00:07:01.000 --> 00:07:07.000
This is true only in the case that there is just one linearly independent cointegrating
00:07:07.000 --> 00:07:13.000
vector, which actually I had in mind when I wrote this slide.
00:07:13.000 --> 00:07:20.000
But in a more general setting where there are more than one cointegrating vectors, more
00:07:20.000 --> 00:07:26.000
than one linearly independent cointegrating vectors, then even this normalization would
00:07:26.000 --> 00:07:30.000
not yet yield a unique vector.
00:07:30.000 --> 00:07:37.000
And I will upload an improved slide where I correct the wrong sentence down here.
00:07:37.000 --> 00:07:44.000
So please note already that this statement is, in the more general framework in which I
00:07:44.000 --> 00:07:46.000
developed cointegration,
00:07:46.000 --> 00:07:53.000
not correct; actually it was only correct in the setting where we have just one linearly
00:07:53.000 --> 00:07:58.000
independent cointegrating vector.
00:07:58.000 --> 00:07:59.000
All right.
00:07:59.000 --> 00:08:05.000
Now why is cointegration important?
00:08:05.000 --> 00:08:14.000
Basically the huge appeal which cointegration exerted on researchers was due to the fact
00:08:14.000 --> 00:08:22.000
that it is not merely a statistical property that a process is cointegrated, but that it can
00:08:22.000 --> 00:08:30.000
be interpreted economically as evidence of a long run equilibrium relationship.
00:08:30.000 --> 00:08:38.000
So a cointegrating vector, if we find any, can be given an interpretation as let's say
00:08:38.000 --> 00:08:41.000
a steady state, as a long run equilibrium.
00:08:41.000 --> 00:08:49.000
Because it tells us that two variables, or two components of the y vector,
00:08:49.000 --> 00:08:52.000
move closely together in time.
00:08:52.000 --> 00:08:58.000
And if two variables move closely together over time, then very clearly it seems that
00:08:58.000 --> 00:09:04.000
there is some relationship between those two variables, which is reasonably interpreted
00:09:04.000 --> 00:09:07.000
as the long run equilibrium.
00:09:07.000 --> 00:09:16.000
So cointegration tells us that even though we have certain components of the y vector,
00:09:16.000 --> 00:09:23.000
which let's say we have two such components, which are both I1, and we know from the properties
00:09:23.000 --> 00:09:31.000
of I1 processes that they can drift to any point in the space of real numbers, that they
00:09:31.000 --> 00:09:36.000
are not stationary, that they do not fluctuate around a constant mean, but that the probability
00:09:36.000 --> 00:09:42.000
of them assuming any given value is non-zero.
00:09:42.000 --> 00:09:51.000
So even though such variables, which are I1, can drift more or less freely through space,
00:09:51.000 --> 00:09:59.000
cointegration would tell us that two variables share the same drift, and therefore they cling
00:09:59.000 --> 00:10:01.000
together over time.
00:10:01.000 --> 00:10:04.000
They will not move far apart from each other.
00:10:04.000 --> 00:10:08.000
And this is why we can interpret this as an equilibrium relationship, and this is why
00:10:08.000 --> 00:10:15.000
we can use this to test models, which have the implication that there is a long run equilibrium
00:10:15.000 --> 00:10:20.000
between certain variables, let's say there's a steady state in this model, which implies
00:10:20.000 --> 00:10:25.000
that variables have a certain relationship to each other.
00:10:25.000 --> 00:10:32.000
We can interpret such models in time series analysis as representations of cointegrated
00:10:32.000 --> 00:10:37.000
vectors, or as the reason why certain variables are cointegrated.
00:10:37.000 --> 00:10:46.000
Now, when we have such variables which share the same drift, then this does not say that
00:10:46.000 --> 00:10:50.000
these two variables are always equal to each other, or that the difference between the
00:10:50.000 --> 00:10:51.000
two is zero.
00:10:51.000 --> 00:10:59.000
Rather, what we know is that the difference, or let's say the linear combination of the
00:10:59.000 --> 00:11:10.000
two variables which makes the stochastic trends cancel, that these deviations are stationary.
00:11:10.000 --> 00:11:17.000
So beta prime yt, the cointegrating vector multiplied by the process yt, is something
00:11:17.000 --> 00:11:24.000
stationary, so it fluctuates around a constant mean, and often around zero.
00:11:24.000 --> 00:11:33.000
So we can interpret beta prime yt as stationary deviations from a long run equilibrium.
00:11:33.000 --> 00:11:38.000
And this will give rise to something which we call an error correction mechanism, so
00:11:38.000 --> 00:11:45.000
the economic interpretation will be that we have a certain equilibrium, and that this
00:11:45.000 --> 00:11:52.000
equilibrium holds in the long run, but in the short run, we perhaps never have an exact
00:11:52.000 --> 00:11:55.000
equilibrium on markets.
00:11:55.000 --> 00:12:00.000
In some periods there is an excess of supply, and in some periods there is an excess of demand,
00:12:00.000 --> 00:12:02.000
and that is being corrected over time.
00:12:02.000 --> 00:12:11.000
So the excess demand fluctuates around the equilibrium level, it fluctuates around zero.
00:12:11.000 --> 00:12:15.000
Sometimes excess demand is positive, sometimes excess demand is negative, there are certain
00:12:15.000 --> 00:12:21.000
corrections taking place over time, but never do these deviations from the long run equilibrium
00:12:21.000 --> 00:12:28.000
level last for a very long time.
00:12:28.000 --> 00:12:34.000
Therefore they are just temporary deviations from a long run equilibrium.
00:12:34.000 --> 00:12:40.000
Now let's look at an economic example by looking at the permanent income hypothesis, as you
00:12:40.000 --> 00:12:48.000
probably know, which was developed by Friedman, Milton Friedman, in 1957.
00:12:48.000 --> 00:12:54.000
So Friedman, who was completely unaware of the concept of cointegration, which didn't exist
00:12:54.000 --> 00:13:04.000
at his time, postulated a departure from Keynesian consumption theory
00:13:04.000 --> 00:13:13.000
by arguing that the income of a household or aggregate income, depending on the perspective,
00:13:13.000 --> 00:13:19.000
we can either formulate this on the macro level with aggregate income or on the micro
00:13:19.000 --> 00:13:26.000
level with the income of a household, that this income stream over time, which is given
00:13:26.000 --> 00:13:35.000
by YT, that this can be decomposed into a permanent component, YTP, so a certain stream of income
00:13:35.000 --> 00:13:42.000
of which the household thinks that this is a rather safe stream of income, something
00:13:42.000 --> 00:13:45.000
the household earns any time.
00:13:45.000 --> 00:13:50.000
So we will think of this perhaps as an I1 process.
00:13:50.000 --> 00:14:00.000
And a second component, which is a transitory component, YTT for transitory, which is stationary.
00:14:00.000 --> 00:14:08.000
So basically YT, measured income, consists of something which is the core income, the income
00:14:08.000 --> 00:14:17.000
one can safely expect to receive, even though it is a stochastic process, it is something
00:14:17.000 --> 00:14:27.000
which is permanent, and a stationary component, YTT, which is like an error term around
00:14:27.000 --> 00:14:33.000
permanent income; sometimes transitory income is positive, sometimes transitory income
00:14:33.000 --> 00:14:36.000
may be negative.
00:14:36.000 --> 00:14:42.000
Now permanent income is something which is unobservable; if permanent
00:14:42.000 --> 00:14:50.000
income is I1, and temporary income is stationary, then the sum of the two is obviously also
00:14:50.000 --> 00:14:51.000
I1.
00:14:51.000 --> 00:14:58.000
So we would have as the first implication of this type of theory that the I1 property
00:14:58.000 --> 00:15:06.000
from permanent income carries over to measured income, observable income YT.
00:15:06.000 --> 00:15:11.000
Now suppose that consumption is a function of permanent income.
00:15:11.000 --> 00:15:16.000
This is Friedman's hypothesis, and it has later been used by other authors like Hall
00:15:16.000 --> 00:15:19.000
in 1978.
00:15:19.000 --> 00:15:25.000
And possibly consumption is also a function of transitory income, which has been emphasized
00:15:25.000 --> 00:15:29.000
by authors like Campbell and Mankiw or Flavin.
00:15:29.000 --> 00:15:33.000
Then we would have a consumption function of the following form.
00:15:33.000 --> 00:15:43.000
Consumption is equal to what we have as permanent income, plus some share beta of the transitory
00:15:43.000 --> 00:15:44.000
income.
00:15:44.000 --> 00:15:50.000
Note that I use the symbol beta here, and it does not denote anymore the cointegrating
00:15:50.000 --> 00:15:54.000
vector, it's just a parameter.
00:15:54.000 --> 00:16:00.000
So consumption would be the income which we have as permanent income, and then we have
00:16:00.000 --> 00:16:05.000
the transitory income, and part of the transitory income we consume, and other parts
00:16:05.000 --> 00:16:13.000
parts we save, and thereby we save them for later consumption.
00:16:13.000 --> 00:16:18.000
Now when we subtract these two equations, so when we look at the difference between
00:16:18.000 --> 00:16:25.000
YT and CT, then obviously the difference is 1 minus beta times the transitory income.
00:16:25.000 --> 00:16:34.000
If we use these two equations, 5 and 6, and look now at the difference between YT and
00:16:34.000 --> 00:16:43.000
CT, both of which are observable quantities, then the permanent component would cancel.
00:16:43.000 --> 00:16:50.000
And here we would have 1 times transitory income minus beta times transitory income,
00:16:50.000 --> 00:16:57.000
so we would have the implication that YT minus CT is 1 minus beta times YTT.
00:16:57.000 --> 00:17:06.000
We should also note that consumption is of course an I1 process, because permanent income
00:17:06.000 --> 00:17:12.000
was assumed to be I1, and transitory income was assumed to be stationary.
00:17:12.000 --> 00:17:16.000
So we have the sum here of something which is I1, and something which is I0.
00:17:16.000 --> 00:17:21.000
We know that the sum of something which is I1, and something which is stationary, is
00:17:21.000 --> 00:17:28.000
again something which is I1, so CT would also be integrated, like YT.
00:17:28.000 --> 00:17:34.000
So we have two processes here, two observable processes with a stochastic trend, YT and
00:17:34.000 --> 00:17:36.000
CT.
00:17:36.000 --> 00:17:43.000
Now we find that the difference between these two I1 processes is stationary, because YT
00:17:43.000 --> 00:17:48.000
transitory is stationary, and 1 minus beta is just a scalar.
00:17:48.000 --> 00:17:56.000
So we would immediately see that this theory implies that the difference between YT and
00:17:56.000 --> 00:18:03.000
CT is stationary, and therefore income and consumption are cointegrated.
00:18:03.000 --> 00:18:11.000
So we have a cointegrating vector of 1 and negative 1, 1 for YT, negative 1 for CT.
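The algebra of these two equations can be replayed numerically (a sketch of my own with invented parameter values; beta here is the consumption parameter, not the cointegrating vector):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 400
beta = 0.3                                      # assumed share of transitory income consumed
y_perm = 100 + np.cumsum(rng.normal(size=T))    # permanent income: an I1 random walk
y_trans = rng.normal(size=T)                    # transitory income: stationary
y = y_perm + y_trans                            # measured income: I1
c = y_perm + beta * y_trans                     # consumption function: also I1

# the unobservable permanent component cancels in the difference:
assert np.allclose(y - c, (1 - beta) * y_trans)  # stationary, so (1, -1) cointegrates y and c
```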
00:18:11.000 --> 00:18:17.000
Often this type of theory is formulated in terms of the logs of variables.
00:18:17.000 --> 00:18:22.000
So if this is the log of Y and this is the log of consumption, then this would basically
00:18:22.000 --> 00:18:30.000
tell us that the log of the consumption-income ratio is stationary.
00:18:30.000 --> 00:18:34.000
So it's easy to test this.
00:18:34.000 --> 00:18:43.000
Essentially, the theory tells us that income and consumption would be two I1 series, which
00:18:43.000 --> 00:18:50.000
share the same stochastic trend, namely the stochastic trend, which is the unobservable
00:18:50.000 --> 00:18:54.000
hypothesized variable, permanent income.
00:18:54.000 --> 00:19:04.000
If we look at time series data, I use indexed series here of GDP and consumption, and there's
00:19:04.000 --> 00:19:10.000
some period where both indices are just set at the same value.
00:19:10.000 --> 00:19:15.000
I don't recall which one it was, but it may well be at the beginning of the sample that
00:19:15.000 --> 00:19:20.000
both variables were normalized to have the same value.
00:19:21.000 --> 00:19:30.000
Then we now see that consumption and income fluctuate together over time.
00:19:30.000 --> 00:19:34.000
So they are very close to each other over time.
00:19:34.000 --> 00:19:41.000
So it is quite easy to imagine that deviations, which we see, which are sometimes positive
00:19:41.000 --> 00:19:49.000
and sometimes negative, that deviations between consumption and GDP, or GDP and consumption,
00:19:49.000 --> 00:19:51.000
these deviations are stationary.
00:19:51.000 --> 00:19:58.000
This remains to be tested, but visually, it seems that both series have the same type
00:19:58.000 --> 00:20:01.000
of trending behavior.
00:20:01.000 --> 00:20:07.000
It is clearly not really a linear trend because there are certain time periods where a linear
00:20:07.000 --> 00:20:14.000
trend would have a rather steep slope, like, for instance, during this time here.
00:20:14.000 --> 00:20:21.000
And then here, at the end of the 1960s and beginning of the 1970s, the slope is much flatter.
00:20:21.000 --> 00:20:26.000
So a linear trend is probably not a very good model for this time series here, but it may
00:20:26.000 --> 00:20:33.000
well be a stochastic trend, and then it seems that consumption and GDP share the same stochastic
00:20:33.000 --> 00:20:34.000
trend.
00:20:34.000 --> 00:20:43.000
You may ask, why did I use index numbers here rather than the normal data on GDP and consumption?
00:20:43.000 --> 00:20:48.000
Well, just for the simple reason that, of course, GDP is always, if we take the original
00:20:48.000 --> 00:20:55.000
data, greater than consumption because GDP is equal to consumption plus investment plus
00:20:55.000 --> 00:20:58.000
government expenditure plus external balance.
00:20:58.000 --> 00:21:05.000
So very clearly, the two series would not be on the same level with deviations, which
00:21:05.000 --> 00:21:12.000
fluctuate around zero, if I used the original numbers and a more complex model is necessary
00:21:12.000 --> 00:21:18.000
to account for investment, government expenditure, and the external balance.
00:21:18.000 --> 00:21:23.000
But we see the principle already here when we just normalize the data in this way and
00:21:23.000 --> 00:21:30.000
thereby fit them to the very simple framework of Friedman's original article, where there
00:21:30.000 --> 00:21:35.000
is no investment, no government, no external sector, just there's income and there's consumption
00:21:35.000 --> 00:21:38.000
and there's some theory of how consumption is being determined.
00:21:38.000 --> 00:21:43.000
And then it would actually imply that the two series lie as close as they do in this
00:21:43.000 --> 00:21:47.000
graph when I use index numbers.
00:21:47.000 --> 00:21:49.000
Good.
00:21:49.000 --> 00:21:59.000
Any questions here? Then please raise your hand.
00:21:59.000 --> 00:22:00.000
Apparently not.
00:22:00.000 --> 00:22:07.000
So what we do next is that we discuss error correction models.
00:22:07.000 --> 00:22:13.000
And doing this is completely analogous to what we have already done with univariate
00:22:13.000 --> 00:22:21.000
series, in particular, when we prepared the Augmented Dickey-Fuller test, for which we will
00:22:21.000 --> 00:22:32.000
also develop some type of analogous test for cointegration later on.
00:22:32.000 --> 00:22:41.000
So we start with the analysis of a general VAR(p) process and later move to the question
00:22:41.000 --> 00:22:45.000
on how we can test for cointegration.
00:22:45.000 --> 00:22:59.000
That is, we test for the existence of a combination of weights such that an I1 process YT can be given an
00:22:59.000 --> 00:23:06.000
equilibrium interpretation, in the sense that beta prime YT is stationary.
00:23:06.000 --> 00:23:11.000
We start out, as I say, with a general VAR(p) process.
00:23:11.000 --> 00:23:19.000
So we assume that YT, as a k-dimensional vector, can be
00:23:19.000 --> 00:23:29.000
described by a vector autoregression with a matrix-valued lag polynomial
00:23:29.000 --> 00:23:38.000
in the lag operator, A of L, such that A of L times YT is just a constant, of course also a k-dimensional
00:23:38.000 --> 00:23:41.000
vector, plus white noise.
00:23:41.000 --> 00:23:50.000
And A of L, as you know, we can write somewhat more explicitly as the identity matrix minus a sum
00:23:50.000 --> 00:23:58.000
over P components of matrices AI times the lag operator to the power of I.
00:23:58.000 --> 00:24:05.000
So A of L is just a shorthand for this term in parentheses here.
00:24:05.000 --> 00:24:12.000
We can also equivalently write the same process as YT is equal to constant nu plus A1 YT minus
00:24:12.000 --> 00:24:21.000
1 plus A2 YT minus 2 plus ... plus AP YT minus P plus white noise UT.
00:24:21.000 --> 00:24:26.000
Let us assume now that this process here is not necessarily stationary.
00:24:26.000 --> 00:24:34.000
So far, in the analysis of stationary VAR processes, we have always assumed that all the roots
00:24:34.000 --> 00:24:42.000
of the determinant of A of Z equal to 0, so of this equation here, all the roots of this
00:24:42.000 --> 00:24:47.000
determinant equation are outside the unit circle.
00:24:47.000 --> 00:25:00.000
Now we will slightly, or perhaps greatly, modify this assumption and assume something weaker,
00:25:00.000 --> 00:25:06.000
in the sense that we say either all the roots are outside the unit circle,
00:25:06.000 --> 00:25:11.000
which would imply stationarity, or they are exactly equal to one.
00:25:11.000 --> 00:25:15.000
So we allow for unit roots in this process.
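This root condition can be checked mechanically in a small example (my own sketch, with a made-up coefficient matrix; for a VAR(1) the companion matrix is just A1, and the roots of det A(z) = 0 are the reciprocals of its eigenvalues):

```python
import numpy as np

# hypothetical bivariate VAR(1): the first component is a random walk
# (unit root), the second a stationary AR(1)
A1 = np.array([[1.0, 0.0],
               [0.0, 0.5]])

eigs = np.linalg.eigvals(A1)                  # inverse roots of det A(z) = 0
assert np.isclose(np.max(np.abs(eigs)), 1.0)  # one root exactly on the unit circle
assert np.all(np.abs(eigs) <= 1.0 + 1e-12)    # all other roots outside the unit circle
```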
00:25:15.000 --> 00:25:22.000
As I have pointed out in the last lecture, and as I find it important to repeat here,
00:25:22.000 --> 00:25:31.000
the vector process YT can be a process which is I1 in some components and I0, that is stationary,
00:25:31.000 --> 00:25:32.000
in other components.
00:25:32.000 --> 00:25:38.000
So it is completely possible to put in such a process here some components which are likely
00:25:38.000 --> 00:25:45.000
I1, like for instance GDP and consumption and investment, and other components which
00:25:45.000 --> 00:25:49.000
are likely not I1, like for instance the interest rate.
00:25:49.000 --> 00:25:52.000
The interest rate is typically a stationary process.
00:25:52.000 --> 00:25:59.000
There's no problem with combining both I1 series and I0 series in one vector YT.
00:25:59.000 --> 00:26:07.000
In this case, we would have some roots of the determinant equation equal to 1 for the
00:26:07.000 --> 00:26:12.000
nonstationary processes, for the integrated processes, consumption, GDP, investment and
00:26:12.000 --> 00:26:19.000
so forth, and some roots outside the unit circle for those components of YT, let's say
00:26:19.000 --> 00:26:24.000
the interest rate, which are stationary.
00:26:24.000 --> 00:26:30.000
The question which we would like to pose now is: how can we distinguish a cointegrated
00:26:30.000 --> 00:26:35.000
VAR from a VAR which is not cointegrated?
00:26:35.000 --> 00:26:43.000
So this here is just a general description of a VAR, but we do not know whether this VAR
00:26:43.000 --> 00:26:47.000
is co-integrated or is not co-integrated.
00:26:47.000 --> 00:26:53.000
We do not know if a linear combination of the YTs exists in such a way that the linear
00:26:53.000 --> 00:26:55.000
combination is stationary.
00:26:56.000 --> 00:27:02.000
YT is nonstationary, but a linear combination may exist which is
00:27:02.000 --> 00:27:06.000
stationary, and then the process would be cointegrated.
00:27:06.000 --> 00:27:13.000
So the key question which I address now is: under which circumstances, under which
00:27:13.000 --> 00:27:20.000
properties of the process, do we have a cointegrated VAR, and under which
00:27:20.000 --> 00:27:27.000
other circumstances is the VAR not cointegrated?
00:27:27.000 --> 00:27:32.000
Now what I do here involves a lot of formulas, but it is not as complicated as you may think
00:27:32.000 --> 00:27:37.000
it is, because it is actually exactly what we have already done in the one-dimensional
00:27:37.000 --> 00:27:47.000
case, just a rewriting of the VAR process by transforming it as much as possible into first
00:27:47.000 --> 00:27:50.000
differences.
00:27:50.000 --> 00:27:55.000
So recall that we did something like this already in the one-dimensional case.
00:27:55.000 --> 00:28:04.000
We have YT is equal to nu plus A1 YT minus 1 plus A2 YT minus 2 and so on, so no
00:28:04.000 --> 00:28:06.000
change up to here.
00:28:06.000 --> 00:28:18.000
And now I have made a change by writing minus AP delta YT minus P plus 1.
00:28:18.000 --> 00:28:26.000
And this delta YT minus P plus 1, if you write this in levels, is of course YT minus
00:28:26.000 --> 00:28:39.000
P plus 1 minus YT minus P. So minus YT minus P times minus AP gives plus AP times YT minus
00:28:39.000 --> 00:28:46.000
P. So this would give us exactly this term here, plus AP YT minus P. But what I have
00:28:46.000 --> 00:28:56.000
done here is that I have also subtracted AP times YT minus P plus 1.
00:28:56.000 --> 00:29:04.000
So since I have subtracted this here, I have to add it here, plus AP YT minus P plus 1.
00:29:04.000 --> 00:29:10.000
So nothing has changed except the way of representing the process.
00:29:10.000 --> 00:29:17.000
I have one first difference now here, and here the coefficient on the lagged level has
00:29:17.000 --> 00:29:23.000
changed to account for the transformation I have done here.
00:29:23.000 --> 00:29:29.000
So I have a new coefficient matrix here, and I can now do exactly the same operations for
00:29:29.000 --> 00:29:37.000
the YT minus P plus 1 and transform the YT minus P plus 1 into a delta YT minus P plus
00:29:37.000 --> 00:29:42.000
2 with this coefficient matrix here.
00:29:42.000 --> 00:29:54.000
By adding, of course, to the next YT minus P plus 2 here, this term times or these matrices
00:29:54.000 --> 00:30:01.000
times YT minus P plus 2, so that again, the coefficient matrix of the previous level has
00:30:01.000 --> 00:30:06.000
been changed, and I have another first difference here.
00:30:06.000 --> 00:30:11.000
So I have this here, the first difference, I have the first difference here, and I can
00:30:11.000 --> 00:30:16.000
move on like this to also get first differences everywhere.
00:30:16.000 --> 00:30:25.000
It turns out that then I have to sum all the AI matrices in front of, in this case here,
00:30:25.000 --> 00:30:28.000
the YT minus 2.
00:30:28.000 --> 00:30:35.000
Or if I also transform the YT minus 2 into first differences, which I have done here,
00:30:35.000 --> 00:30:41.000
I get here the sum over all of the coefficient matrices AI.
00:30:41.000 --> 00:30:45.000
So this here is an important line in my line of transformations.
00:30:45.000 --> 00:30:57.000
I have YT, the current level of the process, equal to a constant, plus last period's
00:30:57.000 --> 00:31:05.000
level, YT minus 1, multiplied by the sum of all the coefficient matrices.
00:31:05.000 --> 00:31:11.000
And then everything else which follows are just first differences, deltas, deltas, deltas,
00:31:11.000 --> 00:31:13.000
with certain coefficient matrices.
00:31:13.000 --> 00:31:19.000
We're not so interested in the exact value of these coefficient matrices here.
00:31:19.000 --> 00:31:26.000
I just called them A1 bar, right, or AP minus 1 bar here.
00:31:26.000 --> 00:31:33.000
So some new type of coefficient matrix, but I note that the sum of all the AI matrices
00:31:33.000 --> 00:31:42.000
here is nothing else but the identity matrix minus A of 1, times YT minus 1.
00:31:42.000 --> 00:31:49.000
A of 1, as you recall, is just the sum over all the coefficient matrices in the lag polynomial,
00:31:49.000 --> 00:31:57.000
including the identity matrix, which is why I have to add the identity matrix here.
00:31:57.000 --> 00:32:00.000
So I get this here.
00:32:00.000 --> 00:32:06.000
Now the interesting issue is we know something about the stationarity of the different components
00:32:06.000 --> 00:32:12.000
in the sum, because we know on the left-hand side we have something which is non-stationary.
00:32:12.000 --> 00:32:13.000
This is YT, right?
00:32:13.000 --> 00:32:17.000
We have an I1 process YT.
00:32:17.000 --> 00:32:20.000
So the left-hand side is non-stationary.
00:32:20.000 --> 00:32:23.000
On the right-hand side, we have lots of things which are stationary.
00:32:23.000 --> 00:32:31.000
UT is stationary, delta YT minus P plus 1 is stationary, all the delta YT minus something
00:32:31.000 --> 00:32:33.000
variables are stationary.
00:32:33.000 --> 00:32:36.000
So everything here is stationary.
00:32:36.000 --> 00:32:39.000
This is just a constant.
00:32:39.000 --> 00:32:43.000
But here is something which is non-stationary.
00:32:43.000 --> 00:32:45.000
And then again, things are very clear.
00:32:45.000 --> 00:32:51.000
If we add to something which is I1, lots of things which are stationary, then the result
00:32:51.000 --> 00:32:52.000
is I1.
00:32:52.000 --> 00:32:57.000
So YT on the left-hand side is I1.
00:32:57.000 --> 00:33:04.000
And this term here is I1, and therefore the equation is balanced, as it should be.
00:33:04.000 --> 00:33:10.000
Now what we can, however, do is that we also transform the left-hand side of this equation
00:33:10.000 --> 00:33:18.000
here to stationarity, because we have YT here, and we have identity matrix times YT minus
00:33:18.000 --> 00:33:20.000
1 here.
00:33:20.000 --> 00:33:29.000
So bring this term, YT minus 1 here, to the left-hand side of this equation.
00:33:29.000 --> 00:33:34.000
So subtract YT minus 1 from both sides of the equation; then we would get on the left-hand side of the
00:33:34.000 --> 00:33:39.000
equation, delta YT, this term here.
00:33:39.000 --> 00:33:45.000
And on the right-hand side of the equation, all that remains is negative A of 1, which
00:33:45.000 --> 00:33:49.000
is a matrix which in the literature is called pi.
00:33:49.000 --> 00:33:54.000
For whatever reason they chose pi as the symbol, rather than writing negative A of 1.
00:33:54.000 --> 00:33:57.000
But pi is negative A of 1.
00:33:57.000 --> 00:34:06.000
So we get a representation which tells us delta YT is equal to nu plus some matrix pi,
00:34:06.000 --> 00:34:14.000
and this is just a matrix of fixed coefficients, times YT minus 1, plus lots of delta YT minus
00:34:14.000 --> 00:34:20.000
i's, plus white noise.
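This algebra can be checked numerically. A minimal sketch (the bivariate VAR(2) and all matrix values here are my own illustration, not from the slides): with pi = -(I - A1 - A2) = -A(1) and the coefficient on delta y t-1 equal to -A2, the error-correction form reproduces the original VAR exactly.

```python
import numpy as np

# Illustrative bivariate VAR(2):  y_t = nu + A1 y_{t-1} + A2 y_{t-2} + u_t.
# Its error-correction rewriting: Delta y_t = nu + Pi y_{t-1} + G1 Delta y_{t-1} + u_t,
# with Pi = -(I - A1 - A2) = -A(1) and G1 = -A2.  (Matrices below are arbitrary.)
rng = np.random.default_rng(0)
k = 2
A1 = 0.3 * rng.normal(size=(k, k))
A2 = 0.3 * rng.normal(size=(k, k))
nu = rng.normal(size=k)

Pi = -(np.eye(k) - A1 - A2)   # Pi = -A(1)
G1 = -A2                      # coefficient matrix on Delta y_{t-1}

# both representations give the same y_t for arbitrary lag values and shock
y1, y2, u = rng.normal(size=k), rng.normal(size=k), rng.normal(size=k)
y_var = nu + A1 @ y1 + A2 @ y2 + u                  # VAR(2) in levels
y_vecm = y1 + nu + Pi @ y1 + G1 @ (y1 - y2) + u     # error-correction form
assert np.allclose(y_var, y_vecm)
```

The rewriting is pure algebra, so the two right-hand sides agree term by term, not just approximately.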
00:34:20.000 --> 00:34:28.000
Now can you tell me what is bizarre about this representation here, if you think of
00:34:28.000 --> 00:34:33.000
it in terms of integrated variables and stationary variables?
00:34:33.000 --> 00:34:38.000
Do you have any idea, or do you see any type of problem, or something which seems to be a problem in
00:34:38.000 --> 00:34:47.000
this representation?
00:34:47.000 --> 00:34:50.000
Think about it.
00:34:50.000 --> 00:34:55.000
Recall that just a couple of minutes ago I argued, well, on the left-hand side, we
00:34:55.000 --> 00:35:00.000
have something which is i1, and on the right-hand side, we have something which is i1, plus
00:35:00.000 --> 00:35:03.000
lots of things which are stationary.
00:35:03.000 --> 00:35:11.000
So i1 plus lots of i0 remains i1, so we have i1 on the left-hand side and i1 on the right-hand
00:35:11.000 --> 00:35:14.000
side, and everything seemed to be fine.
00:35:14.000 --> 00:35:22.000
Now I have just done a tiny little bit of a transformation by moving identity matrix
00:35:22.000 --> 00:35:30.000
times YT minus 1 over to the other side, well, defining pi as negative A of 1, and I have
00:35:30.000 --> 00:35:39.000
this representation here, delta YT is equal to constant, plus pi times YT minus 1, plus
00:35:39.000 --> 00:35:46.000
sum of lots of terms which are stationary, plus white noise, which is also stationary.
00:35:46.000 --> 00:35:51.000
So where's the problem in this equation, or what makes this equation interesting?
00:35:51.000 --> 00:36:05.000
Which is actually the reason why I've highlighted it in yellow, if you have any idea on that.
00:36:05.000 --> 00:36:10.000
If anybody would like to speak out, then you may need some time to type it in, but please
00:36:10.000 --> 00:36:16.000
raise your hand to show me that you are typing something.
00:36:16.000 --> 00:36:22.000
If you do not raise your hand, then I assume you do not want to comment on that.
00:36:22.000 --> 00:36:27.000
I don't see any such indication, well, here the problem is the following, on the left-hand
00:36:27.000 --> 00:36:33.000
side of the equation we have something which is stationary, and on the right-hand side of
00:36:33.000 --> 00:36:39.000
the equation we have something which is not stationary, which is i1.
00:36:39.000 --> 00:36:46.000
I told you last week, i1 plus something which is stationary remains i1.
00:36:46.000 --> 00:36:54.000
So this whole sum here is stationary because there are just the delta y's in there.
00:36:54.000 --> 00:37:05.000
White noise is stationary, constant is just a constant, so basically our knowledge from
00:37:05.000 --> 00:37:12.000
last lecture would seem to tell us that the right-hand side of this equation here is i1,
00:37:12.000 --> 00:37:19.000
because it inherits the i1 property from the yt minus 1 here, which is i1.
00:37:19.000 --> 00:37:24.000
And adding lots of terms which are stationary doesn't change anything about that.
00:37:24.000 --> 00:37:30.000
So it seems to be the case that the right-hand side of this equation is i1, but the left-hand
00:37:30.000 --> 00:37:35.000
side of this equation is stationary, is i0, i of 0.
00:37:35.000 --> 00:37:37.000
So how can this be?
00:37:37.000 --> 00:37:42.000
This is the problem which you should understand, which you should think about.
00:37:42.000 --> 00:37:48.000
Suddenly the equation does not seem to be balanced anymore, because we have a stationary
00:37:48.000 --> 00:37:54.000
variable on the left-hand side and we have something which is not stationary on the right-hand
00:37:54.000 --> 00:37:59.000
side, since here is the level of the yt minus 1.
00:37:59.000 --> 00:38:05.000
And that cannot be the case, because we started out with an equation which was perfectly correct,
00:38:05.000 --> 00:38:06.000
obviously.
00:38:06.000 --> 00:38:11.000
All these reformulations here are correct, and you can convince yourself at home that
00:38:11.000 --> 00:38:13.000
they are correct.
00:38:13.000 --> 00:38:17.000
Even here we still saw in the last line that there is not really a problem, because we
00:38:17.000 --> 00:38:25.000
have i1 on the right-hand side and we have i1 on the left-hand side, with yt being i1
00:38:25.000 --> 00:38:27.000
on the left-hand side.
00:38:27.000 --> 00:38:32.000
So this equation seemed all right, and now we did a very tiny little change, suddenly
00:38:32.000 --> 00:38:37.000
there seems to be a problem, because this thing is stationary and this thing remains
00:38:37.000 --> 00:38:41.000
as not stationary.
00:38:41.000 --> 00:38:49.000
Now there are a number of solutions to my question, how we can explain this effect or
00:38:49.000 --> 00:38:59.000
how we can resolve the seemingly contradictory property of equation 9, that we have something stationary
00:38:59.000 --> 00:39:02.000
here and something non-stationary there.
00:39:02.000 --> 00:39:06.000
One way to resolve this would of course be to say that perhaps pi is just equal to
00:39:06.000 --> 00:39:07.000
0.
00:39:07.000 --> 00:39:14.000
If the pi matrix is just equal to 0, then the i1 component here drops out and we would
00:39:14.000 --> 00:39:21.000
have delta yt is equal to nu plus the sum of lots of delta yt minus i's plus white noise,
00:39:21.000 --> 00:39:27.000
so we would have a VAR representation of order p minus 1 in first differences.
00:39:27.000 --> 00:39:29.000
And that's fine.
00:39:29.000 --> 00:39:36.000
So one hypothesis, perhaps you would just arrive at this type of hypothesis, is that
00:39:36.000 --> 00:39:44.000
it is necessarily the case that the pi matrix here is equal to 0.
00:39:44.000 --> 00:39:48.000
It is not quite the right hypothesis, as I will tell you, but it is one way forward
00:39:48.000 --> 00:39:51.000
and we should look at it.
00:39:51.000 --> 00:39:58.000
It is not really a convincing hypothesis because we started out with a very general VAR and
00:39:58.000 --> 00:40:05.000
actually the a1, a2, up to ap matrices could be any type of matrices.
00:40:05.000 --> 00:40:12.000
When we say that the pi matrix is equal to 0, then we would have a restriction that the
00:40:12.000 --> 00:40:16.000
a of 1 matrix is 0 everywhere.
00:40:16.000 --> 00:40:23.000
And that can actually not be the case in general if the a1, a2, up to ap matrices are unrestricted.
00:40:23.000 --> 00:40:31.000
Why should the identity matrix minus their sum, that is, A of 1,
00:40:31.000 --> 00:40:34.000
be equal to 0?
00:40:34.000 --> 00:40:36.000
There is no reason to expect that.
00:40:36.000 --> 00:40:45.000
Still this is part of the resolution of the puzzle, that pi may perhaps be just a 0 matrix.
00:40:45.000 --> 00:40:53.000
Now, the other thing you should note is that this equation 9, which I just discussed, is
00:40:53.000 --> 00:40:58.000
the multivariate analog of the augmented Dickey-Fuller test equation.
00:40:59.000 --> 00:41:05.000
In the augmented Dickey-Fuller test equation, we made exactly the same type of algebraic
00:41:05.000 --> 00:41:17.000
transformations, introducing first differences everywhere in the autoregressive specification
00:41:17.000 --> 00:41:24.000
of the process such that at the end we just had one lagged level with a coefficient which
00:41:24.000 --> 00:41:31.000
was called beta, which was, as we had shown, the sum of all the alpha coefficients, minus
00:41:31.000 --> 00:41:33.000
1.
00:41:33.000 --> 00:41:41.000
And then we tested the hypothesis that this beta here is equal to 0.
00:41:41.000 --> 00:41:51.000
So exactly the same thing we did now here with the pi matrix, so that the pi matrix
00:41:51.000 --> 00:41:58.000
is the analog to this parameter vector beta here.
00:41:58.000 --> 00:42:04.000
Remember please, or make clear to you, the beta here is a parameter.
00:42:04.000 --> 00:42:08.000
It has nothing to do with the cointegrating vector, which I also denoted by beta a couple
00:42:08.000 --> 00:42:10.000
of slides earlier.
00:42:10.000 --> 00:42:13.000
So this has nothing to do yet with the cointegrating vector.
00:42:13.000 --> 00:42:21.000
It's just a shorthand notation for the sum of all the alpha i's minus 1 in the ADF test
00:42:21.000 --> 00:42:22.000
equation.
00:42:22.000 --> 00:42:28.000
And what we have derived in equation 9 is now some multivariate analog of this ADF
00:42:28.000 --> 00:42:29.000
test equation.
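As a small numerical check of this univariate analogue (my own toy numbers, not the lecture's): for an AR(2) with alpha1 + alpha2 = 1, the ADF coefficient beta = alpha1 + alpha2 - 1 on the lagged level is exactly 0, and the rewriting is exact.

```python
# Univariate sketch: AR(2)  y_t = alpha1 y_{t-1} + alpha2 y_{t-2} + u_t,
# rewritten in ADF form  Delta y_t = beta y_{t-1} + c1 Delta y_{t-1} + u_t,
# with beta = alpha1 + alpha2 - 1 and c1 = -alpha2.
alpha1, alpha2 = 1.5, -0.5    # alpha1 + alpha2 = 1  ->  unit root
beta = alpha1 + alpha2 - 1    # coefficient on the lagged level
c1 = -alpha2

# the rewriting is exact for arbitrary lag values
y1, y2, u = 0.7, -0.2, 0.05   # y_{t-1}, y_{t-2}, shock
lhs = alpha1 * y1 + alpha2 * y2 + u - y1   # Delta y_t from the AR(2)
rhs = beta * y1 + c1 * (y1 - y2) + u       # ADF-form right-hand side
assert abs(lhs - rhs) < 1e-12
assert beta == 0.0            # unit root <=> beta = 0
```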
00:42:29.000 --> 00:42:37.000
If we think of the ADF test equation, then recall please that in the ADF test equation
00:42:37.000 --> 00:42:41.000
we tested if beta is equal to 0.
00:42:41.000 --> 00:42:50.000
So as I just said, the pi matrix being equal to 0 would resolve our difficulty in interpreting
00:42:50.000 --> 00:42:52.000
equation 9.
00:42:52.000 --> 00:43:01.000
So it seems to be the case that this has something to do with pi being equal to 0, but not quite.
00:43:01.000 --> 00:43:05.000
And I will show you where the difference is here in the multivariate setting.
00:43:05.000 --> 00:43:13.000
Actually, we will see that the ADF test equation or that the way of testing for unit root in
00:43:13.000 --> 00:43:19.000
the augmented Dickey-Fuller test is just a special case of such a pi matrix here.
00:43:19.000 --> 00:43:26.000
We have so far only learned what we need to know in this special case, but now we come
00:43:26.000 --> 00:43:28.000
to the more general case.
00:43:28.000 --> 00:43:35.000
Anyway, in the ADF test equation, we tested if beta is equal to 0 or if beta is different
00:43:35.000 --> 00:43:36.000
from 0.
00:43:36.000 --> 00:43:40.000
If it was equal to 0, then we had a unit root.
00:43:40.000 --> 00:43:48.000
And if beta was different from 0, we did have a stationary process in the univariate case.
00:43:48.000 --> 00:43:53.000
And we will change this perspective slightly.
00:43:53.000 --> 00:44:01.000
I will say, actually, we can also think of this beta here as a matrix, namely a matrix
00:44:01.000 --> 00:44:05.000
with one column and one row, so a trivial matrix one by one.
00:44:05.000 --> 00:44:12.000
And if we think of beta as a matrix, then testing that beta is equal to 0 is the same
00:44:12.000 --> 00:44:18.000
thing as testing that the rank of this matrix beta is equal to 0.
00:44:18.000 --> 00:44:19.000
That's obviously the same thing.
00:44:19.000 --> 00:44:26.000
A scalar 0 is a one by one matrix with a rank of 0.
00:44:26.000 --> 00:44:32.000
And testing for beta being different from 0 is actually the same thing as testing whether
00:44:32.000 --> 00:44:36.000
the rank of beta is equal to 1.
00:44:36.000 --> 00:44:41.000
And there are just these two possibilities in the univariate case because beta is just
00:44:41.000 --> 00:44:51.000
a scalar: taking the scalar as a matrix, either the rank of this trivial
00:44:51.000 --> 00:44:59.000
matrix is 0, or the rank of this trivial matrix is equal to 1.
00:44:59.000 --> 00:45:01.000
Only these two possibilities exist.
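numpy makes this reinterpretation literal; a trivial but perhaps clarifying check:

```python
import numpy as np

# beta viewed as a 1 x 1 matrix: beta = 0 has rank 0,
# and any non-zero beta has rank 1 -- the only two possibilities.
assert np.linalg.matrix_rank(np.array([[0.0]])) == 0   # unit-root case
assert np.linalg.matrix_rank(np.array([[-0.3]])) == 1  # stationary case
```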
00:45:01.000 --> 00:45:08.000
Now, it turns out that in the multivariate case, which we discuss here and which we have
00:45:09.000 --> 00:45:16.000
summarized in equation 9, we are also interested in the rank of this matrix pi.
00:45:16.000 --> 00:45:25.000
So we're not so much interested in the fact whether all the entries of the pi matrix are
00:45:25.000 --> 00:45:30.000
equal to 0, even though this is one relevant case because this would imply that the rank
00:45:30.000 --> 00:45:33.000
of pi is equal to 0.
00:45:33.000 --> 00:45:38.000
But in general, we are not so much interested in the coefficients of the pi matrix, but rather
00:45:38.000 --> 00:45:41.000
in the rank of the pi matrix.
00:45:41.000 --> 00:45:46.000
And we can argue that this was actually what we were interested in when we had the augmented
00:45:46.000 --> 00:45:51.000
Dickey-Fuller test because we were interested in whether the rank of beta was 0 or the rank
00:45:51.000 --> 00:45:54.000
of beta was 1.
00:45:54.000 --> 00:46:01.000
Now the easy case we have already mentioned, and I just repeat it here, if the rank of
00:46:01.000 --> 00:46:08.000
pi is equal to 0, well, then obviously pi is equal to 0.
00:46:08.000 --> 00:46:16.000
And if pi is equal to 0 in this equation here, then this term drops out completely.
00:46:16.000 --> 00:46:19.000
And we just have a VAR in first differences.
00:46:19.000 --> 00:46:27.000
Delta yt has a VAR representation of order p minus 1, right, with certain transformed coefficient
00:46:27.000 --> 00:46:28.000
matrices.
00:46:28.000 --> 00:46:36.000
Yeah, all the delta yt minus i's, this is white noise, this is a constant, this term is 0.
00:46:36.000 --> 00:46:43.000
So when the rank of pi is equal to 0, then apparently pi is equal to 0.
00:46:43.000 --> 00:46:56.000
So we would see that yt is equal to... oh yes, I have dissolved the first difference on
00:46:56.000 --> 00:47:02.000
the left-hand side of equation 9 now, which I should first show to you.
00:47:02.000 --> 00:47:05.000
I argued, well, this thing here is equal to 0.
00:47:05.000 --> 00:47:10.000
So we have a VAR in first differences here.
00:47:10.000 --> 00:47:17.000
What I do now is that I recognize again that delta yt is the same thing as yt minus yt
00:47:17.000 --> 00:47:19.000
minus 1.
00:47:19.000 --> 00:47:23.000
And the term minus yt minus 1, I can move over to the right-hand side of the equation
00:47:23.000 --> 00:47:35.000
so that I get yt, no delta anymore: yt is equal to nu plus yt minus 1 plus a sum of first
00:47:35.000 --> 00:47:40.000
differences of the yt minus i's plus white noise, right?
00:47:40.000 --> 00:47:46.000
So this is what I have done here, yt is equal to nu plus yt minus 1 plus lots of differences
00:47:46.000 --> 00:47:56.000
of the yt minus i's here with certain transformed coefficient matrices a1, a2, up to a p minus 1.
00:47:56.000 --> 00:48:02.000
So it would have a representation that yt is equal to nu plus yt minus 1 plus something
00:48:02.000 --> 00:48:04.000
which is stationary.
00:48:04.000 --> 00:48:10.000
We would have an i1 process here, we would have an i1 process here, plus something which
00:48:10.000 --> 00:48:17.000
is stationary remains an i1 process and the equation is balanced.
00:48:17.000 --> 00:48:24.000
This implies that yt has k stochastic trends, as we will see a little later, right?
00:48:24.000 --> 00:48:28.000
But basically the intuition is perhaps already there.
00:48:28.000 --> 00:48:35.000
We have k dimensions or k components of the yt vector here, and for each component it
00:48:35.000 --> 00:48:43.000
is true that yit is equal to nu plus yit minus 1 plus something stationary.
00:48:43.000 --> 00:48:50.000
So there's a coefficient of 1 here, so the variable, the component of the yt vector has
00:48:50.000 --> 00:48:51.000
a unit root.
00:48:51.000 --> 00:48:57.000
And since there are k components of the yt vector, obviously we have k stochastic trends
00:48:57.000 --> 00:49:00.000
here.
00:49:00.000 --> 00:49:10.000
The interesting question is now what is the case if the rank of pi is between 0 and k?
00:49:10.000 --> 00:49:17.000
I have included here also the case of the rank of pi being equal to 0, right?
00:49:17.000 --> 00:49:23.000
But I have a strict inequality here: the rank of pi is smaller
00:49:23.000 --> 00:49:27.000
than capital K.
00:49:27.000 --> 00:49:35.000
Now we know that delta yt on the left-hand side of equation 9 is stationary, hence the
00:49:35.000 --> 00:49:41.000
right-hand side of equation 9 must also be stationary, right?
00:49:41.000 --> 00:49:46.000
So we know on the right-hand side of equation 9 that this term here, which involves all
00:49:46.000 --> 00:49:52.000
the first differences of the yt's is indeed stationary and white noise is also stationary,
00:49:52.000 --> 00:49:55.000
so this is stationary.
00:49:55.000 --> 00:50:03.000
The implication which we now have is that pi times yt minus 1 must be stationary.
00:50:03.000 --> 00:50:10.000
And this implies that the rows of the pi matrix must be co-integrating vectors, right?
00:50:10.000 --> 00:50:12.000
This is what we aim at.
00:50:12.000 --> 00:50:19.000
This pi matrix which we have established here seems to contain co-integrating vectors.
00:50:19.000 --> 00:50:28.000
Each row of the pi matrix multiplied by the yt minus 1 column vector here gives a linear
00:50:28.000 --> 00:50:34.000
combination of the yt's, of the yt minus 1's, right?
00:50:34.000 --> 00:50:42.000
Each row being multiplied by a column yt minus 1 here gives a linear combination and apparently
00:50:42.000 --> 00:50:47.000
this linear combination must be stationary because otherwise equation 9 would not be
00:50:47.000 --> 00:50:49.000
balanced.
00:50:49.000 --> 00:50:56.000
My question was how can it be that we have an I1 variable here while on the left-hand
00:50:56.000 --> 00:50:58.000
side there's something stationary?
00:50:58.000 --> 00:51:06.000
Well this can be the case if the rows of the pi matrix are the co-integrating vectors
00:51:06.000 --> 00:51:09.000
for yt minus 1, right?
00:51:09.000 --> 00:51:13.000
Because then each row being multiplied by the column vector here gives us a stationary
00:51:13.000 --> 00:51:15.000
component.
00:51:15.000 --> 00:51:24.000
So the answer to my previous question was: while it is true that yt minus 1 is I1, is
00:51:24.000 --> 00:51:34.000
non-stationary, it is not necessarily true that pi times yt minus 1 is non-stationary.
00:51:34.000 --> 00:51:38.000
Rather it can be that pi times yt minus 1 is stationary.
00:51:38.000 --> 00:51:44.000
And this is in particular the case if the row of the pi matrix multiplied by the yt
00:51:44.000 --> 00:51:53.000
minus 1 here gives us a co-integrated combination of the components of the y vector.
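A toy simulation of exactly this balancing act (my own example, not from the lecture): two I(1) series driven by one common random walk, where the combination 2*y1 - y2 is stationary even though each level is not.

```python
import numpy as np

# Two I(1) series sharing a single stochastic trend:
#   y1_t = w_t (random walk),  y2_t = 2 w_t + stationary noise,
# so (2, -1) is a cointegrating vector: 2 y1 - y2 is stationary.
rng = np.random.default_rng(3)
T = 2000
w = np.cumsum(rng.normal(size=T))   # common random walk (the stochastic trend)
y1 = w
y2 = 2 * w + rng.normal(size=T)
combo = 2 * y1 - y2                 # cointegrating combination = minus the noise

# the levels wander (large sample variance), the combination does not
assert np.var(y1) > 20 * np.var(combo)
assert np.var(combo) < 5
```

A row of pi proportional to (2, -1) would therefore map the I(1) vector into something stationary, which is how equation 9 stays balanced.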
00:51:53.000 --> 00:51:59.000
Clearly if the pi matrix were zero, then we would have zero co-integrating vectors, right?
00:51:59.000 --> 00:52:04.000
So if the rank of pi is equal to zero, then we have pi is equal to zero.
00:52:04.000 --> 00:52:08.000
So then there's just no co-integrating vector.
00:52:08.000 --> 00:52:14.000
Well if there's no co-integrating vector, then obviously we have k independent stochastic
00:52:14.000 --> 00:52:18.000
trends in the vector yt.
00:52:18.000 --> 00:52:25.000
It is then not possible that a single component of the yt vector is stationary.
00:52:25.000 --> 00:52:30.000
Because if it were, then it would be easy to have a trivial co-integrating vector.
00:52:30.000 --> 00:52:37.000
Let's suppose that the i-th component of the yt vector is stationary and all the other
00:52:37.000 --> 00:52:44.000
components of the yt vector are non-stationary, then any vector would be a co-integrating
00:52:44.000 --> 00:52:50.000
vector which has zeros everywhere except in the i-th component.
00:52:50.000 --> 00:52:55.000
Because multiplying such a vector by yt would cancel out all the stochastic trends.
00:52:55.000 --> 00:52:57.000
They would be multiplied by zero.
00:52:57.000 --> 00:53:02.000
But in the i-th component there's something which is non-zero and therefore the i-th component
00:53:02.000 --> 00:53:11.000
would pick up a stationary time series and we would have a linear combination of the
00:53:11.000 --> 00:53:16.000
yt components which is stationary.
00:53:16.000 --> 00:53:21.000
But if pi is equal to zero, then apparently there is no co-integrating vector.
00:53:21.000 --> 00:53:26.000
So it must be that there is no stationary variable in the yt vector and that there are
00:53:26.000 --> 00:53:31.000
k independent stochastic trends.
00:53:31.000 --> 00:53:34.000
On the other hand, suppose there are co-integrating vectors.
00:53:34.000 --> 00:53:40.000
If the variables are such that it is possible to find co-integrating combinations, then
00:53:40.000 --> 00:53:45.000
the rank of pi will be greater than zero.
00:53:45.000 --> 00:53:54.000
Which means that the rows of pi are not linearly independent since pi is singular.
00:53:54.000 --> 00:53:57.000
I haven't emphasized that but I should repeat it perhaps here.
00:53:57.000 --> 00:54:02.000
I have assumed the rank of pi is less than k.
00:54:02.000 --> 00:54:05.000
Pi is a matrix k by k.
00:54:05.000 --> 00:54:10.000
So it is non-singular if it has rank k.
00:54:10.000 --> 00:54:15.000
But here I consider the cases where the rank of the pi matrix is smaller than k.
00:54:15.000 --> 00:54:24.000
So pi is necessarily a singular matrix.
00:54:24.000 --> 00:54:29.000
We know that if the rank of pi is greater than zero, then apparently the pi matrix is
00:54:29.000 --> 00:54:31.000
not a zero matrix.
00:54:31.000 --> 00:54:37.000
So there is at least one row in the pi matrix which has non-zero elements.
00:54:37.000 --> 00:54:45.000
Which means that this row defines a linear combination of the yt vector which is stationary.
00:54:45.000 --> 00:54:50.000
So this row of the pi matrix gives us a co-integrating vector.
00:54:51.000 --> 00:54:59.000
There may actually be more than just one such row because we know some rows of pi must be
00:54:59.000 --> 00:55:02.000
linearly dependent on each other.
00:55:02.000 --> 00:55:04.000
This can be the zero rows.
00:55:04.000 --> 00:55:10.000
But suppose that there is no zero row, then we would know that at least two rows of the
00:55:10.000 --> 00:55:14.000
pi matrix are linearly dependent on each other.
00:55:14.000 --> 00:55:18.000
Or one row is linearly dependent on all the others.
00:55:18.000 --> 00:55:24.000
Since the matrix does not have full rank.
00:55:24.000 --> 00:55:30.000
Now there is a result from linear algebra which you may know, which you may recall.
00:55:30.000 --> 00:55:37.000
This result says that if we have a singular k by k matrix called pi, like we have done
00:55:37.000 --> 00:55:47.000
here, with the property that the rank of pi is equal to some r greater than zero.
00:55:47.000 --> 00:55:52.000
So r is something which is smaller than k because the matrix is singular.
00:55:52.000 --> 00:56:00.000
So the rank of the matrix pi is not k but is some r which is positive but smaller than
00:56:00.000 --> 00:56:01.000
k.
00:56:01.000 --> 00:56:08.000
Then we know that such a matrix pi can be decomposed as the product of two matrices which
00:56:08.000 --> 00:56:11.000
are called alpha and beta.
00:56:11.000 --> 00:56:15.000
So as to the notation here, the literature always uses alpha and beta in this form.
00:56:15.000 --> 00:56:22.000
So I use it here also in the presentation of the lecture, but it is a little confusing
00:56:22.000 --> 00:56:28.000
because we would usually think that alpha and beta are either scalars or vectors.
00:56:28.000 --> 00:56:33.000
But in fact, they are meant to be matrices.
00:56:33.000 --> 00:56:38.000
Even though they are written with small letters, they are matrices.
00:56:38.000 --> 00:56:43.000
Namely both of them are k by r matrices.
00:56:43.000 --> 00:56:47.000
And these matrices both have full rank.
00:56:47.000 --> 00:56:52.000
Full rank means that they have rank r.
00:56:52.000 --> 00:56:58.000
Since r is smaller than k, the maximum rank that a matrix alpha or beta can have is rank
00:56:58.000 --> 00:57:00.000
r.
00:57:00.000 --> 00:57:05.000
So alpha has rank r and beta has rank r.
00:57:05.000 --> 00:57:07.000
Such matrices exist.
00:57:07.000 --> 00:57:10.000
We know this from linear algebra.
00:57:10.000 --> 00:57:15.000
And they have the property that pi can be written as alpha times beta prime.
00:57:15.000 --> 00:57:19.000
Recall, alpha is k by r.
00:57:19.000 --> 00:57:22.000
So beta prime is r by k.
00:57:22.000 --> 00:57:26.000
So alpha times beta prime is k by k.
00:57:26.000 --> 00:57:31.000
And that's the format of the pi matrix which was also k by k.
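A numerical sketch of this decomposition (dimensions and values are my own): any k by k matrix of rank r factors as a k by r matrix times an r by k matrix, and one such factorization can be read off the SVD.

```python
import numpy as np

# Build a singular k x k matrix Pi of rank r < k as alpha @ beta.T,
# with alpha and beta both k x r and of full column rank r.
rng = np.random.default_rng(1)
k, r = 3, 1
alpha = rng.normal(size=(k, r))
beta = rng.normal(size=(k, r))
Pi = alpha @ beta.T
assert np.linalg.matrix_rank(Pi) == r

# recover *a* rank factorization of Pi from its SVD (it is not unique)
U, s, Vt = np.linalg.svd(Pi)
alpha2 = U[:, :r] * s[:r]   # k x r
beta2 = Vt[:r, :].T         # k x r
assert np.allclose(alpha2 @ beta2.T, Pi)
```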
00:57:31.000 --> 00:57:34.000
Okay.
00:57:34.000 --> 00:57:42.000
So we know, and we will make use of that extensively, that the pi matrix can be decomposed
00:57:42.000 --> 00:57:50.000
into two matrices which are of lesser rank, namely just of rank r, as alpha times beta
00:57:50.000 --> 00:57:51.000
prime.
00:57:51.000 --> 00:57:58.000
And actually, the matrices alpha and beta have fewer columns than the pi matrix.
00:57:58.000 --> 00:58:01.000
Sorry, not the k matrix, the pi matrix.
00:58:01.000 --> 00:58:05.000
The pi matrix has k columns.
00:58:05.000 --> 00:58:10.000
Alpha and beta have only r columns.
00:58:10.000 --> 00:58:16.000
They have k rows, like pi, but they have only r columns, right?
00:58:16.000 --> 00:58:22.000
And these are the linearly independent columns because the rank of alpha and beta is equal
00:58:22.000 --> 00:58:24.000
to r.
00:58:24.000 --> 00:58:29.000
So this means that we can rewrite equation nine.
00:58:29.000 --> 00:58:38.000
Now as delta yt is equal to nu plus alpha beta prime times yt minus one plus something
00:58:38.000 --> 00:58:39.000
which is stationary.
00:58:39.000 --> 00:58:41.000
This is unchanged.
00:58:41.000 --> 00:58:49.000
I've just replaced the pi matrix by alpha times beta prime.
00:58:49.000 --> 00:58:57.000
And the beta prime is now the matrix which defines the stationary linear combinations.
00:58:58.000 --> 00:59:03.000
The beta prime matrix defines the co-integrating vectors.
00:59:03.000 --> 00:59:11.000
And it defines r linearly independent co-integrating vectors because we know that the beta matrix
00:59:11.000 --> 00:59:14.000
has rank r.
00:59:14.000 --> 00:59:25.000
So all the columns in the beta matrix are linearly independent of each other.
00:59:25.000 --> 00:59:30.000
Which means, if all the columns in the beta matrix are linearly independent of each other,
00:59:30.000 --> 00:59:36.000
that all the rows in the beta prime matrix are linearly independent, not just independent
00:59:36.000 --> 00:59:39.000
of each other, but linearly independent.
00:59:39.000 --> 00:59:47.000
So all the rows in beta prime are linearly independent, which means that the beta prime
00:59:48.000 --> 00:59:57.000
matrix, the rows of the beta prime matrix constitute the co-integrating vectors of this
00:59:57.000 --> 01:00:01.000
process yt and all of them.
01:00:01.000 --> 01:00:08.000
All of these linear combinations which are just possible are represented by the beta
01:00:08.000 --> 01:00:11.000
prime matrix here.
01:00:11.000 --> 01:00:16.000
We will a little later discuss what the alpha matrix, what kind of significance it has.
01:00:16.000 --> 01:00:19.000
But the important matrix is actually the beta prime matrix here.
01:00:19.000 --> 01:00:26.000
The beta prime matrix is the matrix of co-integrating combinations, the matrix of co-integrating
01:00:26.000 --> 01:00:28.000
vectors.
01:00:28.000 --> 01:00:39.000
So, summing up, what we have shown is that if we have any k-dimensional VAR of order p with
01:00:39.000 --> 01:00:46.000
a property that the roots of the determinantal equations are either outside the unit circle
01:00:46.000 --> 01:00:49.000
or are exactly equal to 1.
01:00:49.000 --> 01:00:54.000
If this is the case, if we have any such process, any such k-dimensional VAR p process,
01:00:54.000 --> 01:01:03.000
then we know that if the rank of A of 1 is equal to 0, then there is no co-integration.
01:01:03.000 --> 01:01:10.000
There exist k independent stochastic trends, but we do not have co-integration.
01:01:10.000 --> 01:01:16.000
In this case, the first difference of this process, namely delta yt, has a var p minus
01:01:16.000 --> 01:01:19.000
1 representation.
01:01:19.000 --> 01:01:21.000
This was a VAR of order p in the levels.
01:01:21.000 --> 01:01:26.000
For the first differences, the VAR representation is of order p minus 1.
01:01:26.000 --> 01:01:29.000
This is the case when there is no co-integration.
01:01:29.000 --> 01:01:38.000
We have the rank of the pi matrix being equal to 0.
01:01:38.000 --> 01:01:40.000
Do not look for the negative sign here.
01:01:40.000 --> 01:01:43.000
I have just omitted the negative sign.
01:01:43.000 --> 01:01:48.000
Pi was negative A of 1, but negative A of 1 has the same rank as A of 1.
01:01:48.000 --> 01:01:55.000
So I can just drop the minus sign here.
01:01:55.000 --> 01:02:04.000
This case is that the rank of the A of 1 matrix is greater than 0, but less than k.
01:02:04.000 --> 01:02:14.000
So there is some rank R. The rank of the A of 1 matrix is R. For this rank R, we know
01:02:14.000 --> 01:02:20.000
that it is greater than 0, and it is smaller than k.
01:02:20.000 --> 01:02:26.000
In this case, we have, as we have shown, R co-integrating vectors.
01:02:26.000 --> 01:02:31.000
If we have R co-integrating vectors in a k-dimensional system, then apparently there exist k minus
01:02:31.000 --> 01:02:34.000
R independent stochastic trends.
01:02:34.000 --> 01:02:41.000
So we have R linearly independent co-integrating vectors and k minus R independent stochastic
01:02:41.000 --> 01:02:42.000
trends.
01:02:42.000 --> 01:02:45.000
That's a general feature of these type of models.
01:02:45.000 --> 01:02:50.000
The number of co-integrating vectors plus the number of independent stochastic trends
01:02:50.000 --> 01:02:56.000
always adds up to the dimension of the system, so it adds up to k.
01:02:56.000 --> 01:03:03.000
This means that the delta yt's do not have a VAR representation.
01:03:03.000 --> 01:03:07.000
Look again at perhaps equation 9.
01:03:07.000 --> 01:03:16.000
If the pi matrix is non-zero, then delta yt is equal to something which looks like a VAR
01:03:16.000 --> 01:03:20.000
representation plus something in levels.
01:03:20.000 --> 01:03:26.000
In a VAR representation in the first differences, there's no space for the levels.
01:03:26.000 --> 01:03:33.000
So when we know that the pi matrix here is different from 0, then we know that no VAR
01:03:33.000 --> 01:03:36.000
representation for the first differences exists.
01:03:36.000 --> 01:03:44.000
If you try to estimate a VAR in first differences on a co-integrated system, then this result
01:03:44.000 --> 01:03:51.000
would tell us that your estimation is badly misspecified.
01:03:51.000 --> 01:03:59.000
You cannot estimate a VAR in first differences on a set of variables which is co-integrated.
01:03:59.000 --> 01:04:03.000
You need to include the level information.
01:04:03.000 --> 01:04:08.000
The first differences do not contain level information, and therefore there's no way
01:04:08.000 --> 01:04:16.000
how a VAR in first differences can describe a co-integrated VAR.
01:04:16.000 --> 01:04:20.000
But a co-integrated VAR is what we should actually expect to have if we make economic
01:04:20.000 --> 01:04:26.000
analysis because we know that if there is an economic equilibrium, a long-run equilibrium,
01:04:26.000 --> 01:04:30.000
then this should show up as co-integration.
01:04:30.000 --> 01:04:38.000
So it should be rather the rule than the exception that there is no way to specify
01:04:38.000 --> 01:04:42.000
a VAR in the first differences.
01:04:42.000 --> 01:04:44.000
You can always specify it in the levels.
01:04:44.000 --> 01:04:46.000
That's fine.
01:04:46.000 --> 01:04:52.000
Estimate a VAR in the levels without any kind of restrictions, and that's fine.
01:04:52.000 --> 01:04:59.000
But a VAR in first differences does not exist for a co-integrated VAR.
01:04:59.000 --> 01:05:04.000
All right, where am I going?
01:05:04.000 --> 01:05:07.000
Wrong way.
01:05:07.000 --> 01:05:17.000
Now, the remaining possibility is that the rank of A of 1 is full, so is equal to K.
01:05:17.000 --> 01:05:18.000
What does this mean?
01:05:18.000 --> 01:05:24.000
This means that there is no unit root, and there is no stochastic trend.
01:05:24.000 --> 01:05:30.000
So YT has a stationary VAR representation.
01:05:30.000 --> 01:05:38.000
This is important to know because we will also, in some circumstances, test for a full
01:05:38.000 --> 01:05:40.000
rank of A of 1.
01:05:40.000 --> 01:05:46.000
This is the last step of the Johansen test.
01:05:46.000 --> 01:05:50.000
So this possibility here is also given.
01:05:50.000 --> 01:05:59.000
In this case, we would not have any type of co-integration because there is no stochastic
01:05:59.000 --> 01:06:01.000
trend in the data.
01:06:01.000 --> 01:06:11.000
Therefore, YT has a stationary VAR(p) representation, which again implies that we should not estimate
01:06:11.000 --> 01:06:15.000
a VAR in first differences because this would be over-differenced.
01:06:15.000 --> 01:06:20.000
We would actually then generate a unit root in the MA term, in the error term of this
01:06:20.000 --> 01:06:21.000
structure.
01:06:21.000 --> 01:06:27.000
If the variables are stationary, we do not need differencing anymore.
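The over-differencing problem can be seen numerically; here is a minimal sketch with made-up white noise, not the lecture's data: differencing an already stationary series produces a non-invertible MA(1) whose first-order autocorrelation is minus one half.

```python
import numpy as np

rng = np.random.default_rng(2)
e = rng.normal(size=100_000)   # stationary series (white noise)

# Over-differencing: Delta e_t = e_t - e_{t-1} is an MA(1) with a
# unit root in the MA polynomial; its lag-1 autocorrelation is -0.5.
d = np.diff(e)
rho1 = np.corrcoef(d[1:], d[:-1])[0, 1]
assert abs(rho1 + 0.5) < 0.02
```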
01:06:27.000 --> 01:06:34.000
Now, one complication is the fact that the decomposition of the pi matrix into alpha
01:06:34.000 --> 01:06:41.000
and beta prime is not unique because we can just pick any matrix, any non-singular matrix
01:06:41.000 --> 01:06:49.000
H, so an r by r matrix H which is non-singular, of which there are many, many, many.
01:06:49.000 --> 01:06:54.000
And then say, well, pi is equal to alpha, beta prime, but then apparently this is equal
01:06:54.000 --> 01:07:00.000
to alpha times H inverse times H times beta prime.
01:07:00.000 --> 01:07:13.000
And we define alpha times H inverse as alpha tilde and H times beta prime as beta tilde prime.
01:07:13.000 --> 01:07:21.000
And then we see that this is another decomposition of the pi matrix of the type alpha times beta
01:07:21.000 --> 01:07:26.000
prime because H inverse times H just cancels, right?
01:07:26.000 --> 01:07:36.000
So we can always split up the alpha and beta and insert a matrix
01:07:36.000 --> 01:07:42.000
H and its inverse in between them to derive new alpha and beta matrices,
01:07:42.000 --> 01:07:47.000
which serve exactly the same purpose.
01:07:47.000 --> 01:07:55.000
It is also clear that alpha tilde and beta tilde also have rank R because the H matrix
01:07:55.000 --> 01:07:57.000
is assumed to be non-singular.
01:07:57.000 --> 01:08:03.000
So we can just invent any H matrix, which is non-singular and of type R by R, and then
01:08:03.000 --> 01:08:12.000
we can transform the alpha beta matrices into alpha tilde, beta tilde prime matrices, which
01:08:12.000 --> 01:08:19.000
means that the beta matrix here, whose columns are the co-integrating vectors or the rows
01:08:19.000 --> 01:08:27.000
of the beta prime matrix, the co-integrating vectors, it means that this beta matrix here
01:08:27.000 --> 01:08:33.000
is just a basis for the space of all co-integrating vectors.
01:08:33.000 --> 01:08:40.000
So we can actually derive any linear combination of the columns of beta, or of the rows of
01:08:40.000 --> 01:08:49.000
the beta prime matrix here as different co-integrating vectors, but these would be co-integrating
01:08:49.000 --> 01:08:59.000
vectors which are included in this space spanned by the rows of the beta prime matrices.
01:08:59.000 --> 01:09:03.000
Or to put it differently and perhaps a little easier, any linear combination of the columns
01:09:03.000 --> 01:09:11.000
of beta is also a co-integrating vector of Y.
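The non-uniqueness of the decomposition can be illustrated numerically; a minimal sketch with made-up alpha, beta, and H matrices (all numbers are illustrative, not from the lecture):

```python
import numpy as np

# Illustrative k=3, r=2 system: Pi = alpha @ beta' has rank r.
alpha = np.array([[-0.5,  0.1],
                  [ 0.2, -0.3],
                  [ 0.0,  0.4]])        # k x r loading matrix
beta = np.array([[ 1.0,  0.0],
                 [-1.0,  1.0],
                 [ 0.0, -1.5]])         # k x r cointegrating vectors
Pi = alpha @ beta.T

# Any non-singular r x r matrix H gives another valid decomposition:
H = np.array([[2.0, 1.0],
              [0.0, 3.0]])
alpha_t = alpha @ np.linalg.inv(H)      # alpha tilde = alpha H^{-1}
beta_t = beta @ H.T                     # beta tilde' = H beta'

# Same Pi, same rank r -- the factors are only identified up to H.
assert np.allclose(alpha_t @ beta_t.T, Pi)
assert np.linalg.matrix_rank(Pi) == 2
```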
01:09:11.000 --> 01:09:17.000
I know this is not so easy, in particular if you hear this for the first time, are there
01:09:17.000 --> 01:09:18.000
any questions?
01:09:18.000 --> 01:09:19.000
Then please interrupt me.
01:09:19.000 --> 01:09:24.000
Raise your hand if you have questions.
01:09:24.000 --> 01:09:32.000
Apparently no questions.
01:09:32.000 --> 01:09:40.000
So in general, we cannot interpret a column of beta as a specific economic long-run equilibrium,
01:09:40.000 --> 01:09:48.000
but rather just as a basis vector of the space of co-integrating coefficients.
01:09:48.000 --> 01:09:55.000
So we cannot expect that a vector beta, which we have estimated, I will later show you how
01:09:55.000 --> 01:10:03.000
to estimate these beta vectors, that this vector beta corresponds to a long-run equilibrium
01:10:03.000 --> 01:10:07.000
relationship which we have found in an economic model.
01:10:07.000 --> 01:10:13.000
Rather what we can expect is, or what should be the case, is that if we have a long-run
01:10:13.000 --> 01:10:18.000
equilibrium relationship between certain variables in an economic model, and this economic model
01:10:18.000 --> 01:10:24.000
is true, then it should be the case that this long-run equilibrium relationship from the
01:10:24.000 --> 01:10:30.000
economic model can be represented as a linear combination of the estimated
01:10:30.000 --> 01:10:35.000
co-integrating vectors from any such beta matrix.
01:10:35.000 --> 01:10:43.000
So it should be a part of the space spanned by the elements of the beta prime matrix or
01:10:43.000 --> 01:10:47.000
the rows of the beta prime.
01:10:47.000 --> 01:10:58.000
Certainly one can use this indeterminacy of the beta matrix for purposes of normalization.
01:10:58.000 --> 01:11:03.000
And JMulTi, for instance, does this in such a way that it always reports the beta prime
01:11:03.000 --> 01:11:10.000
matrices in such a form, for instance, for k equal to 5 and 3 co-integrating vectors.
01:11:10.000 --> 01:11:19.000
The beta prime matrix will be reported with an identity matrix first, an identity matrix
01:11:19.000 --> 01:11:28.000
of three dimensions, if r is equal to 3.
01:11:28.000 --> 01:11:34.000
And then two more columns, because we need two more columns since k is equal to 5.
01:11:34.000 --> 01:11:45.000
So the matrix which is used here with its three columns needs two more columns to be
01:11:45.000 --> 01:11:52.000
of type r by k, in this case, the beta prime here is of type r by k.
01:11:52.000 --> 01:12:00.000
This is one vector in this r block.
01:12:00.000 --> 01:12:04.000
This is another vector in this r block, and this is another vector in this r block.
01:12:04.000 --> 01:12:10.000
And then we have two vectors which are unrestricted where any value for the beta parameters may
01:12:10.000 --> 01:12:15.000
come up.
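This normalization convention can be sketched in a few lines; the beta prime values below are made up for illustration, not JMulTi output:

```python
import numpy as np

# Hypothetical estimated beta' with r=3 and k=5 (values are made up).
beta_prime = np.array([[ 1.2, 0.3, -0.1,  0.5, -2.0],
                       [ 0.4, 1.1,  0.2, -0.3,  1.0],
                       [-0.2, 0.5,  1.3,  0.7,  0.6]])

# Normalize so the first r columns form an identity matrix, as in the
# reported layout: premultiply by the inverse of the leading r x r
# block (this plays the role of the matrix H from the previous slide).
H = np.linalg.inv(beta_prime[:, :3])
beta_norm = H @ beta_prime

# The top-left r x r block is now the identity; the last k-r columns
# are unrestricted coefficients.
assert np.allclose(beta_norm[:, :3], np.eye(3))
```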
01:12:15.000 --> 01:12:23.000
If we want to identify an estimated co-integrating vector in a certain economic equilibrium sense,
01:12:23.000 --> 01:12:25.000
then we need additional identifying restrictions.
01:12:25.000 --> 01:12:29.000
And I cannot cover in this lecture what can be done here.
01:12:29.000 --> 01:12:35.000
Actually, it's not even so important to specify out the identifying restrictions, because it's
01:12:35.000 --> 01:12:41.000
completely sufficient to have an economic model and compute what the co-integrating vector
01:12:41.000 --> 01:12:51.000
would be on paper in theory, and then just check whether this vector is in the space
01:12:51.000 --> 01:13:04.000
spanned by the base vectors of the space given in beta prime.
01:13:04.000 --> 01:13:10.000
Now if we can interpret some beta j as a long run equilibrium relationship in the way we
01:13:10.000 --> 01:13:19.000
would have written it down in the model, then beta j prime times y t minus 1 is a stationary
01:13:19.000 --> 01:13:23.000
deviation from equilibrium in periods t minus 1.
01:13:23.000 --> 01:13:31.000
I have written this here for some beta j vector, so one taken from the r co-integrating vectors
01:13:31.000 --> 01:13:35.000
which we have.
01:13:35.000 --> 01:13:42.000
Suppose that this r by 1 vector beta prime y t minus 1 contains the r deviations from
01:13:42.000 --> 01:13:50.000
the r long run equilibrium relationships.
01:13:50.000 --> 01:13:55.000
Then the alpha matrix of which we haven't talked that much so far, which is of format
01:13:55.000 --> 01:14:03.000
k by r, then this alpha matrix describes how the r deviations from equilibrium affect the
01:14:03.000 --> 01:14:08.000
k co-integrated dependent variables delta y t.
01:14:08.000 --> 01:14:16.000
Alpha is a matrix k by r, so for each component in the y vector, which are k components, alpha
01:14:16.000 --> 01:14:23.000
describes with what kind of importance, with what kind of weight, the deviations from equilibrium
01:14:23.000 --> 01:14:35.000
in the previous period affect the dependent variable for which alpha is the loading coefficient.
01:14:35.000 --> 01:14:41.000
This is why the alpha matrix is called the matrix of loading coefficients because it
01:14:41.000 --> 01:14:48.000
describes the sensitivity of current variables with respect to deviations from equilibrium
01:14:48.000 --> 01:14:56.000
one period earlier.
01:14:56.000 --> 01:15:04.000
Usually there is a tendency to correct deviations from equilibrium, so in a good economic model
01:15:04.000 --> 01:15:11.000
or a good econometric study you would typically find that the loading coefficients try to
01:15:11.000 --> 01:15:19.000
return the system to equilibrium, which means that if we have a variable y k t with a positive
01:15:19.000 --> 01:15:26.000
coefficient in the co-integrating vector beta j, then the loading coefficient alpha k j
01:15:26.000 --> 01:15:34.000
should be negative, but it depends, as I say, on the coefficient value for the co-integrating
01:15:34.000 --> 01:15:37.000
vector.
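A one-step sketch of this error-correction logic, with illustrative numbers only (suppose the equilibrium is y1 = y2, so beta = (1, -1)'):

```python
import numpy as np

# Toy error-correction step: y1 is above equilibrium, and a negative
# loading coefficient on y1 pulls it back down (all numbers made up).
beta = np.array([1.0, -1.0])      # cointegrating vector
alpha = np.array([-0.3, 0.1])     # loading coefficients for y1, y2
y_lag = np.array([2.0, 1.5])      # y1 is 0.5 above equilibrium

deviation = beta @ y_lag          # error correction term: +0.5
dy = alpha * deviation            # adjustment of each variable
y_new = y_lag + dy

# The deviation shrinks: y1 falls, y2 rises, the system moves back
# toward equilibrium.
assert deviation > 0
assert abs(beta @ y_new) < abs(deviation)
```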
01:15:37.000 --> 01:15:42.000
Now equation 10 we have already discussed, there's nothing new about that.
01:15:42.000 --> 01:15:47.000
This is just equation 9 with pi being replaced by alpha beta prime.
01:15:47.000 --> 01:15:52.000
This equation 10 here, this representation here, is called a vector error correction
01:15:52.000 --> 01:16:02.000
model or VECM, and the r by 1 vector beta prime y t minus 1, so this vector here, is
01:16:02.000 --> 01:16:05.000
called the error correction term.
01:16:05.000 --> 01:16:15.000
The VECM is a restricted form of a VAR model in the sense that if we restrict, for instance,
01:16:15.000 --> 01:16:22.000
this pi matrix to have a certain rank, r, then we postulate that there are r co-integrating
01:16:22.000 --> 01:16:32.000
vectors, and therefore this representation here is a restricted representation of a VAR model.
01:16:32.000 --> 01:16:38.000
It incorporates the restriction that there are r co-integrating vectors, and then it is called a vector error
01:16:38.000 --> 01:16:43.000
correction model.
01:16:43.000 --> 01:16:53.000
Now, the famous article on co-integration was published by Engle and Granger in Econometrica
01:16:53.000 --> 01:16:58.000
1987, if I recall correctly.
01:16:58.000 --> 01:17:03.000
It turned out that while this is a paper which has been cited over and over again, probably
01:17:03.000 --> 01:17:11.000
100,000 or millions of times since Engle and Granger laid out a theory of co-integration
01:17:11.000 --> 01:17:19.000
there, they told us privately some years ago that they had big trouble getting their paper
01:17:19.000 --> 01:17:25.000
through in Econometrica, and it had been turned down by other journals because referees just
01:17:25.000 --> 01:17:28.000
didn't understand what people were talking about.
01:17:28.000 --> 01:17:34.000
If you have trouble in understanding what I'm talking about, here, be assured that this
01:17:34.000 --> 01:17:38.000
is sort of the normal state of affairs.
01:17:38.000 --> 01:17:43.000
These matters need study, and you need to devote time to them in order to understand
01:17:43.000 --> 01:17:44.000
them.
01:17:44.000 --> 01:17:49.000
Eventually, Engle and Granger got their paper through in Econometrica, and people read it
01:17:49.000 --> 01:17:56.000
and understood it after some time, and co-integration became an extremely popular tool in time series
01:17:56.000 --> 01:17:57.000
analysis.
01:17:57.000 --> 01:18:05.000
Actually, you cannot do time series analysis seriously today without knowing about co-integration,
01:18:05.000 --> 01:18:08.000
Johansen tests, and these kinds of things.
01:18:08.000 --> 01:18:14.000
So you really have to know what I explained in this lecture.
01:18:14.000 --> 01:18:19.000
In the paper by Engle and Granger, which, as I say, had two authors, Clive Granger
01:18:19.000 --> 01:18:27.000
and Robert Engle, there is a theorem called the Granger Representation Theorem.
01:18:27.000 --> 01:18:35.000
So Engle and Granger, for some reason, made clear in their article that this theorem was
01:18:35.000 --> 01:18:40.000
apparently developed by Clive Granger, and Robert Engle didn't contribute to it.
01:18:40.000 --> 01:18:45.000
And perhaps it was important to Granger to point this out, that this was what he found
01:18:45.000 --> 01:18:52.000
out, what he proved, and therefore, they gave it this name, which deviates from the name
01:18:52.000 --> 01:18:55.000
of the paper, which was Engle and Granger.
01:18:55.000 --> 01:19:06.000
Anyway, the Granger Representation Theorem says: each co-integrated VAR of equation 8
01:19:06.000 --> 01:19:09.000
has a VECM representation, like in 10.
01:19:09.000 --> 01:19:14.000
So let's go back to see what I'm talking about.
01:19:14.000 --> 01:19:16.000
This was equation 8.
01:19:16.000 --> 01:19:26.000
So if I have a VAR, and this is the general form of a VAR, if I have a VAR and it is co-integrated,
01:19:26.000 --> 01:19:32.000
it has a co-integration property, this was the question we came from, how can we distinguish
01:19:32.000 --> 01:19:35.000
a co-integrated VAR from one which is not?
01:19:35.000 --> 01:19:44.000
So if it is a co-integrated VAR, then it does have a vector error correction representation
01:19:44.000 --> 01:19:54.000
of type 10 here, that is, it can be written as an equation which has lagged first differences,
01:19:54.000 --> 01:20:02.000
and the lagged level of exactly the previous period, t minus 1, in it, with
01:20:02.000 --> 01:20:09.000
a pi matrix which is non-zero, which has rank r greater than zero.
01:20:09.000 --> 01:20:18.000
So a co-integrated VAR has such a vector error correction representation, whereas non-co-integrated
01:20:18.000 --> 01:20:23.000
VARs do not.
01:20:23.000 --> 01:20:31.000
The levels of the variables yt have what Granger called a common trends representation.
01:20:32.000 --> 01:20:41.000
This formula here, it says yt is some starting value in period 0, y0, plus the cumulation
01:20:41.000 --> 01:20:45.000
of errors which have occurred since.
01:20:45.000 --> 01:20:50.000
This sounds very much like a vector moving average representation, and in fact the common
01:20:50.000 --> 01:21:01.000
trends representation is similar to a vector moving average representation, but with the
01:21:01.000 --> 01:21:08.000
difference that the unit root property is visible in the common trends representation.
01:21:08.000 --> 01:21:19.000
Namely, here we have the y0s, and here we have an infinite sum of stationary white noise
01:21:19.000 --> 01:21:24.000
weighted by certain matrices psi star tau.
01:21:24.000 --> 01:21:32.000
We do not bother about what the psi star tau are exactly, but we know there are some
01:21:32.000 --> 01:21:40.000
matrices in such a way that we can build an infinite sum over the ut minus tau.
01:21:40.000 --> 01:21:46.000
So basically, think of the Wold representation where we would require that the psi's are
01:21:46.000 --> 01:21:50.000
square summable.
01:21:50.000 --> 01:21:57.000
The psi's here need to satisfy some similar property in order to make sure that the sum
01:21:57.000 --> 01:21:58.000
here converges.
01:21:58.000 --> 01:22:08.000
But it is not just the sum here which drives this yt process, but also the sum here where
01:22:08.000 --> 01:22:14.000
just the ut processes are being summed, not weighted, just summed.
01:22:14.000 --> 01:22:20.000
So there is a one, or an identity matrix, in front of each of these u t minus tau's here, and
01:22:20.000 --> 01:22:30.000
tau goes back from one to period t, so possibly quite far back in the past.
01:22:30.000 --> 01:22:35.000
As I say, the psi star matrices must be
01:22:35.000 --> 01:22:37.000
absolutely summable.
01:22:37.000 --> 01:22:41.000
This is analogous to the Wold representation, but this term here does not appear in the
01:22:41.000 --> 01:22:46.000
Wold representation.
01:22:46.000 --> 01:22:51.000
There is actually an algebraic expression for the psi matrix, so you can compute it.
01:22:51.000 --> 01:22:57.000
All these things here, beta orthogonal is a matrix which spans the space, which is exactly
01:22:57.000 --> 01:23:00.000
orthogonal to the space spanned by beta.
01:23:00.000 --> 01:23:05.000
This you don't need to know, I don't even care if you don't understand it, so I won't
01:23:05.000 --> 01:23:11.000
ask you about that in the final exam, but if you have a computer and program something
01:23:11.000 --> 01:23:15.000
about cointegration, then it's not difficult to program this expression here, which is
01:23:15.000 --> 01:23:20.000
why I give it to you on this slide.
01:23:20.000 --> 01:23:28.000
More important is that the rank of psi is equal to k minus r. k minus r was the number
01:23:28.000 --> 01:23:36.000
of stochastic trends, so we see this accumulation of white noise here is of course a stochastic
01:23:36.000 --> 01:23:42.000
trend or actually these are k minus r independent stochastic trends.
01:23:42.000 --> 01:23:50.000
This is what is meant by the common trends representation, that we have trends which
01:23:50.000 --> 01:24:03.000
are common to some of the variables and therefore can cancel out by cointegration.
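A small simulation illustrates the common trends idea with made-up data (k = 2, r = 1): the levels wander with the shared stochastic trend, but the cointegrating combination cancels it and stays bounded.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000

# One common stochastic trend (a random walk) shared by two series,
# as in the common trends representation with k=2, r=1.
trend = np.cumsum(rng.normal(size=T))
y1 = trend + rng.normal(size=T)          # loads on the trend with weight 1
y2 = 1.5 * trend + rng.normal(size=T)    # loads with weight 1.5

# The cointegrating vector (1.5, -1) cancels the common trend:
combo = 1.5 * y1 - y2

# The levels have random-walk variance; the combination does not.
assert np.var(combo) < 20     # stationary: variance stays small
assert np.var(y1) > 100       # trending: variance is large
```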
01:24:03.000 --> 01:24:10.000
So a cointegrated VAR of order p has a VECM of order p minus one representation, so the order of the process
01:24:10.000 --> 01:24:19.000
is reduced by one when we move from levels in which the var is being defined to the first
01:24:19.000 --> 01:24:26.000
differences in which the VECM is being defined.
01:24:26.000 --> 01:24:30.000
How do we know that this is a VECM of order p minus one representation?
01:24:30.000 --> 01:24:37.000
We know it because equation 10 contains p minus one lagged differences, which gives us
01:24:37.000 --> 01:24:47.000
the order p minus one for the VECM.
01:24:47.000 --> 01:24:52.000
So what is the significance of the Granger representation theorem?
01:24:52.000 --> 01:24:58.000
The first point: the VECM constitutes a relationship between the first differences delta yt, the
01:24:58.000 --> 01:25:04.000
lagged first differences delta yt minus i and, and this is the important thing, the
01:25:04.000 --> 01:25:06.000
lagged levels.
01:25:06.000 --> 01:25:14.000
So when you regress variables in first differences in a time series context and these variables
01:25:14.000 --> 01:25:18.000
happen to be cointegrated, which may easily be the case because we work with economic
01:25:18.000 --> 01:25:24.000
variables and economic models imply long-run equilibrium relationships, if you work with
01:25:24.000 --> 01:25:31.000
economic variables in first differences, then you have to include the levels of the variables
01:25:31.000 --> 01:25:34.000
as explanatory variables.
01:25:34.000 --> 01:25:44.000
If you just regress delta yt on its own past, on delta yt minus i, then the process is misspecified.
01:25:44.000 --> 01:25:51.000
The regression equation is misspecified and then you will estimate wrong coefficients.
01:25:51.000 --> 01:25:57.000
You have to include the levels of the variables if the variables are cointegrated or could
01:25:57.000 --> 01:26:00.000
possibly be cointegrated.
01:26:00.000 --> 01:26:07.000
There's no harm done by including the levels of the variable if they are not needed because
01:26:07.000 --> 01:26:14.000
very easily we can estimate this thing here as zero and then it's fine, but not including
01:26:14.000 --> 01:26:22.000
the yt minus ones when the alpha beta prime matrix is different from zero would involve
01:26:22.000 --> 01:26:31.000
a big error, would involve misspecification and wrong results.
01:26:31.000 --> 01:26:36.000
Obviously without cointegration, we have that alpha beta prime is equal to zero.
01:26:36.000 --> 01:26:42.000
So then delta yt is independent of the lagged levels yt minus one and you can in
01:26:42.000 --> 01:26:49.000
principle estimate delta yt on delta yt minus one, delta yt minus i.
01:26:49.000 --> 01:26:55.000
So on all the delta yt minus i, there's no problem with that, but clearly the condition
01:26:55.000 --> 01:27:04.000
is that the process yt is not cointegrated.
01:27:04.000 --> 01:27:13.000
So a VAR in levels without cointegration can always be written as a VAR in first differences,
01:27:13.000 --> 01:27:19.000
but a VAR with cointegration, a VAR in levels with cointegration cannot be written as a
01:27:19.000 --> 01:27:21.000
VAR in first differences.
01:27:21.000 --> 01:27:25.000
That's basically the distinguishing property.
01:27:25.000 --> 01:27:31.000
A cointegrated VAR in levels cannot be written as a VAR in first differences because this
01:27:31.000 --> 01:27:39.000
would omit the influence of the lagged levels and thereby omit the influence of the error
01:27:39.000 --> 01:27:41.000
correction mechanism.
01:27:41.000 --> 01:27:51.000
The deviations from equilibrium could not be properly computed.
01:27:51.000 --> 01:27:54.000
So this is why, as I have said, it would be a misspecification if a vector of cointegrated
01:27:54.000 --> 01:28:00.000
variables were modeled as a VAR in first differences and there is no misspecification if a vector
01:28:00.000 --> 01:28:09.000
of cointegrated variables is modeled as a VAR in levels.
01:28:09.000 --> 01:28:15.000
Second important thing about the Granger representation theorem is that the analog of the infinite
01:28:15.000 --> 01:28:23.000
VMA representation of the stationary VAR is the common trends representation for a VECM.
01:28:23.000 --> 01:28:27.000
This is more a technical significance, which I will not emphasize any further.
01:28:27.000 --> 01:28:37.000
So let me move on to the third significant point of the Granger representation theorem.
01:28:37.000 --> 01:28:44.000
Since delta YT is always stationary by definition, it can always be written as a VMA.
01:28:44.000 --> 01:28:55.000
So we can always write delta YT is equal to C of L U T even if there is no cointegration
01:28:55.000 --> 01:29:00.000
or independent of whether there is a cointegration.
01:29:00.000 --> 01:29:08.000
So we see that C of one is then the long run effect of the U variables on the level.
01:29:08.000 --> 01:29:12.000
And this means that C of one is equal to the psi matrix, which we have just seen in
01:29:12.000 --> 01:29:14.000
the Granger representation theorem.
01:29:14.000 --> 01:29:24.000
Therefore, we know that the rank of C of one is equal to K minus R and cannot be any higher.
01:29:24.000 --> 01:29:30.000
This means that cointegration implies that the long run multiplier matrix of delta Y
01:29:30.000 --> 01:29:38.000
is singular and has rank of K minus R. So basically it means variables cannot just
01:29:38.000 --> 01:29:40.000
move wherever they want to.
01:29:40.000 --> 01:29:49.000
They have to do this in common and have to follow common stochastic trends.
01:29:49.000 --> 01:29:53.000
Okay, let us perhaps break here.
01:29:53.000 --> 01:30:00.000
And after the break, I will show you how you can estimate a VECM of order one in this case
01:30:00.000 --> 01:30:06.000
in JMulTi on real world data, namely GDP data, consumption data, investment data, and
01:30:06.000 --> 01:30:09.000
total factor productivity.
01:30:09.000 --> 01:30:17.000
We continue the lecture in five minutes.
01:30:17.000 --> 01:30:27.000
Okay, let us carry on.
01:30:27.000 --> 01:30:35.000
As I said, we will now estimate a VECM with just one lag on a certain set of macroeconomic
01:30:35.000 --> 01:30:45.000
data, which I have supplied to you, and you can reproduce what I do here in JMulTi.
01:30:45.000 --> 01:30:54.000
And the variables in first difference form are written as delta YT is the growth rate
01:30:54.000 --> 01:30:55.000
of GDP.
01:30:55.000 --> 01:30:59.000
So all the variables are logs, growth rate of consumption, growth rate of investment,
01:30:59.000 --> 01:31:02.000
and growth rate of total factor productivity.
01:31:03.000 --> 01:31:09.000
Now, suppose that all of these variables share the same stochastic trend.
01:31:09.000 --> 01:31:16.000
So suppose that there is just one stochastic trend in the data.
01:31:16.000 --> 01:31:19.000
This I put here as an assumption.
01:31:19.000 --> 01:31:21.000
We will not yet test it.
01:31:21.000 --> 01:31:23.000
This comes later, right?
01:31:23.000 --> 01:31:31.000
Testing for the number of co-integrating vectors or for the number of stochastic trends.
01:31:31.000 --> 01:31:37.000
So if it is true that there is just one stochastic trend, let's say technology, somehow in the
01:31:37.000 --> 01:31:43.000
TFP process, then there should be three co-integrating vectors.
01:31:43.000 --> 01:31:50.000
So this would mean in our VECM representation, the matrix alpha would have dimension four
01:31:50.000 --> 01:31:56.000
by three and would be non-singular or not really non-singular, but the matrix would
01:31:56.000 --> 01:32:03.000
have full rank, namely full column rank with rank three, and beta prime would have dimensions
01:32:03.000 --> 01:32:07.000
three by four and would have full row rank, right?
01:32:07.000 --> 01:32:14.000
Beta would have full column rank, beta prime would have full row rank, namely again three,
01:32:14.000 --> 01:32:16.000
which is our R.
01:32:16.000 --> 01:32:19.000
We estimate this model here.
01:32:19.000 --> 01:32:24.000
We can do so in the VECM analysis menu of JMulTi.
01:32:24.000 --> 01:32:34.000
The result of the estimation would be given in this matrix VEC representation form here.
01:32:34.000 --> 01:32:40.000
You would have the difference of GDP, difference of consumption, difference of investment, difference
01:32:40.000 --> 01:32:45.000
of total factor productivity as the YT vector.
01:32:45.000 --> 01:32:55.000
This YT vector would be written as a product of an alpha matrix here and a beta prime matrix
01:32:55.000 --> 01:32:56.000
here.
01:32:56.000 --> 01:33:04.000
So you see here the lag levels, GDP T minus one, no Ds, not the difference of GDP, but
01:33:04.000 --> 01:33:10.000
the levels of GDP, level of consumption, level of investment, level of total factor productivity.
01:33:10.000 --> 01:33:17.000
This would be the matrix of loading coefficients, and this would be the matrix of co-integrating
01:33:17.000 --> 01:33:18.000
vectors.
01:33:18.000 --> 01:33:23.000
The first co-integrating vector, as I say, JMulTi uses a certain convention here, has
01:33:23.000 --> 01:33:33.000
a one here, nothing here and there, and then negative 1.5 here, so GDP would be co-integrated
01:33:33.000 --> 01:33:36.000
with technology.
01:33:36.000 --> 01:33:42.000
The second co-integrating vector would say there's a zero here, a one here, so consumption
01:33:42.000 --> 01:33:50.000
as the second component here would be co-integrated again with technology, TFP T minus one here.
01:33:50.000 --> 01:33:57.000
The co-integrating vector would be one and minus 1.777.
01:33:57.000 --> 01:34:03.000
The third co-integrating vector is the co-integration of investment with technology, which has coefficient
01:34:03.000 --> 01:34:10.000
one and minus 2.9 as co-integrating coefficients.
01:34:10.000 --> 01:34:17.000
Basically it is to be expected in this type of representation that the coefficients here
01:34:17.000 --> 01:34:24.000
are negative, just like in my example earlier on, where income Y and consumption in the
01:34:24.000 --> 01:34:32.000
permanent income hypothesis example had a co-integrating vector of one minus one, here we have co-integrating
01:34:32.000 --> 01:34:38.000
vectors of one minus 1.5 minus 1.7 minus three, essentially.
01:34:38.000 --> 01:34:45.000
So always it would be the case that a stochastic trend, which is in here, would be canceled
01:34:45.000 --> 01:34:47.000
by the stochastic trend in here.
01:34:47.000 --> 01:34:57.000
However, apparently this trend is dampened somewhat, no actually it is amplified, it
01:34:57.000 --> 01:35:07.000
is amplified in the GDP consumption and investment series because we have to take multiple of
01:35:07.000 --> 01:35:15.000
the stochastic trend in total vector productivity in order to cancel it in this type of co-integrating
01:35:15.000 --> 01:35:16.000
combination here.
01:35:16.000 --> 01:35:22.000
And then of lesser interest is what you find in the second row here, where we have the
01:35:22.000 --> 01:35:29.000
lagged differences, d GDP T minus one, d C, d investment, d TFP.
01:35:29.000 --> 01:35:35.000
So these are the delta variables, the first differences of the four variables with certain
01:35:35.000 --> 01:35:43.000
coefficients, which typically nobody looks at when we estimate such a system, plus the constant,
01:35:43.000 --> 01:35:50.000
which is also usually not very interesting plus of course estimated residuals, which
01:35:50.000 --> 01:35:52.000
are not shown in numbers here.
01:35:52.000 --> 01:35:58.000
You can retrieve them from JMulTi, but there's no need to document them.
01:35:58.000 --> 01:36:04.000
In JMulTi, you would see the coefficients as you see them here if you press on coefficients,
01:36:04.000 --> 01:36:08.000
you would see their standard deviations if you press on standard deviations, and you would
01:36:08.000 --> 01:36:18.000
see the T values in the same type of layout if you pressed on this button T values here.
01:36:18.000 --> 01:36:23.000
Any question here?
01:36:23.000 --> 01:36:30.000
You can also go from matrix vec representation to matrix VAR representation.
01:36:30.000 --> 01:36:40.000
So JMulTi would convert the estimated vector error correction model to a VAR model in standard
01:36:40.000 --> 01:36:49.000
form where this would be a VAR in levels, GDP T, consumption T, investment T, TFP T
01:36:49.000 --> 01:36:58.000
would be regressed on their lagged levels with lag one, t minus one, and
01:36:58.000 --> 01:37:04.000
the same lagged levels in period t minus two.
01:37:04.000 --> 01:37:14.000
So if the VECM has order p minus one, then the VAR in levels would have an order which
01:37:14.000 --> 01:37:43.000
is greater by one than the order of the VECM.
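The conversion between the two representations can be sketched for the VECM(1)/VAR(2) case; the coefficient values below are made up for illustration, not the estimates from JMulTi:

```python
import numpy as np

# VECM(1):  Delta y_t = Pi y_{t-1} + Gamma1 Delta y_{t-1} + u_t
# VAR(2):   y_t = A1 y_{t-1} + A2 y_{t-2} + u_t
# with A1 = I + Pi + Gamma1 and A2 = -Gamma1 (illustrative numbers).
Pi = np.array([[-0.2,  0.2],
               [ 0.1, -0.1]])
Gamma1 = np.array([[0.3, 0.0],
                   [0.1, 0.2]])

A1 = np.eye(2) + Pi + Gamma1
A2 = -Gamma1

# Consistency check: recover Pi and Gamma1 from the VAR coefficients.
assert np.allclose(A1 + A2 - np.eye(2), Pi)
assert np.allclose(-A2, Gamma1)
```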
01:37:43.000 --> 01:37:51.000
So the order of the system as a VAR would be higher by one and, what is more important
01:37:51.000 --> 01:38:00.000
than that actually, this VAR would be a restricted VAR because it involves the fact that it can
01:38:00.000 --> 01:38:08.000
be transformed into a VECM model so that it has the property of cointegration.
01:38:08.000 --> 01:38:15.000
If you estimated this model directly as a VAR, you would arrive at different numbers
01:38:15.000 --> 01:38:22.000
because you would estimate it without the restriction.
01:38:22.000 --> 01:38:26.000
If we look at the residuals, we see that we have high contemporaneous correlation for
01:38:26.000 --> 01:38:28.000
the residuals.
01:38:28.000 --> 01:38:33.000
So the residuals are certainly not representing structural shocks.
01:38:33.000 --> 01:38:39.000
So JMulTi displays the covariance matrix to you, which you see here.
01:38:39.000 --> 01:38:44.000
More interesting is the correlation matrix, which results from that as you see that the
01:38:44.000 --> 01:38:49.000
correlation between the residuals is partially high.
01:38:49.000 --> 01:38:58.000
So this number here is 0.67, so almost 0.7, quite a high correlation, right?
01:38:58.000 --> 01:39:05.000
Or here a correlation of 0.56, so more than 0.5 as correlation, right?
01:39:05.000 --> 01:39:13.000
So the correlation is rather high and the residuals are certainly not structural shocks.
01:39:13.000 --> 01:39:21.000
Now in the last half hour or so of this lecture, I will show you how you can test for
01:39:21.000 --> 01:39:23.000
cointegration.
01:39:23.000 --> 01:39:30.000
Because in the example which I just explained to you, I assumed that there is a certain
01:39:30.000 --> 01:39:32.000
number of cointegrating vectors.
01:39:32.000 --> 01:39:37.000
I assumed that there are three cointegrating vectors because I assumed that there is just
01:39:37.000 --> 01:39:39.000
one stochastic trend.
01:39:39.000 --> 01:39:44.000
One can work with this type of assumption, for instance, if you have a model, right?
01:39:44.000 --> 01:39:49.000
What you can do is to say: I have a certain model, and in this model, I assume that there
01:39:49.000 --> 01:39:59.000
is just one source of stochastics in the model, which is, for instance, shocks to technology.
01:39:59.000 --> 01:40:05.000
And then clearly there would be just one stochastic trend driven by technology shocks.
01:40:05.000 --> 01:40:10.000
And therefore, if we have just one stochastic trend in a four dimensional system, you need
01:40:10.000 --> 01:40:14.000
to have three cointegrating vectors.
01:40:14.000 --> 01:40:20.000
But in general, we take a number of variables and we would like to know how many cointegrating
01:40:20.000 --> 01:40:22.000
combinations do we find in the system.
01:40:22.000 --> 01:40:26.000
Therefore, we have to test for cointegration.
01:40:26.000 --> 01:40:32.000
And this type of test was developed by Søren Johansen and published in 1988; it circulated
01:40:32.000 --> 01:40:39.000
much earlier in the form of a discussion paper, much like the Engle-
01:40:39.000 --> 01:40:46.000
Granger paper. It replaced another test which Engle and Granger had developed, but
01:40:46.000 --> 01:40:53.000
which was not so reliable; the Johansen test is much better.
01:40:53.000 --> 01:40:57.000
So I don't teach you the Engle-Granger test.
01:40:57.000 --> 01:41:04.000
The idea is: first, we estimate an unrestricted VAR in levels.
01:41:04.000 --> 01:41:14.000
So, once again: this VAR representation, which
01:41:14.000 --> 01:41:24.000
I have here, is a restricted VAR, namely a VAR in which we have already imposed the requirement
01:41:24.000 --> 01:41:27.000
that there are three cointegrating vectors.
01:41:27.000 --> 01:41:34.000
Now, in the Johansen procedure, we start out by just estimating an unrestricted VAR
01:41:34.000 --> 01:41:36.000
in levels.
01:41:36.000 --> 01:41:45.000
So we just regress by OLS equation by equation, yt on the lagged variables, up to as many
01:41:45.000 --> 01:41:48.000
lags as we would like to use.
01:41:48.000 --> 01:41:55.000
We choose, of course, the lag length of the VAR such that ut is then serially uncorrelated, so it could
01:41:55.000 --> 01:41:59.000
be interpreted as white noise.
01:41:59.000 --> 01:42:07.000
When we have estimated this A(L) polynomial here, then we determine the rank of A(1),
01:42:07.000 --> 01:42:12.000
because A(1) was the negative of the Pi matrix, right, and you recall that the Pi matrix,
01:42:12.000 --> 01:42:16.000
or rather its rank, determines the number of cointegrating vectors.
01:42:16.000 --> 01:42:22.000
So what we do is that we first estimate A(L), and then we sum all the coefficients
01:42:22.000 --> 01:42:33.000
of the A(L) polynomial, thereby determining the Pi matrix, or rather negative Pi, and then test
01:42:33.000 --> 01:42:40.000
for the number of non-zero eigenvalues, because the rank of a matrix is always equal to the
01:42:40.000 --> 01:42:48.000
number of non-zero eigenvalues, right? A k by k matrix always has k eigenvalues, but if
01:42:48.000 --> 01:42:55.000
the k by k matrix is singular, then some of the eigenvalues will be equal to 0, and
01:42:55.000 --> 01:43:02.000
there will be as many non-zero eigenvalues as is the rank of the matrix.
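As a quick numerical illustration of this point, here is a small sketch with made-up alpha and beta matrices (not the lecture's data); the Pi-type matrix built from them is diagonalizable, so its rank equals the number of nonzero eigenvalues:

```python
import numpy as np

# Pi = alpha @ beta', with k = 3 variables and r = 2 cointegrating vectors.
# alpha and beta are illustrative 3x2 matrices of full column rank.
alpha = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
beta = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [-1.0, -1.0]])
Pi = alpha @ beta.T  # 3x3 matrix, but of reduced rank 2

eigvals = np.linalg.eigvals(Pi)
n_nonzero = int(np.sum(np.abs(eigvals) > 1e-8))

print(np.linalg.matrix_rank(Pi))  # 2
print(n_nonzero)                  # 2: as many nonzero eigenvalues as the rank
```

One eigenvalue of Pi is numerically zero, which is exactly the singularity the Johansen test looks for.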
01:43:02.000 --> 01:43:10.000
So if the rank of the k by k matrix is r, there will be r non-zero eigenvalues of the
01:43:10.000 --> 01:43:15.000
matrix Pi, or in this case, A(1), which is the negative of Pi.
01:43:16.000 --> 01:43:23.000
So, to the test procedure in the so-called trace test. There is also another test developed
01:47:23.000 --> 01:47:27.000
by Johansen, the maximum eigenvalue test, which is less popular, but I teach
01:47:27.000 --> 01:47:32.000
you here just the popular test, which is the trace test.
01:43:32.000 --> 01:43:36.000
The test procedure is the following.
01:43:36.000 --> 01:43:47.000
We have the estimated A(1) matrix, and now we test the null hypothesis, H0, that
01:43:47.000 --> 01:43:51.000
the rank of A(1) is equal to 0.
01:43:51.000 --> 01:43:59.000
We know that if the rank of A(1) is equal to 0, then the variables are not cointegrated
01:43:59.000 --> 01:44:08.000
at all, so there is no cointegration vector in this system.
01:44:08.000 --> 01:44:14.000
If the rank of A(1) is equal to 0, this would mean the A(1) matrix is the zero matrix, so the
01:44:14.000 --> 01:44:17.000
variables are not cointegrated.
01:44:17.000 --> 01:44:24.000
We test this null hypothesis, and if this null hypothesis is rejected, then we know
01:44:24.000 --> 01:44:31.000
that there is at least one cointegrating vector in the system, that there is at least
01:44:31.000 --> 01:44:39.000
one possibility to combine the variables in such a way that this combination is stationary.
01:44:39.000 --> 01:44:47.000
So if H0 is rejected, then we test another hypothesis, which I call here H1.
01:44:47.000 --> 01:44:52.000
So the hypothesis of one cointegrating vector.
01:44:52.000 --> 01:44:58.000
If we cannot reject the null hypothesis H0, then the test procedure stops.
01:44:58.000 --> 01:45:01.000
So then we would just say the variables are not cointegrated, and that's the end of the
01:45:01.000 --> 01:45:03.000
Johansen procedure.
01:45:03.000 --> 01:45:09.000
But if we reject the first step, the first test, then we move on to a second test, and
01:45:09.000 --> 01:45:15.000
we test that the number of cointegrating vectors is not greater than 1.
01:45:15.000 --> 01:45:21.000
So we test the hypothesis that the rank of the A(1) matrix is less than or equal to
01:45:21.000 --> 01:45:28.000
1, which is the hypothesis that there is at most one cointegrating vector.
01:45:28.000 --> 01:45:34.000
If we cannot reject this hypothesis H1, then the Johansen procedure stops here, and
01:45:34.000 --> 01:45:38.000
we conclude that there is one cointegrating vector.
01:45:38.000 --> 01:45:43.000
If we can reject this hypothesis, then we move on and test the hypothesis that there
01:45:43.000 --> 01:45:47.000
are not more than two cointegrating vectors.
01:45:47.000 --> 01:45:50.000
And again, if we accept it, then the test procedure is finished.
01:45:50.000 --> 01:45:56.000
Otherwise, we move on and test that there are at most three cointegrating vectors and
01:45:56.000 --> 01:45:58.000
so forth.
01:45:58.000 --> 01:46:02.000
In the second-to-last step, for a k-dimensional vector, we would test the hypothesis that
01:46:02.000 --> 01:46:06.000
there are at most k minus 1 cointegrating vectors.
01:46:06.000 --> 01:46:12.000
So in a four-dimensional system, as we just looked at it in the example prior to the break,
01:46:12.000 --> 01:46:16.000
we would test for three cointegrating vectors.
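The stepwise procedure just described can be sketched as a small loop. The numbers below are the trace statistics and 5% critical values from the two-variable example later in this lecture, used here purely for illustration:

```python
def select_rank(trace_stats, critical_values):
    """Johansen-style sequential testing: return the first r0 for which the
    null 'rank <= r0' cannot be rejected; if every null is rejected, the
    rank is k and the VAR in levels is stationary."""
    k = len(trace_stats)
    for r0 in range(k):
        if trace_stats[r0] < critical_values[r0]:
            return r0  # cannot reject H(r0): conclude r0 cointegrating vectors
    return k  # all nulls rejected: full rank

# Two-variable case: reject 'rank = 0', but accept 'rank <= 1'.
print(select_rank([25.96, 11.50], [25.73, 12.45]))  # 1
```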
01:46:16.000 --> 01:46:23.000
If even this hypothesis here were rejected, then we would accept the hypothesis that the
01:46:23.000 --> 01:46:30.000
rank of A(1) is equal to k, and that would mean that the VAR is stationary, that there
01:46:30.000 --> 01:46:36.000
is just no nonstationary variable included in the yt vector.
01:46:36.000 --> 01:46:42.000
But this type of a result is very rare, and it doesn't occur often in practice.
01:46:42.000 --> 01:46:47.000
Usually you end up somewhere in between; often it's quite possible that you find that the
01:46:47.000 --> 01:46:54.000
variables are not cointegrated at all, so you stop over here, or you stop here at step two,
01:46:54.000 --> 01:46:58.000
three, and so forth.
01:46:58.000 --> 01:47:05.000
But very rarely do you reject the hypothesis that the number of cointegrating vectors is
01:47:05.000 --> 01:47:07.000
at most k minus 1.
01:47:07.000 --> 01:47:11.000
It does occur, it does happen, but usually then there's something else wrong with your
01:47:11.000 --> 01:47:13.000
data or with your setup.
01:47:13.000 --> 01:47:22.000
So I don't really think that this result, that the rank is full and all
01:47:22.000 --> 01:47:29.000
the variables are stationary, occurs very often. If you have properly chosen the variables
01:47:29.000 --> 01:47:36.000
you test, and you have confirmed beforehand by a unit root test that there are actually
01:47:36.000 --> 01:47:40.000
unit roots in the system, then it can actually not be the case that you arrive at this last
01:47:40.000 --> 01:47:43.000
part here.
01:47:43.000 --> 01:47:51.000
So what is the test statistic by which we test what the number of nonzero
01:47:51.000 --> 01:47:52.000
eigenvalues is?
01:47:52.000 --> 01:48:02.000
Well, the test statistic is derived as a likelihood ratio statistic where we compute the sum,
01:48:02.000 --> 01:48:12.000
perhaps you just look at this term here, where you compute the sum over the smallest eigenvalues
01:48:12.000 --> 01:48:14.000
which you find.
01:48:14.000 --> 01:48:21.000
So basically the computer computes for you all the eigenvalues of the A(1) matrix,
01:48:21.000 --> 01:48:27.000
and then you pick the smallest eigenvalues in absolute value, so those which are closest
01:48:27.000 --> 01:48:33.000
to zero, because you want to know what the number of nonzero eigenvalues is, and you
01:48:33.000 --> 01:48:39.000
have a certain hypothesis that the number of nonzero eigenvalues is, say, at most one
01:48:39.000 --> 01:48:47.000
or at most two, at most some r0: some are zero, some are not.
01:48:47.000 --> 01:48:56.000
So you pick the eigenvalues with index between r0 plus one
01:48:56.000 --> 01:48:58.000
and k.
01:48:58.000 --> 01:49:06.000
You pick all those eigenvalues which are small in absolute value, so close to zero, because
01:49:06.000 --> 01:49:13.000
if this is approximately zero here, then one minus zero is approximately one, and then
01:49:13.000 --> 01:49:17.000
the log of one is of course approximately equal to zero.
01:49:17.000 --> 01:49:23.000
So in this case you would arrive at a rather small value of the statistic here.
01:49:23.000 --> 01:49:28.000
The whole sum is scaled by T; that doesn't change the logic, but you would
01:49:28.000 --> 01:49:34.000
arrive here at something which should be close to zero under the null hypothesis.
01:49:34.000 --> 01:49:39.000
So if you arrive at a number here which is very different from zero, then you would
01:49:39.000 --> 01:49:44.000
reject the null hypothesis.
01:49:44.000 --> 01:49:51.000
So basically this test, which is called the trace test, derives its name because it evaluates
01:49:51.000 --> 01:49:56.000
the trace of a matrix whose main diagonal depends on the estimated eigenvalues lambda
01:49:56.000 --> 01:49:59.000
i hat of A(1).
01:49:59.000 --> 01:50:04.000
It is not that you sum the eigenvalues directly, but you sum log of one minus the eigenvalue.
01:50:04.000 --> 01:50:10.000
So there's some kind of transformation which you need in order to derive a well-defined
01:50:10.000 --> 01:50:14.000
asymptotic distribution.
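As a sketch of this construction: the trace statistic for the null "rank at most r0" sums minus T times log(1 minus lambda_i hat) over the k minus r0 smallest eigenvalues. The eigenvalues below are made up for illustration, not estimated from data:

```python
import math

def trace_statistic(eigenvalues, r0, T):
    """Trace statistic for H(r0): eigenvalues sorted in descending order,
    so the sum runs over the k - r0 smallest ones (indices r0+1, ..., k)."""
    return -T * sum(math.log(1.0 - lam) for lam in eigenvalues[r0:])

# Illustrative eigenvalues of A(1) and a sample size of T = 200.
eigs = [0.30, 0.15, 0.02, 0.005]
stats = [trace_statistic(eigs, r0, 200) for r0 in range(len(eigs))]
# Eigenvalues close to zero contribute log(1 - lam) close to zero, so the
# statistic is small when only near-zero eigenvalues remain under the null.
```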
01:50:14.000 --> 01:50:22.000
If the true co-integrating rank is this r0 here, then lambda i, as I said, is equal
01:50:22.000 --> 01:50:28.000
to zero for all i which are greater than r0, and therefore for the estimated eigenvalues
01:50:28.000 --> 01:50:34.000
we should have that the log of one minus lambda i hat is approximately zero for all i greater
01:50:34.000 --> 01:50:37.000
than r0.
01:50:37.000 --> 01:50:43.000
As I have already pointed out, the Johansen procedure works fine if yt contains both I(1)
01:50:43.000 --> 01:50:47.000
and I(0) components.
01:50:47.000 --> 01:50:51.000
The distribution of the trace test statistic is non-standard, like we always have it in
01:50:51.000 --> 01:50:53.000
unit root econometrics.
01:50:53.000 --> 01:51:01.000
It depends in a rather subtle way on the deterministic components in the VECM, so on the constant
01:51:01.000 --> 01:51:06.000
and on trend, even on seasonal dummy variables.
01:51:06.000 --> 01:51:13.000
And therefore, if you look at tabulations of the Johansen test, you will always find
01:51:13.000 --> 01:51:19.000
that there are different cases, often up to five cases which are distinguished for different
01:51:19.000 --> 01:51:26.000
possibilities of how the constant and the trend interact with the stochastic part of
01:51:26.000 --> 01:51:28.000
the process.
01:51:28.000 --> 01:51:34.000
To give you just a flavor of what
01:51:34.000 --> 01:51:42.000
happens there, let us decompose the process yt into a deterministic and a stochastic part.
01:51:42.000 --> 01:51:50.000
So suppose that yt is equal to mu0 plus mu1 times t plus xt, right?
01:51:50.000 --> 01:51:57.000
So essentially we say xt is something non-deterministic, this is just some type of VAR process, and
01:51:57.000 --> 01:52:04.000
yt is the sum of something deterministic, constant, and possibly a linear trend, plus
01:52:04.000 --> 01:52:07.000
a standard VAR.
01:52:07.000 --> 01:52:15.000
And the VAR xt can be written as A(L) xt equal to ut, where ut is white noise, right?
01:52:15.000 --> 01:52:20.000
So the expectation of this x here is actually zero.
01:52:20.000 --> 01:52:29.000
Now clearly, if A(L) xt is equal to ut, and we have data on yt which possibly involves
01:52:29.000 --> 01:52:40.000
some type of deterministic components, then we can write the xt here as yt minus mu0 minus
01:52:40.000 --> 01:52:42.000
mu1t, right?
01:52:42.000 --> 01:52:48.000
So the term in parentheses here is the same as the xt here, and this makes use of equation
01:52:49.000 --> 01:52:56.000
12, where we can solve for xt such that xt is equal to yt minus mu0 minus mu1t.
01:52:56.000 --> 01:53:01.000
So A(L) times this term in parentheses here is equal to ut.
01:53:01.000 --> 01:53:07.000
Now to keep things easy, first consider the case where mu1 is equal to zero, so there
01:53:07.000 --> 01:53:11.000
is no deterministic trend component.
01:53:12.000 --> 01:53:19.000
In this case, we would know that the expectation of yt is equal to mu0, just the constant, right?
01:53:19.000 --> 01:53:29.000
So we rewrite now A(L) yt as A(1) times mu0 plus ut, right?
01:53:29.000 --> 01:53:33.000
So we just use this equation 14 here, right?
01:53:33.000 --> 01:53:42.000
And we write A(L) yt; well, yt is equal to mu0 plus xt, right?
01:53:42.000 --> 01:53:51.000
So we write this now as: A(L) yt is equal to A(1) mu0 plus ut.
01:53:51.000 --> 01:53:58.000
And then, since A(1) is equal to negative alpha times beta prime (A(1) is equal to
01:53:58.000 --> 01:54:04.000
negative Pi, and Pi is equal to alpha times beta prime), we can write this as negative
01:54:04.000 --> 01:54:08.000
alpha beta prime mu0 plus ut.
01:54:08.000 --> 01:54:17.000
So the corresponding VECM representation is taken from equation 10: delta yt is a constant
01:54:17.000 --> 01:54:29.000
nu, which I have defined here, plus alpha beta prime yt minus 1 plus, well, lagged differences,
01:54:29.000 --> 01:54:33.000
which are not so important, plus ut.
01:54:33.000 --> 01:54:42.000
And as for the alpha beta prime yt minus 1: the nu here we can actually rewrite
01:54:42.000 --> 01:54:49.000
as A(1) mu0, right, or negative alpha beta prime mu0.
01:54:49.000 --> 01:54:55.000
So then we can write this as alpha beta prime times yt minus 1 minus mu0 plus lagged differences
01:54:55.000 --> 01:54:58.000
of yt plus ut.
01:54:58.000 --> 01:55:05.000
So you see here that the constant term, in this case, is included in the linear combination,
01:55:05.000 --> 01:55:11.000
which constitutes the cointegrating vector.
01:55:11.000 --> 01:55:17.000
Or put differently, the constant augments the error correction term, so that the error correction
01:55:17.000 --> 01:55:22.000
term has an expected value of 0.
01:55:22.000 --> 01:55:28.000
Due to this representation, we immediately see that the expectation of alpha beta prime times
01:55:28.000 --> 01:55:36.000
yt minus 1 minus mu0 being zero implies that the expectation of beta prime times yt minus 1 minus mu0 is zero, because
01:55:36.000 --> 01:55:42.000
the alphas are just constant coefficients, just loading coefficients.
01:55:42.000 --> 01:55:49.000
Formally, this can be derived by pre-multiplying this equation here by (alpha prime alpha) inverse
01:55:49.000 --> 01:55:54.000
times alpha prime, but this is a technicality which is not so important here.
01:55:54.000 --> 01:55:59.000
I think the intuition should be clear.
01:55:59.000 --> 01:56:06.000
So the expectation of delta yt is equal to 0, and therefore, yt does not have a deterministic
01:56:06.000 --> 01:56:07.000
trend.
01:56:07.000 --> 01:56:17.000
Recall, if the expectation of delta yt were non-zero, so if delta yt had a constant term,
01:56:17.000 --> 01:56:23.000
then this would mean under unit root econometrics that the level has a deterministic trend.
01:56:23.000 --> 01:56:32.000
But in this particular case, where we can decompose the constant, or better, the
01:56:32.000 --> 01:56:39.000
nu term, where we can write the nu as negative alpha beta prime times mu0,
01:56:39.000 --> 01:56:46.000
in this particular case, we see that the delta yt does not have a free constant, and therefore
01:56:46.000 --> 01:56:51.000
the yt doesn't trend.
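Collecting the steps above, the restricted-constant case can be summarized as follows (writing Gamma_i for the coefficients on the lagged differences, which were left unnamed above):

```latex
\Delta y_t = \alpha\beta'\,(y_{t-1}-\mu_0)
           + \sum_{i=1}^{p-1}\Gamma_i\,\Delta y_{t-i} + u_t ,
\qquad
\nu = A(1)\,\mu_0 = -\alpha\beta'\mu_0 ,
```

so the constant is absorbed into the error correction term, E[beta'(y_{t-1} - mu0)] = 0, and E[delta yt] = 0: yt has no deterministic trend.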
01:56:51.000 --> 01:56:55.000
So the distribution of the Johansen test now is different in the cases where alpha
01:56:55.000 --> 01:57:03.000
beta prime mu0 is equal to 0, and where alpha beta prime mu0 is different from 0.
01:57:03.000 --> 01:57:09.000
So it depends on whether this thing here, this nu, is equal to 0, or whether this nu
01:57:09.000 --> 01:57:13.000
is not equal to 0.
01:57:13.000 --> 01:57:16.000
Typically, this first case is rather unimportant.
01:57:16.000 --> 01:57:22.000
It doesn't occur very often in practice, and actually, some software just doesn't consider
01:57:22.000 --> 01:57:24.000
this case.
01:57:24.000 --> 01:57:29.000
The more important case is the second case that alpha beta prime mu0 is different from
01:57:29.000 --> 01:57:32.000
0.
01:57:32.000 --> 01:57:46.000
JMulTi, for instance, implements only this second case, and this case is then labeled 'constant'.
01:57:46.000 --> 01:57:54.000
Since alpha is of dimension k by r, we know that the rank of alpha is r, of course,
01:57:54.000 --> 01:58:04.000
so we know that alpha beta prime mu0 is equal to 0 if and only if beta prime mu0
01:58:04.000 --> 01:58:10.000
is equal to 0, which, again, you can easily see by pre-multiplying with the pseudo-inverse
01:58:10.000 --> 01:58:11.000
of alpha.
01:58:11.000 --> 01:58:17.000
This here is the pseudo-inverse, the Moore-Penrose inverse, of alpha.
01:58:17.000 --> 01:58:24.000
So in case 1, the error correction term beta prime yt minus 1 has a mean of 0, while in
01:58:24.000 --> 01:58:29.000
case 2, the mean is different from 0.
01:58:29.000 --> 01:58:36.000
Now, exactly the same thing you can do with the trend term, the linear deterministic trend,
01:58:36.000 --> 01:58:40.000
which we had in the representation of yt.
01:58:40.000 --> 01:58:45.000
So consider now the case where mu1 is different from 0.
01:58:45.000 --> 01:58:50.000
And the expectation of yt is mu0 plus mu1t.
01:58:50.000 --> 01:58:55.000
So some components of the ys have a linear trend, right?
01:58:55.000 --> 01:59:04.000
Therefore, again, we can write the ut's as A(L) times yt minus mu0, now also minus mu1 t.
01:59:04.000 --> 01:59:15.000
And this can be rewritten, with a few transformations, as A(L) times yt minus A(1) times mu0, because
01:59:15.000 --> 01:59:18.000
a constant doesn't need the lag operator.
01:59:18.000 --> 01:59:24.000
So the lag operator can be replaced by 1 here.
01:59:24.000 --> 01:59:29.000
And then we have the A(L) times mu1 t, right?
01:59:29.000 --> 01:59:33.000
And this we write here term by term:
01:59:33.000 --> 01:59:40.000
this is the term which corresponds to the identity matrix plus the sum over the A_i, times
01:59:40.000 --> 01:59:42.000
mu1 times t minus i.
01:59:43.000 --> 01:59:50.000
Now, I want to separate the t component here from the minus i component here, which
01:59:50.000 --> 01:59:52.000
I do here.
01:59:52.000 --> 02:00:00.000
If I just have the t component here, then I can again write an A(1) term here, A(1)
02:00:00.000 --> 02:00:09.000
times mu1 times t, because I can just factor out the t here, pull it out of the sum, and
02:00:09.000 --> 02:00:17.000
then I have the sum over the A_i times mu1; but the identity matrix plus the sum over the A_i is
02:00:17.000 --> 02:00:30.000
just equal to A(1), so this gives A(1) mu1 times t, minus, well, all the
02:00:30.000 --> 02:00:35.000
i components here multiplied by A_i mu1.
02:00:35.000 --> 02:00:47.000
So that I can now combine these two terms, the A(1) mu0 term here
02:00:47.000 --> 02:00:57.000
with the A_i mu1 i terms here, to yield a new constant.
02:00:57.000 --> 02:01:01.000
But this thing here with all those i terms is some type of constant, and this is also
02:01:01.000 --> 02:01:03.000
a constant.
02:01:03.000 --> 02:01:09.000
So I have some constant nu here, and I have alpha beta prime mu1 t here, and the alpha
02:01:09.000 --> 02:01:16.000
beta prime again is, of course, the Pi matrix, the negative of A(1).
02:01:16.000 --> 02:01:23.000
So we will get: A(L) yt is equal to nu, some unrestricted constant, minus alpha beta prime
02:01:23.000 --> 02:01:27.000
mu1t plus ut.
02:01:27.000 --> 02:01:32.000
And therefore, the corresponding VECM representation would be that the first difference, the
02:01:32.000 --> 02:01:40.000
delta of the yt's, is equal to a constant nu plus alpha beta prime times yt minus 1 minus
02:01:40.000 --> 02:01:48.000
mu1 t, plus all those lagged differences, plus white noise.
02:01:48.000 --> 02:01:55.000
So the constant is unrestricted, and the linear term augments the error correction mechanism
02:01:55.000 --> 02:02:02.000
here, right, augments the error correction term, such that the expected value of it is
02:02:02.000 --> 02:02:04.000
constant.
02:02:04.000 --> 02:02:09.000
So we would have the expectation of the error correction term of the beta prime yt minus
02:02:09.000 --> 02:02:16.000
1 minus mu1t, which is this term here, is equal to beta prime times the expectation
02:02:16.000 --> 02:02:28.000
of, well, yt minus 1 minus mu1 times t minus 1, and this we can call mu0 minus mu1.
02:02:28.000 --> 02:02:35.000
So it is equal to beta prime times mu0 minus mu1, and therefore it is constant.
02:02:35.000 --> 02:02:43.000
Now, due to the fact that the expectation of the delta yt is constant, we would have
02:02:43.000 --> 02:02:48.000
a deterministic trend if the constant is different from 0.
02:02:48.000 --> 02:02:52.000
So the distribution of the Johansen test is, again, different in the cases where alpha
02:02:52.000 --> 02:03:02.000
beta prime mu1 is equal to 0 and where alpha beta prime mu1 is different from 0.
02:03:02.000 --> 02:03:11.000
Case 3 may occur despite mu1 itself being different from 0, if the trends in the components
02:03:11.000 --> 02:03:14.000
of yt cancel each other.
02:03:14.000 --> 02:03:18.000
Case 4, on the other hand, implies that the co-integrating linear combination beta prime
02:03:18.000 --> 02:03:25.000
yt minus 1 has a linear trend, because then we can write e of beta prime yt minus 1 is
02:03:25.000 --> 02:03:33.000
equal to beta prime e of yt minus 1 is equal to beta prime mu0 plus beta prime mu1 times
02:03:33.000 --> 02:03:35.000
t minus 1, and here it is.
02:03:35.000 --> 02:03:41.000
This is the linear trend.
02:03:41.000 --> 02:03:52.000
So beta prime yt minus 1 is trend stationary, and subtracting beta prime mu1 t just offsets
02:03:52.000 --> 02:03:53.000
this trend.
02:03:53.000 --> 02:03:59.000
It offsets this trend such that the error correction term then fluctuates about a constant
02:03:59.000 --> 02:04:00.000
mean.
02:04:01.000 --> 02:04:09.000
In these two cases, case 3 and 4, the trend is confined to the co-integrating relationship.
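In the same notation, the setup behind cases 3 and 4 reads (again with Gamma_i for the unnamed lagged-difference coefficients):

```latex
\Delta y_t = \nu + \alpha\,(\beta' y_{t-1} - \beta'\mu_1 t)
           + \sum_{i=1}^{p-1}\Gamma_i\,\Delta y_{t-i} + u_t ,
\qquad
\mathbb{E}\big[\beta' y_{t-1} - \beta'\mu_1 t\big] = \beta'(\mu_0-\mu_1) ,
```

with case 3 being beta prime mu1 equal to 0 (the trend drops out of the error correction term) and case 4 being beta prime mu1 different from 0.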
02:04:09.000 --> 02:04:15.000
There's also a fifth case, which, again, does not occur so often in practice.
02:04:15.000 --> 02:04:21.000
It is not implemented in JMulTi, but you would find it in EViews, where we have
02:04:15.000 --> 02:04:21.000
an unrestricted linear trend in the VECM.
02:04:24.000 --> 02:04:30.000
This case would be appropriate if yt had a quadratic trend, which we don't see so often
02:04:30.000 --> 02:04:36.000
in real-world economic data, which is the reason why JMulTi hasn't implemented it.
02:04:36.000 --> 02:04:41.000
But as I say, if you think you have data which have this property, then you can resort to
02:04:41.000 --> 02:04:48.000
other econometric software, like EViews, where you have this trend implemented.
02:04:48.000 --> 02:04:57.000
But empirically, particularly relevant are cases 2, 3, and 4, which you find in JMulTi.
02:04:57.000 --> 02:05:04.000
And to sum this up, the JMulTi labels are: for case 2, just 'constant'; for case 3,
02:05:04.000 --> 02:05:08.000
'orthogonal trend'; and for case 4, 'constant and trend'.
02:05:08.000 --> 02:05:15.000
Case 3 is called an orthogonal trend because the matrix of
02:05:15.000 --> 02:05:23.000
cointegrating vectors, beta, is, in this case, just orthogonal to the mu1 component.
02:05:24.000 --> 02:05:32.000
To give you an example here, I look at Galí's work, which you perhaps know from macroeconomics
02:05:32.000 --> 02:05:40.000
lectures, his famous work, which was sort of the starting point, the kick-off paper
02:05:40.000 --> 02:05:51.000
for the New Keynesian revolution, where Galí looked at the differences of the variables:
02:05:51.000 --> 02:05:58.000
changes in labor productivity, YL, GDP over labor, so labor productivity, starting
02:05:58.000 --> 02:06:05.000
actually in 1948, which is why I have labeled this variable in JMulTi, in the data set which
02:06:05.000 --> 02:06:11.000
you have in Steenen, YL48; and hours, also starting in 1948.
02:06:11.000 --> 02:06:14.000
Galí did not look at co-integration.
02:06:14.000 --> 02:06:20.000
Galí was just looking at the first differences here, and my critique of Galí's paper is actually
02:06:21.000 --> 02:06:23.000
that he should have looked at co-integration.
02:06:25.000 --> 02:06:31.000
Labor productivity is a trending series, whereas hours per capita is not a trending
02:06:31.000 --> 02:06:38.000
series in U.S. data, so since we have a trend here but no trend there, we may only have case 4,
02:06:39.000 --> 02:06:46.000
so trend stationary error correction term, that's the only possible case, so that makes life rather easy.
02:06:47.000 --> 02:06:54.000
What we would do when we do a co-integration test, which you find in JMulTi in the initial
02:06:54.000 --> 02:07:04.000
analysis section: I have activated now case 4, constant and trend, not case 3,
02:07:04.000 --> 02:07:09.000
the orthogonal trend, and not the case without trend, because one of the variables is clearly
02:07:10.000 --> 02:07:19.000
trending. What you have to specify is the lag length, and I have specified here four
02:07:19.000 --> 02:07:25.000
lags in levels, because this is quarterly data, and you can do some analysis with information
02:07:25.000 --> 02:07:30.000
criteria, or you can look at the whiteness of the residuals, and it turns out that four lags in
02:07:30.000 --> 02:07:37.000
levels is quite a good choice. The dimension of the process is 2, because we have two variables,
02:07:37.000 --> 02:07:45.000
YL48 and H48. Now, what you see here are then the hypotheses to be tested,
02:07:46.000 --> 02:07:52.000
namely the hypothesis that the number of co-integrating vectors is zero, no co-integration,
02:07:53.000 --> 02:07:59.000
or the hypothesis that there is at the most one co-integrating vector, which is this hypothesis here.
02:07:59.000 --> 02:08:05.000
This is the r0 which I just had in my formula when I explained to you how the Johansen test works.
02:08:05.000 --> 02:08:17.000
These are the likelihood ratio statistics, 25.96, so roughly 26, and 11.5 here, and this gives you the
02:08:17.000 --> 02:08:26.000
p-values for the test of each hypothesis. So here the hypothesis is no co-integration,
02:08:26.000 --> 02:08:33.000
no co-integrating vector, and this has a p-value slightly lower than 5%
02:08:34.000 --> 02:08:45.000
as you see. The correct critical values for the likelihood ratio statistics are also given,
02:08:45.000 --> 02:08:49.000
even though you don't really need them when you have the p-value, but you see the likelihood
02:08:49.000 --> 02:09:00.000
ratio statistic is 25.96, and the 5% critical value is 25.73, so 25.96 is slightly bigger than 25.73,
02:09:01.000 --> 02:09:04.000
which leads to the result that the p-value is slightly lower than 5%.
02:09:07.000 --> 02:09:12.000
Clearly not significant at the 1% level of significance, where the likelihood ratio
02:09:12.000 --> 02:09:19.000
test statistic would need to be greater than 30. In the second case we have 11.5 as the
02:09:20.000 --> 02:09:28.000
likelihood ratio statistic, and you see the 5% critical value is 12.45, so that's still quite
02:09:28.000 --> 02:09:34.000
a difference, and the p-value is 7%. So we would reject the first hypothesis, saying that there is
02:09:34.000 --> 02:09:40.000
at least one co-integrating vector, but we would accept that there is not more than one co-integrating
02:09:40.000 --> 02:09:45.000
vector. If we rejected the second hypothesis, we would say that both variables are
02:09:45.000 --> 02:09:53.000
stationary. It can be the case, but as I said it's not often the case that you find something like
02:09:53.000 --> 02:10:01.000
this. Okay, and here, by the way, is also the optimal number of lags. It is actually the case
02:10:01.000 --> 02:10:06.000
that three lags in levels would have sufficed, perhaps even two, depending on Akaike or
02:10:06.000 --> 02:10:11.000
final prediction error being used, or Hannan-Quinn being used, so they have different
02:10:13.000 --> 02:10:21.000
recommendations here, so perhaps I had too many lags in here with four. I said it was sufficient
02:10:21.000 --> 02:10:27.000
to render the residuals white noise, so I have chosen four lags, but now the information criteria
02:10:27.000 --> 02:10:33.000
actually tell me three would also have been enough, or even two. And this is why I repeat the test here
02:10:33.000 --> 02:10:42.000
with three lags. Same variables, YL48, H48. Now only three lags in levels, still the
02:10:42.000 --> 02:10:48.000
dimension of the process is two, and what do I find for the first hypothesis, the likelihood ratio
02:10:48.000 --> 02:10:57.000
statistic is 28. Now this is quite a bit bigger than the 95% critical value of 25.7, and you see
02:10:57.000 --> 02:11:08.000
the p-value drops to 2%, 2.4%. And the likelihood ratio statistic for the hypothesis that there is
02:11:08.000 --> 02:11:15.000
at most one co-integrating vector is 13.45, and finally this is now significant.
02:11:15.000 --> 02:11:23.000
This is now significant at the 5% level, so we would actually reject that too, and we would conclude
02:11:15.000 --> 02:11:23.000
that both variables are stationary. So let's go to two lags, which was the recommendation of the
02:11:32.000 --> 02:11:41.000
Schwarz and the Hannan-Quinn criteria. If we use two lags, then we see we have really strong evidence
02:11:41.000 --> 02:11:47.000
now, even at the 1% level, that there is co-integration. The hypothesis of
02:11:48.000 --> 02:11:53.000
no co-integration, that is, of the rank being equal to zero,
02:11:53.000 --> 02:11:59.000
is rejected at the 1% level, because we have a likelihood ratio statistic of 30.8,
02:11:59.000 --> 02:12:06.000
which is greater even than the critical value here at 13.67. The p-value is less than 1%.
02:12:06.000 --> 02:12:13.000
But for the hypothesis that there is at most one co-integrating vector, I'm now again
02:12:13.000 --> 02:12:21.000
there, where I cannot reject this hypothesis at the 5% level. The p-value is 9.6%, almost 10%,
02:12:22.000 --> 02:12:30.000
the likelihood ratio statistic is 10.8, and you see for the 95% level, we would need a higher value,
02:12:31.000 --> 02:12:38.000
12.45, so very clearly we cannot reject here. So the result of our analysis is, with the exception
02:12:38.000 --> 02:12:45.000
of this case of three lags, we find evidence for one co-integrating vector and evidence
02:12:45.000 --> 02:12:57.000
for co-integration in the sense that the variables are non-stationary. But a little doubt remains,
02:12:57.000 --> 02:13:03.000
because in the case of three lags, you could also argue that perhaps there isn't any unit
02:13:03.000 --> 02:13:15.000
root in the system at all. Galí actually estimated a VAR in the differences, so DYL48, DH48.
02:13:16.000 --> 02:13:23.000
My warning is that the system he estimated may be misspecified if it is true that there was
02:13:23.000 --> 02:13:27.000
co-integration in there, and then the results he produced would not be reliable.
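The rank tests run throughout this example can be sketched in a few lines. Below is a minimal numpy-only illustration of Johansen's trace statistic on simulated data; it is a sketch, not the lecture's software, it omits small-sample corrections, and the critical values quoted in the lecture come from published tabulations rather than from this code. The simulated series are hypothetical, not the YL48/H48 data.

```python
import numpy as np

def johansen_trace(y, k_ar_diff):
    """Trace statistics for the Johansen co-integration rank test.

    Minimal sketch: partial out lagged differences and a constant from
    both the first differences and the lagged levels, then solve the
    reduced-rank eigenvalue problem.  The resulting statistics are
    compared to tabulated critical values (not computed here).
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y, axis=0)
    n, p = dy.shape[0], k_ar_diff
    dep = dy[p:]                    # Delta y_t
    lev = y[p:-1]                   # y_{t-1}
    # short-run regressors: p lagged differences plus a constant
    X = np.hstack([dy[p - j:n - j] for j in range(1, p + 1)]
                  + [np.ones((n - p, 1))])

    def resid(a):                   # residuals from regressing a on X
        b, *_ = np.linalg.lstsq(X, a, rcond=None)
        return a - X @ b

    r0, r1 = resid(dep), resid(lev)
    t_eff = n - p
    s00 = r0.T @ r0 / t_eff
    s11 = r1.T @ r1 / t_eff
    s01 = r0.T @ r1 / t_eff
    # eigenvalues of inv(S11) S10 inv(S00) S01, sorted descending
    m = np.linalg.solve(s11, s01.T) @ np.linalg.solve(s00, s01)
    lam = np.sort(np.linalg.eigvals(m).real)[::-1]
    trace = [-t_eff * np.log(1.0 - lam[r:]).sum() for r in range(len(lam))]
    return lam, trace

# two series sharing one stochastic trend => co-integration rank 1
rng = np.random.default_rng(0)
trend = np.cumsum(rng.standard_normal(500))
y = np.column_stack([trend + rng.standard_normal(500),
                     0.7 * trend + rng.standard_normal(500)])
lam, trace = johansen_trace(y, k_ar_diff=2)
```

With one common trend, the statistic for the hypothesis of rank zero comes out far above typical critical values, while the statistic for at most one co-integrating vector is much smaller, mirroring the pattern discussed in the lecture.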
02:13:29.000 --> 02:13:36.000
According to my reproduction of his results, a VECM representation would have been more appropriate,
02:13:41.000 --> 02:13:47.000
with a linear trend in the co-integrating relationship. Now let's restrict the trend
02:13:47.000 --> 02:13:54.000
to the co-integrating relationship and see what the VECM model looks like. So we have,
02:13:55.000 --> 02:14:06.000
I use again the two lags in levels, as Schwarz and Hannan-Quinn have suggested,
02:14:07.000 --> 02:14:14.000
and in this case I restrict the trend to the error correction term,
02:14:15.000 --> 02:14:23.000
and I fix the co-integration rank at one in order to derive the VECM representation of the model,
02:14:24.000 --> 02:14:32.000
in which I generally allow an intercept and a trend, but restrict the trend to the error correction
02:14:32.000 --> 02:14:41.000
term. And what I then get here, in terms of the information
02:14:41.000 --> 02:14:48.000
criteria, is that Akaike and the final prediction error recommend two lags in first differences,
02:14:48.000 --> 02:14:55.000
and Hannan-Quinn and Schwarz recommend one lag in differences. So this is precisely the same
02:14:55.000 --> 02:15:02.000
as we have found before, only that here the search is done in terms of first differences,
02:15:03.000 --> 02:15:10.000
whereas before we had done it in the levels, which explains why the recommendations here
02:15:10.000 --> 02:15:14.000
fall one period short of what we had found in the levels. This is just
02:15:14.000 --> 02:15:23.000
from the levels-to-differences transformation. And here is the estimated VECM. Now the estimated
02:15:23.000 --> 02:15:33.000
VECM would give us the co-integrating vector beta prime here. So again we have the growth rates
02:15:33.000 --> 02:15:41.000
here of labor productivity and hours. This is the alpha matrix, the matrix of loading coefficients.
02:15:41.000 --> 02:15:49.000
This is the one co-integrating vector, which we estimate as one on labor productivity and
02:15:49.000 --> 02:15:54.000
negative 0.7 on hours worked. So it would tell us there's a long-run equilibrium relationship between labor
02:15:54.000 --> 02:16:02.000
productivity and hours worked. The trend is restricted to the error correction term. So
02:16:02.000 --> 02:16:07.000
you find it in here. And then there are lagged differences with certain coefficients, which are
02:16:07.000 --> 02:16:15.000
not so interesting. Here with lag one, there with lag two, and there's a constant. So that would be
02:16:15.000 --> 02:16:24.000
the VECM specification. The rest of the slides, two or three more, is just references on this
02:16:24.000 --> 02:16:34.000
type of topics. I'm sorry we just had 12 lectures rather than the usual 14 in this semester,
02:16:34.000 --> 02:16:42.000
which would have left me a little bit more time to explain these rather difficult and technical
02:16:42.000 --> 02:16:48.000
issues with co-integration and vector error correction mechanisms that I have talked about
02:16:48.000 --> 02:16:58.000
in this last lecture. But I explained everything and I hope you'll be able to reproduce what I
02:16:58.000 --> 02:17:05.000
have said at home or to understand what I have said at home because this material which I covered
02:17:05.000 --> 02:17:11.000
in this last lecture is truly important. And therefore it is relevant also for the oral exam.
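To make the error-correction mechanism concrete, here is a small simulated example in the spirit of the estimated system: a two-variable VECM with co-integrating vector beta' = (1, -0.7). The loading coefficients alpha and the innovation scale are made up for the sketch, not the lecture's estimates.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 1000
beta = np.array([1.0, -0.7])    # co-integrating vector (1, -0.7)
alpha = np.array([-0.2, 0.1])   # hypothetical loading coefficients

# simulate y_t = y_{t-1} + alpha * (beta' y_{t-1}) + noise
y = np.zeros((T, 2))
for t in range(1, T):
    ect = beta @ y[t - 1]                          # error-correction term
    y[t] = y[t - 1] + alpha * ect + 0.5 * rng.standard_normal(2)

# the equilibrium error beta' y_t is mean-reverting:
# it follows an AR(1) with coefficient 1 + beta'alpha = 0.73 here
z = y @ beta
rho = (z[1:] @ z[:-1]) / (z[:-1] @ z[:-1])         # OLS AR(1) coefficient
```

Each series on its own wanders like a random walk, but the combination beta' y_t keeps being pulled back toward zero; a VAR in first differences alone would throw this error-correction term away, which is exactly the misspecification warned about above.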
02:17:12.000 --> 02:17:17.000
Often one may say at the end of a lecture course that this is perhaps not the most decisive
02:17:18.000 --> 02:17:23.000
material that was covered, and thereby relieve students of preparing it for the oral exam.
02:17:23.000 --> 02:17:28.000
I'm not in a position to do that. You need to know this. You need to understand what
02:17:28.000 --> 02:17:34.000
co-integration is and how to test for co-integration and how to represent variables
02:17:34.000 --> 02:17:42.000
which are co-integrated in terms of vector error correction models. I will certainly not go into
02:17:42.000 --> 02:17:49.000
the very technical details of the Johansen approach and the tests for co-integration in
02:17:49.000 --> 02:17:54.000
the oral exam, but I would like to inform you that it is quite possible that I will ask you about
02:17:54.000 --> 02:18:00.000
co-integration and about tests for co-integration and about vector error correction models and their
02:18:00.000 --> 02:18:06.000
relationship to VARs in differences and VARs in levels. So please do prepare that for the final
02:18:06.000 --> 02:18:14.000
exam. I am done and time is over. Do you have any remaining questions?
02:18:18.000 --> 02:18:19.000
Please raise your hand if you do.
02:18:24.000 --> 02:18:31.000
Apparently this is not the case. Then thank you very much for your attention. Good luck for the
02:18:31.000 --> 02:18:39.000
final exam. Good luck for the rest of your studies in the master's program. I assume you
02:18:39.000 --> 02:18:47.000
are already rather advanced in your studies and of course if I do not see you in the final exam
02:18:48.000 --> 02:18:55.000
or the oral exam in the first scheduled exam then have a good semester break and perhaps until
02:18:55.000 --> 02:19:05.000
the second scheduled exam. I hope that you have benefited from this lecture and if you have,
02:19:05.000 --> 02:19:13.000
I encourage you to apply what you have learned here in your own research because time series
02:19:13.000 --> 02:19:20.000
data is a very important source of information, but it needs to be applied with care because
02:19:20.000 --> 02:19:24.000
serious errors can be committed if you do not know about the issues I have talked about
02:19:25.000 --> 02:19:33.000
in this lecture, ranging from spurious regressions to misspecifications which occur
02:19:33.000 --> 02:19:46.000
in co-integrated systems. So my best wishes for you and so on.