WEBVTT - autoGenerated
00:00:00.000 --> 00:00:09.000
Let me start with the lecture.
00:00:09.000 --> 00:00:17.000
OK, we defined two concepts last lecture, the second of which
00:00:17.000 --> 00:00:20.000
I show here again.
00:00:20.000 --> 00:00:24.000
The definition of a cumulative distribution function.
00:00:24.000 --> 00:00:26.000
By the way, there was a minor error
00:00:26.000 --> 00:00:28.000
on my slide, which I have corrected.
00:00:28.000 --> 00:00:31.000
I think my slide said cumulative density function,
00:00:31.000 --> 00:00:33.000
which is incorrect.
00:00:33.000 --> 00:00:37.000
The correct term is cumulative distribution function.
00:00:37.000 --> 00:00:38.000
Sorry for this.
00:00:38.000 --> 00:00:42.000
And in terms of content, of course, it means basically the same thing.
00:00:42.000 --> 00:00:43.000
But as I said, the correct expression
00:00:43.000 --> 00:00:45.000
is cumulative distribution function.
00:00:45.000 --> 00:00:48.000
And we have defined the cumulative distribution
00:00:48.000 --> 00:00:53.000
function as the probability of a random variable capital X
00:00:53.000 --> 00:00:56.000
taking on a value which is less than or equal
00:00:56.000 --> 00:00:58.000
to small x.
00:00:58.000 --> 00:01:02.000
And this probability we just denote by capital F of small x.
00:01:02.000 --> 00:01:06.000
So capital F of small x indicates exactly this probability.
00:01:06.000 --> 00:01:10.000
And we have already applied this to the concept
00:01:10.000 --> 00:01:13.000
of a discrete random variable.
00:01:13.000 --> 00:01:16.000
You recall the coin tossing experiment
00:01:16.000 --> 00:01:19.000
which we went through.
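As a small aside, the coin-tossing idea can be sketched in a few lines of Python. This is only an illustration, not from the lecture slides; the particular experiment (number of heads in two fair tosses) is my own assumption.

```python
from fractions import Fraction

# Sketch, not from the slides: X = number of heads in two fair coin tosses
# (the specific experiment is an assumption).  The CDF of a discrete
# variable sums the probabilities of all values not exceeding x.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    """F(x) = P(X <= x), a step function that jumps at each value of X."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

assert cdf(-1) == 0            # no probability mass below 0
assert cdf(0) == Fraction(1, 4)
assert cdf(1.5) == Fraction(3, 4)
assert cdf(2) == 1             # all mass at or below 2
```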
00:01:19.000 --> 00:01:25.000
And now we will continue where I left off last time,
00:01:25.000 --> 00:01:27.000
namely applying the same concept
00:01:27.000 --> 00:01:31.000
to a continuous random variable.
00:01:31.000 --> 00:01:33.000
The first thing that we have to do
00:01:33.000 --> 00:01:36.000
is to define what it means that the random variable is
00:01:36.000 --> 00:01:38.000
continuous.
00:01:38.000 --> 00:01:42.000
The definition you see here, a variable, a random variable
00:01:42.000 --> 00:01:45.000
x, is a continuous random variable
00:01:45.000 --> 00:01:49.000
if its cumulative distribution function is continuous.
00:01:49.000 --> 00:01:51.000
Very easy, actually.
00:01:51.000 --> 00:01:54.000
So if the cumulative distribution function,
00:01:54.000 --> 00:01:57.000
that is to say, the probability of capital X
00:01:57.000 --> 00:02:00.000
being less than or equal to small x,
00:02:00.000 --> 00:02:03.000
if this is a function which is continuous in small x,
00:02:03.000 --> 00:02:06.000
then the random variable capital X
00:02:06.000 --> 00:02:11.000
is a continuous random variable.
00:02:11.000 --> 00:02:16.000
It may be surprising to you what follows in the yellow box.
00:02:16.000 --> 00:02:19.000
Hence, a continuous random variable
00:02:19.000 --> 00:02:24.000
takes on any real value with zero probability.
00:02:24.000 --> 00:02:28.000
So that seems perhaps a little counterintuitive.
00:02:28.000 --> 00:02:32.000
But this is exactly the characteristic
00:02:32.000 --> 00:02:35.000
of a continuous random variable, that there
00:02:35.000 --> 00:02:41.000
is not a single value in R which is taken
00:02:41.000 --> 00:02:45.000
on with a positive probability.
00:02:45.000 --> 00:02:51.000
All values in R are taken on just with zero probability.
00:02:51.000 --> 00:02:54.000
And we would like to understand or would
00:02:54.000 --> 00:02:58.000
like to show you, if you don't know that, why this is the case.
00:02:58.000 --> 00:03:02.000
The proof is here, and it is actually very easy.
00:03:02.000 --> 00:03:05.000
We know from the definition of the cumulative distribution
00:03:05.000 --> 00:03:10.000
function that capital F of x is equal to the probability of capital X
00:03:10.000 --> 00:03:14.000
being less than or equal to small x.
00:03:14.000 --> 00:03:18.000
And analogously, capital F of x minus epsilon,
00:03:18.000 --> 00:03:25.000
where we think of epsilon as some small positive number,
00:03:25.000 --> 00:03:28.000
is the probability of capital X being less than or equal
00:03:28.000 --> 00:03:31.000
to x minus epsilon.
00:03:31.000 --> 00:03:35.000
So we can conclude that the probability
00:03:35.000 --> 00:03:40.000
that capital X takes on a specific value small x,
00:03:40.000 --> 00:03:41.000
so that capital X, the random variable,
00:03:41.000 --> 00:03:45.000
takes on a value which
00:03:45.000 --> 00:03:48.000
is exactly equal to this small x,
00:03:48.000 --> 00:03:54.000
is just the limit of the difference between these two
00:03:54.000 --> 00:03:55.000
probabilities.
00:03:55.000 --> 00:03:57.000
So here we have the probability that capital X
00:03:57.000 --> 00:04:01.000
is less than or equal to small x.
00:04:01.000 --> 00:04:06.000
And here we have the probability of capital X
00:04:06.000 --> 00:04:10.000
being less than or equal to x minus epsilon.
00:04:10.000 --> 00:04:14.000
Now if we let the epsilon go to 0,
00:04:14.000 --> 00:04:18.000
so we take the limit with epsilon going to 0,
00:04:18.000 --> 00:04:24.000
then obviously the result is the probability of capital
00:04:24.000 --> 00:04:28.000
X being equal to small x.
00:04:28.000 --> 00:04:32.000
Because here we have exactly this x.
00:04:32.000 --> 00:04:42.000
So this here is the probability of all events in which
00:04:42.000 --> 00:04:46.000
capital X takes a value of x or something smaller.
00:04:46.000 --> 00:04:50.000
But we subtract the probability of capital X
00:04:50.000 --> 00:04:55.000
taking on a value which is x minus epsilon or smaller.
00:04:55.000 --> 00:04:58.000
And we let the difference between x
00:04:58.000 --> 00:05:01.000
and this other value, which is a little smaller than x,
00:05:01.000 --> 00:05:07.000
go towards 0, so that the probability of X being equal to x
00:05:07.000 --> 00:05:13.000
is exactly the limit of this difference of probabilities.
00:05:13.000 --> 00:05:15.000
But this probability here is, of course,
00:05:15.000 --> 00:05:20.000
the cumulative distribution function value at x.
00:05:20.000 --> 00:05:21.000
So it's capital F of x.
00:05:21.000 --> 00:05:26.000
And this value here is capital F of x minus epsilon.
00:05:26.000 --> 00:05:29.000
Since we know that the cumulative distribution
00:05:29.000 --> 00:05:31.000
function is continuous,
00:05:31.000 --> 00:05:37.000
the limit when epsilon goes to 0 of this difference here
00:05:37.000 --> 00:05:39.000
is just 0.
00:05:39.000 --> 00:05:43.000
And thereby we have proven that the probability of x
00:05:43.000 --> 00:05:47.000
being equal to some completely arbitrary small x
00:05:47.000 --> 00:05:52.000
is 0 for any small x that we may take.
00:05:52.000 --> 00:05:55.000
So a continuous random variable never
00:05:55.000 --> 00:06:00.000
has a positive probability of taking on any particular value.
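The limit argument in this proof can be checked numerically. The sketch below is not from the lecture; it assumes a standard normal distribution as the continuous random variable and uses Python's statistics.NormalDist.

```python
from statistics import NormalDist

# Sketch, not from the lecture: for a continuous CDF, F(x) - F(x - eps)
# shrinks to 0 as eps -> 0, which is why P(X = x) = 0 for every x.
# The standard normal here is an assumed example distribution.
F = NormalDist(mu=0.0, sigma=1.0).cdf
x = 0.7
diffs = [F(x) - F(x - eps) for eps in (1e-1, 1e-3, 1e-6)]

# The interval (x - eps, x] loses probability mass as eps shrinks.
assert diffs[0] > diffs[1] > diffs[2] > 0
assert diffs[2] < 1e-5
```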
00:06:03.000 --> 00:06:07.000
Now suppose that the CDF of a random variable X
00:06:07.000 --> 00:06:10.000
is differentiable.
00:06:10.000 --> 00:06:13.000
Well, you know from basic math that if a function
00:06:13.000 --> 00:06:16.000
is differentiable, then it is also continuous.
00:06:16.000 --> 00:06:20.000
So in this case that the CDF is differentiable,
00:06:20.000 --> 00:06:23.000
the CDF is continuous.
00:06:23.000 --> 00:06:26.000
And therefore, the random variable capital X
00:06:26.000 --> 00:06:30.000
is a continuous random variable.
00:06:30.000 --> 00:06:35.000
In that case, the probability density function small f of x
00:06:35.000 --> 00:06:38.000
is just the first derivative of the CDF,
00:06:38.000 --> 00:06:41.000
as you probably know from your basic statistics.
00:06:41.000 --> 00:06:46.000
So f of x is just the derivative of capital F of x
00:06:46.000 --> 00:06:49.000
with respect to x.
00:06:49.000 --> 00:06:53.000
Or equivalently, we may write that capital F of x
00:06:53.000 --> 00:07:00.000
is just the integral from minus infinity to x over f of t dt.
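This relationship between the PDF and the CDF can be verified with a numerical derivative. Again a sketch under the assumption of a standard normal distribution; none of the specifics come from the slides.

```python
from statistics import NormalDist

# Sketch, not from the lecture: if the CDF is differentiable, the PDF is
# its first derivative.  We compare a central finite difference of the CDF
# with the PDF of an assumed standard normal distribution.
dist = NormalDist(mu=0.0, sigma=1.0)
x, h = 0.5, 1e-5
numeric_derivative = (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)

assert abs(numeric_derivative - dist.pdf(x)) < 1e-6
```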
00:07:05.000 --> 00:07:11.000
Note that if we have two constants given numbers a and b,
00:07:11.000 --> 00:07:15.000
let's say a is smaller than b without loss of generality,
00:07:15.000 --> 00:07:19.000
then the probability that a continuous random variable x
00:07:19.000 --> 00:07:23.000
lies between a and b is just the difference
00:07:23.000 --> 00:07:27.000
of the values of the cumulative distribution function.
00:07:27.000 --> 00:07:32.000
So the probability of x lying between a and b,
00:07:32.000 --> 00:07:37.000
in this case within the closed interval from a to b,
00:07:37.000 --> 00:07:41.000
is just capital F of b minus capital F of a.
00:07:41.000 --> 00:07:45.000
Or again, equivalently, we can express
00:07:45.000 --> 00:07:49.000
this difference between two values of the CDF
00:07:49.000 --> 00:07:52.000
as the area under the probability density
00:07:52.000 --> 00:07:57.000
function small f of x between the points a and b,
00:07:57.000 --> 00:08:00.000
because we know that the probability density function
00:08:00.000 --> 00:08:04.000
is just the derivative of the CDF.
00:08:04.000 --> 00:08:09.000
Here is a diagram from the Wooldridge textbook.
00:08:09.000 --> 00:08:12.000
Here you see this very clearly: you
00:08:12.000 --> 00:08:14.000
have here the probability density function
00:08:14.000 --> 00:08:17.000
small f of x for some random variable.
00:08:17.000 --> 00:08:20.000
It's not a symmetric PDF; it need not be.
00:08:20.000 --> 00:08:25.000
And we evaluate this at two points a and b,
00:08:25.000 --> 00:08:28.000
with a being smaller than b.
00:08:28.000 --> 00:08:30.000
Then we know that the probability
00:08:30.000 --> 00:08:36.000
that x lies between a and b is just the shaded area
00:08:36.000 --> 00:08:40.000
under the PDF.
00:08:40.000 --> 00:08:42.000
Or if we want to integrate, then of course,
00:08:42.000 --> 00:08:48.000
it would be capital F of b minus capital F of a,
00:08:48.000 --> 00:08:54.000
precisely because the PDF is the derivative of the CDF.
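The statement that the shaded area equals capital F of b minus capital F of a can be checked numerically as follows. The standard normal distribution and the midpoint Riemann sum are my own choices for illustration, not from the lecture.

```python
from statistics import NormalDist

# Sketch, not from the lecture: P(a <= X <= b) = F(b) - F(a), which is also
# the area under the PDF between a and b.  The standard normal and the
# midpoint Riemann sum are assumptions made for this illustration.
dist = NormalDist(mu=0.0, sigma=1.0)
a, b, n = -1.0, 1.5, 50_000
width = (b - a) / n

# Approximate the shaded area under the PDF with a midpoint Riemann sum.
area = sum(dist.pdf(a + (i + 0.5) * width) for i in range(n)) * width

assert abs(area - (dist.cdf(b) - dist.cdf(a))) < 1e-6
```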
00:08:58.000 --> 00:09:01.000
Now let's collect some important properties of CDFs,
00:09:01.000 --> 00:09:05.000
properties which do not necessarily require
00:09:05.000 --> 00:09:08.000
that the random variable is continuous.
00:09:08.000 --> 00:09:11.000
So if there is a requirement that the random variable
00:09:11.000 --> 00:09:14.000
is continuous, I will emphasize this separately.
00:09:14.000 --> 00:09:18.000
But what I collect here first are very general properties
00:09:18.000 --> 00:09:21.000
of CDFs, property number one.
00:09:21.000 --> 00:09:24.000
If we are given any number c, then the probability
00:09:24.000 --> 00:09:29.000
of x being greater than c is obviously just 1
00:09:29.000 --> 00:09:34.000
minus the probability of x being less than or equal to c.
00:09:34.000 --> 00:09:39.000
And this probability here is just 1 minus the CDF at point
00:09:39.000 --> 00:09:41.000
c.
00:09:41.000 --> 00:09:43.000
So when you are asked for probability
00:09:43.000 --> 00:09:47.000
of a random variable taking on a larger value than some given
00:09:47.000 --> 00:09:52.000
value c, you can just use the complement of the CDF
00:09:52.000 --> 00:09:57.000
and obtain 1 minus capital F of c.
00:09:57.000 --> 00:10:01.000
Moreover, for any numbers a and b with a less than
00:10:01.000 --> 00:10:04.000
or equal to b, we have that the probability of a
00:10:04.000 --> 00:10:09.000
being smaller than capital X, and capital X being less than or equal to b,
00:10:09.000 --> 00:10:15.000
is just capital F of b minus capital F of a.
00:10:15.000 --> 00:10:17.000
And for continuous random variables,
00:10:17.000 --> 00:10:22.000
we moreover have that the probability of x being either
00:10:22.000 --> 00:10:29.000
greater than or equal to c or x being less than or equal to c
00:10:29.000 --> 00:10:31.000
is just the same as the probability
00:10:31.000 --> 00:10:36.000
of x being greater than c or x being smaller than c
00:10:36.000 --> 00:10:40.000
because the equal sign here is always associated
00:10:40.000 --> 00:10:42.000
with the probability of 0.
00:10:42.000 --> 00:10:47.000
So since the probability of any particular value
00:10:47.000 --> 00:10:50.000
being taken on by the random variable capital X is always
00:10:50.000 --> 00:10:54.000
0, we've proven that just a minute ago,
00:10:54.000 --> 00:10:59.000
we do not need to distinguish between the weak inequalities
00:10:59.000 --> 00:11:02.000
here and the strict inequalities here.
00:11:02.000 --> 00:11:06.000
The probability of both events is precisely the same.
00:11:06.000 --> 00:11:10.000
And that holds for greater just as it holds for smaller.
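These CDF properties can be sanity-checked by simulation. A sketch, assuming a standard normal distribution and a simple Monte Carlo; the sample size and the points c, a, b are arbitrary choices, not from the lecture.

```python
from statistics import NormalDist

# Sketch, not from the lecture: check P(X > c) = 1 - F(c) and
# P(a < X <= b) = F(b) - F(a) by Monte Carlo on a standard normal.
# The distribution, sample size, and points c, a, b are assumptions.
dist = NormalDist()
samples = dist.samples(200_000, seed=42)
c, a, b = 0.3, -0.5, 1.2

p_greater = sum(x > c for x in samples) / len(samples)
p_between = sum(a < x <= b for x in samples) / len(samples)

# Monte Carlo error is of order 1/sqrt(n), so 0.01 is a safe tolerance.
assert abs(p_greater - (1 - dist.cdf(c))) < 0.01
assert abs(p_between - (dist.cdf(b) - dist.cdf(a))) < 0.01
```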
00:11:16.000 --> 00:11:23.000
We now proceed to introduce the concept of a joint PDF
00:11:23.000 --> 00:11:26.000
and of a marginal PDF.
00:11:26.000 --> 00:11:30.000
We do this for the discrete case first.
00:11:30.000 --> 00:11:33.000
So suppose you have a discrete random variable.
00:11:33.000 --> 00:11:36.000
Again, think of the coin tossing example.
00:11:36.000 --> 00:11:39.000
Think of a random variable, which
00:11:39.000 --> 00:11:43.000
can take on a finite number of values.
00:11:43.000 --> 00:11:47.000
It takes on each of these values with some well-defined probability,
00:11:47.000 --> 00:11:49.000
which is greater than or equal to 0.
00:11:49.000 --> 00:11:52.000
But in many cases, it is greater than 0
00:11:52.000 --> 00:11:56.000
because the sum of all the probabilities must be 1.
00:11:56.000 --> 00:11:57.000
So that's a discrete random variable
00:11:57.000 --> 00:12:01.000
like we covered in the last lesson.
00:12:01.000 --> 00:12:05.000
If we have a joint probability distribution
00:12:05.000 --> 00:12:10.000
for two discrete random variables, which we call X and Y,
00:12:10.000 --> 00:12:12.000
then this joint probability distribution
00:12:12.000 --> 00:12:15.000
consists of two things, namely a listing
00:12:15.000 --> 00:12:19.000
of all the possible combinations of the values
00:12:19.000 --> 00:12:22.000
which the two variables can take.
00:12:22.000 --> 00:12:26.000
And second, the joint probabilities
00:12:26.000 --> 00:12:30.000
associated with each such combination.
00:12:30.000 --> 00:12:33.000
So we would combine all possible values
00:12:33.000 --> 00:12:36.000
of X, all the small x's,
00:12:36.000 --> 00:12:39.000
with all possible values the variable Y can take on.
00:12:39.000 --> 00:12:43.000
So with all small y's, which then
00:12:43.000 --> 00:12:46.000
gives a great number of possible combinations of X and Y
00:12:46.000 --> 00:12:51.000
values, which can be taken on by X and Y jointly.
00:12:51.000 --> 00:12:53.000
And then we have to know the joint probabilities.
00:12:53.000 --> 00:12:59.000
So the probabilities of X taking on small x
00:12:59.000 --> 00:13:05.000
and Y taking on small y for any combination of small x
00:13:05.000 --> 00:13:06.000
and small y.
00:13:06.000 --> 00:13:08.000
So we need to have a great number of probabilities
00:13:08.000 --> 00:13:11.000
in most cases.
00:13:11.000 --> 00:13:14.000
That would describe the joint probability distribution
00:13:14.000 --> 00:13:16.000
of two random variables.
00:13:16.000 --> 00:13:18.000
And the concept naturally generalizes
00:13:18.000 --> 00:13:21.000
for a joint probability distribution of more
00:13:21.000 --> 00:13:25.000
than two random variables.
00:13:25.000 --> 00:13:29.000
We denote such a joint probability density function
00:13:29.000 --> 00:13:36.000
by the symbol f index capital X comma capital Y.
00:13:36.000 --> 00:13:42.000
evaluated at some point small x, small y.
00:13:42.000 --> 00:13:45.000
So the joint probability density function
00:13:45.000 --> 00:13:50.000
would be the probability that X takes on value small x
00:13:50.000 --> 00:13:55.000
and Y takes on value small y, where small x and small y
00:13:55.000 --> 00:13:59.000
are just elements from the list of possible values
00:13:59.000 --> 00:14:03.000
the random variables capital X and capital Y can take.
00:14:03.000 --> 00:14:06.000
And here we have one particular combination
00:14:06.000 --> 00:14:11.000
of the outcomes of these two random variables.
00:14:11.000 --> 00:14:15.000
Obviously, the sum of all the probabilities in a joint
00:14:15.000 --> 00:14:20.000
discrete distribution is always equal to unity.
00:14:20.000 --> 00:14:22.000
So if I sum all the probabilities
00:14:22.000 --> 00:14:28.000
for all possible joint events of x and y taking certain values
00:14:28.000 --> 00:14:31.000
x and y, then the result of the sum
00:14:31.000 --> 00:14:35.000
must be the total probability, namely 1.
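As a tiny illustration of a joint discrete distribution summing to unity, here is a sketch in Python; the probability table below is made up, not from the lecture.

```python
from fractions import Fraction

# Sketch: a made-up joint PMF for two discrete variables X and Y
# (the numbers are illustrative, not from the lecture).  Summing the
# joint probabilities over every (x, y) combination must give unity.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

assert sum(joint.values()) == 1
```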
00:14:39.000 --> 00:14:43.000
From this concept of a joint probability distribution,
00:14:43.000 --> 00:14:48.000
we move on to what we call a marginal probability density
00:14:48.000 --> 00:14:51.000
function.
00:14:51.000 --> 00:14:52.000
The definition is as follows.
00:14:52.000 --> 00:14:56.000
If x and y are jointly distributed,
00:14:56.000 --> 00:15:01.000
the probability density functions f index x and f index
00:15:01.000 --> 00:15:06.000
y are called the marginal probability density functions,
00:15:06.000 --> 00:15:11.000
where f index x of x is the probability
00:15:11.000 --> 00:15:17.000
that x takes on value small x, regardless
00:15:17.000 --> 00:15:22.000
of which value the random variable y takes.
00:15:22.000 --> 00:15:27.000
And similarly, f index y of small y
00:15:27.000 --> 00:15:31.000
is the probability that the random variable capital Y
00:15:31.000 --> 00:15:34.000
assumes value small y, regardless
00:15:34.000 --> 00:15:41.000
of which value the random variable capital X takes on.
00:15:41.000 --> 00:15:47.000
So these expressions fx of small x and fy of small y
00:15:47.000 --> 00:15:51.000
are called the marginal probability density functions
00:15:51.000 --> 00:15:54.000
of a joint probability distribution of two random
00:15:54.000 --> 00:15:57.000
variables x and y.
00:16:00.000 --> 00:16:07.000
Clearly, it must be the case that f index x at point x
00:16:07.000 --> 00:16:11.000
is just the sum over the probabilities
00:16:11.000 --> 00:16:19.000
for the joint distribution, where we sum f index X comma Y of x and y
00:16:19.000 --> 00:16:25.000
sub j over all possible values that the variable Y can take.
00:16:25.000 --> 00:16:29.000
So we let the index j here run from minus infinity
00:16:29.000 --> 00:16:32.000
to plus infinity, so over all possible j's,
00:16:32.000 --> 00:16:37.000
and thereby we sum over all possible values
00:16:37.000 --> 00:16:42.000
that the random variable capital Y may take
00:16:42.000 --> 00:16:47.000
for some value of x for which we want
00:16:47.000 --> 00:16:49.000
to compute the marginal probability density
00:16:49.000 --> 00:16:50.000
function at point x.
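The summation defining the marginal PMF can be sketched directly. The joint table is a made-up example, not from the lecture.

```python
from fractions import Fraction

# Sketch: the marginal PMF of X sums the joint PMF over all values y_j
# of Y, holding x fixed.  The joint table is a made-up illustration.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def marginal_x(x):
    """f_X(x) = sum over j of f_{X,Y}(x, y_j)."""
    return sum((p for (xv, yv), p in joint.items() if xv == x), Fraction(0))

assert marginal_x(0) == Fraction(1, 2)
assert marginal_x(1) == Fraction(1, 2)
assert marginal_x(0) + marginal_x(1) == 1   # marginals also sum to unity
```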
00:16:55.000 --> 00:16:58.000
I hope that this way of introducing
00:16:58.000 --> 00:17:01.000
the marginal probability density function
00:17:01.000 --> 00:17:05.000
is intuitive, because this immediately
00:17:05.000 --> 00:17:09.000
allows you to infer how the marginal probability density
00:17:09.000 --> 00:17:15.000
function is defined in the case of a continuous random variable.
00:17:15.000 --> 00:17:17.000
The difference between discrete variables
00:17:17.000 --> 00:17:20.000
and continuous variables basically
00:17:20.000 --> 00:17:24.000
being that in the discrete case, a random variable
00:17:24.000 --> 00:17:30.000
can take on only a finite number of real values,
00:17:30.000 --> 00:17:34.000
whereas in the case of a continuous random variable,
00:17:34.000 --> 00:17:41.000
the random variable can take on infinitely many values
00:17:41.000 --> 00:17:47.000
out of the infinitely many real numbers.
00:17:47.000 --> 00:17:50.000
It can basically take on any real number,
00:17:50.000 --> 00:17:54.000
or at least real numbers over intervals,
00:17:54.000 --> 00:17:57.000
where any real number within such an interval
00:17:57.000 --> 00:18:02.000
can be assumed by the random variable.
00:18:02.000 --> 00:18:11.000
And it is not even ensured that there are just
00:18:11.000 --> 00:18:14.000
countably many numbers x and y.
00:18:14.000 --> 00:18:16.000
No, it's not the case.
00:18:16.000 --> 00:18:20.000
There may be more than countably many numbers,
00:18:20.000 --> 00:18:24.000
since we define these random variables
00:18:24.000 --> 00:18:27.000
in the space of real numbers of which there
00:18:27.000 --> 00:18:30.000
are more than countably many.
00:18:30.000 --> 00:18:36.000
So the marginal PDF of a continuous random variable
00:18:36.000 --> 00:18:43.000
is just the continuous analog of the discrete marginal PDF.
00:18:43.000 --> 00:18:47.000
We just replace the sum sign by an integral,
00:18:47.000 --> 00:18:52.000
because the integral is just the analog of summing something
00:18:52.000 --> 00:18:54.000
in the continuous world.
00:18:54.000 --> 00:18:58.000
Actually, sum is a word which begins with an s,
00:18:58.000 --> 00:19:06.000
and this is why we use the Greek S, sigma, as the symbol for a sum.
00:19:06.000 --> 00:19:11.000
And the integral sign is actually also just a stylized S,
00:19:11.000 --> 00:19:14.000
which was invented by Leibniz in order
00:19:14.000 --> 00:19:20.000
to denote summing on a continuum,
00:19:20.000 --> 00:19:25.000
where there are more than countably many numbers to count.
00:19:25.000 --> 00:19:27.000
So that integral sign, you can just
00:19:27.000 --> 00:19:31.000
read as a summation, just the same way
00:19:31.000 --> 00:19:34.000
as you can read the sigma sign, as a summation,
00:19:34.000 --> 00:19:37.000
with the only difference being that here we
00:19:37.000 --> 00:19:40.000
have finitely many elements to sum,
00:19:40.000 --> 00:19:43.000
and here we have infinitely many and more than countably
00:19:43.000 --> 00:19:46.000
many elements to sum.
00:19:46.000 --> 00:19:54.000
So we sum all the values f x, y at points x and y
00:19:54.000 --> 00:20:01.000
over all possible y's, holding x constant in this integral.
00:20:01.000 --> 00:20:06.000
This is why the value of the marginal PDF still
00:20:06.000 --> 00:20:09.000
depends on x, since we hold the x constant here,
00:20:09.000 --> 00:20:12.000
but we sum over the y's.
00:20:12.000 --> 00:20:14.000
And here we have also summed over the y's
00:20:14.000 --> 00:20:18.000
by letting the index j go over all possible values
00:20:18.000 --> 00:20:22.000
of an index, thereby ensuring that we cover
00:20:22.000 --> 00:20:27.000
all possible values of y holding x constant.
00:20:27.000 --> 00:20:28.000
So this is completely analogous to what
00:20:28.000 --> 00:20:33.000
we do here in the discrete world and in the continuous world.
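The continuous analog, replacing the sum by an integral, can be sketched as follows. The joint density f(x, y) = x + y on the unit square is an assumed textbook-style example, not from the lecture.

```python
# Sketch, not from the lecture: the marginal PDF of a continuous variable
# integrates the joint density over y, f_X(x) = integral of f_{X,Y}(x, y) dy.
# The joint density f(x, y) = x + y on the unit square is an assumed example.

def joint_pdf(x, y):
    # A valid density: nonnegative and integrates to 1 over the unit square.
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def marginal_x(x, n=10_000):
    """Midpoint-rule integral of the joint density over y, holding x constant."""
    width = 1.0 / n
    return sum(joint_pdf(x, (j + 0.5) * width) for j in range(n)) * width

# Analytically the marginal is f_X(x) = x + 1/2 on [0, 1].
assert abs(marginal_x(0.3) - 0.8) < 1e-9
```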
00:20:37.000 --> 00:20:41.000
Now we have covered the joint distribution
00:20:41.000 --> 00:20:43.000
and the marginal distribution.
00:20:43.000 --> 00:20:48.000
It remains to talk about conditional distributions.
00:20:48.000 --> 00:20:51.000
Conditional distributions are of particular interest
00:20:51.000 --> 00:20:54.000
to economists because they open the door
00:20:54.000 --> 00:20:59.000
for doing causal analysis, as you will see in a minute.
00:20:59.000 --> 00:21:05.000
Obviously, economists are always interested in causal analysis.
00:21:05.000 --> 00:21:07.000
So suppose that we are interested
00:21:07.000 --> 00:21:12.000
in the probability of a random variable capital Y,
00:21:12.000 --> 00:21:16.000
assuming a certain value, small y,
00:21:16.000 --> 00:21:19.000
given that another random variable x takes
00:21:19.000 --> 00:21:23.000
on a specific value, small x.
00:21:23.000 --> 00:21:27.000
That is a conditional probability.
00:21:27.000 --> 00:21:30.000
And having conditional probabilities,
00:21:30.000 --> 00:21:33.000
we can define conditional distributions.
00:21:33.000 --> 00:21:36.000
For instance, a conditional probability density function,
00:21:36.000 --> 00:21:41.000
conditional PDF rather than a marginal PDF.
00:21:41.000 --> 00:21:46.000
The conditional PDF is denoted f index y given x.
00:21:46.000 --> 00:21:54.000
And you have to read this as evaluated at some point y,
00:21:54.000 --> 00:22:02.000
given some point small x, which random variable X has assumed.
00:22:02.000 --> 00:22:07.000
So this thing here denotes, to say it
00:22:07.000 --> 00:22:18.000
precisely, the probability
00:22:18.000 --> 00:22:21.000
of the random variable
00:22:21.000 --> 00:22:28.000
capital Y taking on value small y,
00:22:28.000 --> 00:22:31.000
so the real number small y, given
00:22:31.000 --> 00:22:37.000
that the random variable capital X has taken on
00:22:37.000 --> 00:22:41.000
the real number small x.
00:22:41.000 --> 00:22:47.000
This conditional probability, or this conditional density,
00:22:47.000 --> 00:22:50.000
well, it is actually not quite correct to speak
00:22:50.000 --> 00:22:53.000
of a probability here, because in the case
00:22:53.000 --> 00:22:57.000
of a continuous random variable, the probability, as we have seen,
00:22:57.000 --> 00:22:58.000
is always 0.
00:22:58.000 --> 00:23:00.000
So we can just speak of a density.
00:23:00.000 --> 00:23:03.000
So it's actually not a probability,
00:23:03.000 --> 00:23:04.000
or it would just be a probability
00:23:04.000 --> 00:23:07.000
in the discrete world, but it would not
00:23:07.000 --> 00:23:10.000
be a probability in the continuous world.
00:23:10.000 --> 00:23:14.000
In the continuous world, we have to speak of density here.
00:23:14.000 --> 00:23:17.000
This conditional density here is defined
00:23:17.000 --> 00:23:24.000
to be the ratio between the unconditional joint density,
00:23:24.000 --> 00:23:28.000
f for the joint distribution of random variables x and y
00:23:28.000 --> 00:23:34.000
at point small x and small y, divided
00:23:34.000 --> 00:23:40.000
by the marginal density of x at the point x, which is given.
00:23:44.000 --> 00:23:56.000
If we define this concept here for all x, where f of x is
00:23:56.000 --> 00:24:01.000
positive, then we have defined the conditional probability
00:24:01.000 --> 00:24:02.000
density function.
00:24:05.000 --> 00:24:08.000
This probability density function
00:24:08.000 --> 00:24:13.000
can, of course, only be defined at those points
00:24:13.000 --> 00:24:18.000
where the marginal density of x is positive,
00:24:18.000 --> 00:24:22.000
because wherever the marginal density is 0,
00:24:22.000 --> 00:24:25.000
we would divide by 0 here, and that is not allowed.
00:24:25.000 --> 00:24:28.000
So then the density, the conditional probability
00:24:28.000 --> 00:24:31.000
density would not be well defined.
00:24:31.000 --> 00:24:35.000
But for positive values of the marginal PDF,
00:24:35.000 --> 00:24:38.000
we can define the conditional PDF.
00:24:38.000 --> 00:24:41.000
And the definition is then that we
00:24:41.000 --> 00:24:50.000
compute the density of y given x as the unconditional density
00:24:50.000 --> 00:24:54.000
of x and y, of the variables x and y,
00:24:54.000 --> 00:24:59.000
taking on specific values small x and small y,
00:24:59.000 --> 00:25:04.000
and divide this by the marginal density,
00:25:04.000 --> 00:25:08.000
if you want to get an intuitive understanding of it,
00:25:08.000 --> 00:25:16.000
by the probability that x takes on value small x,
00:25:16.000 --> 00:25:21.000
regardless of what happens to y.
00:25:21.000 --> 00:25:25.000
So the conditional density is typically
00:25:25.000 --> 00:25:29.000
larger than the unconditional joint density,
00:25:29.000 --> 00:25:32.000
because... no, this is not quite right.
00:25:32.000 --> 00:25:34.000
If it were a probability, let's say it this way,
00:25:34.000 --> 00:25:36.000
if this f here were a probability,
00:25:36.000 --> 00:25:41.000
if we are in the discrete world, then the conditional
00:25:41.000 --> 00:25:44.000
probability would typically be larger
00:25:44.000 --> 00:25:47.000
than the unconditional joint probability,
00:25:47.000 --> 00:25:51.000
because we divide by the marginal probability
00:25:51.000 --> 00:25:55.000
and the marginal probability is smaller than 1, typically.
00:25:55.000 --> 00:25:57.000
So we divide, typically, by something
00:25:57.000 --> 00:25:59.000
which is smaller than 1.
00:25:59.000 --> 00:26:05.000
Ignoring certain special cases, it is typically smaller than 1.
00:26:05.000 --> 00:26:12.000
So we increase the value of f index XY, the joint probability,
00:26:12.000 --> 00:26:16.000
by dividing it by a probability which
00:26:16.000 --> 00:26:17.000
is smaller than 1.
00:26:17.000 --> 00:26:20.000
Therefore, in the discrete world,
00:26:20.000 --> 00:26:27.000
the conditional probability is typically
00:26:27.000 --> 00:26:32.000
greater than the joint probability, which
00:26:32.000 --> 00:26:37.000
is in the numerator of this expression here.
00:26:37.000 --> 00:26:42.000
Obviously, this does not hold in the continuous world.
00:26:42.000 --> 00:26:44.000
So if we have a continuous random variable,
00:26:44.000 --> 00:26:48.000
or actually both x and y are continuous random variables,
00:26:48.000 --> 00:26:51.000
then clearly, we divide two density values here.
00:26:51.000 --> 00:26:54.000
And it is quite possible that f index X
00:26:54.000 --> 00:26:57.000
takes a density value which is greater than 1.
00:26:57.000 --> 00:26:58.000
So actually, the conditional density
00:26:58.000 --> 00:27:01.000
can be smaller than the joint density.
00:27:01.000 --> 00:27:07.000
So this is not analogous to the case of a discrete random variable.
00:27:07.000 --> 00:27:10.000
From this definition, we immediately
00:27:10.000 --> 00:27:14.000
have two quite important implications,
00:27:14.000 --> 00:27:16.000
namely just by turning the previous equation
00:28:16.000 --> 00:28:21.000
around, basically by just multiplying
00:28:21.000 --> 00:28:26.000
with f index X of x on both sides, we get
00:27:26.000 --> 00:27:34.000
that the joint density is equal to the conditional density of y
00:27:34.000 --> 00:27:39.000
given x times the marginal density of x.
00:27:39.000 --> 00:27:46.000
And similarly, this is equal to the conditional density of x
00:27:46.000 --> 00:27:52.000
given y, at point x given y, small x, small y,
00:27:52.000 --> 00:27:58.000
times the marginal density of y taking on value small y.
00:27:58.000 --> 00:28:03.000
If we have the case of a discrete random variable,
00:28:03.000 --> 00:28:07.000
which as I just told you is slightly different from the case
00:28:07.000 --> 00:28:10.000
of a continuous random variable, we
00:28:10.000 --> 00:28:15.000
know that the conditional density f index y given x
00:28:15.000 --> 00:28:19.000
is the same thing as the probability that capital Y is equal
00:28:19.000 --> 00:28:25.000
to small y, given that the random variable capital X
00:28:25.000 --> 00:28:32.000
is equal to small x.
00:28:32.000 --> 00:28:36.000
So this thing here, we call the conditional probability
00:28:36.000 --> 00:28:38.000
of a certain event, namely the event that y
00:28:38.000 --> 00:28:41.000
is equal to small y, given that capital X is equal to small x.
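The definition of the conditional PMF, and the implication that the joint equals the conditional times the marginal, can be sketched for a discrete example. The joint table below is made up for illustration, not from the lecture.

```python
from fractions import Fraction

# Sketch: f_{Y|X}(y | x) = f_{X,Y}(x, y) / f_X(x), defined wherever
# f_X(x) > 0.  The joint table is a made-up discrete illustration.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def marginal_x(x):
    return sum((p for (xv, _), p in joint.items() if xv == x), Fraction(0))

def conditional_y_given_x(y, x):
    fx = marginal_x(x)
    if fx == 0:
        raise ValueError("conditional PMF undefined where f_X(x) = 0")
    return joint[(x, y)] / fx

# P(Y = 1 | X = 0) = (3/8) / (1/2) = 3/4
assert conditional_y_given_x(1, 0) == Fraction(3, 4)
# Turning the equation around: joint = conditional times marginal.
assert conditional_y_given_x(1, 0) * marginal_x(0) == joint[(0, 1)]
```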
00:28:44.000 --> 00:28:49.000
This may seem complicated if you have not heard this before.
00:28:49.000 --> 00:28:51.000
But I suppose you have heard all of that
00:28:51.000 --> 00:28:54.000
in your statistics lectures in your earlier studies.
00:28:54.000 --> 00:28:58.000
So it is just a recap here, a refresher
00:28:58.000 --> 00:29:01.000
of stuff which you have heard
00:29:01.000 --> 00:29:04.000
and which you have perhaps not made much use of
00:29:04.000 --> 00:29:07.000
in your previous education.
00:29:07.000 --> 00:29:09.000
But in this particular lecture, we
00:29:09.000 --> 00:29:12.000
will actually make quite a bit of use of these type of things.
00:29:12.000 --> 00:29:15.000
Because as I already told you in the first lecture,
00:29:15.000 --> 00:29:19.000
we will do quite a bit of econometric analysis
00:29:19.000 --> 00:29:20.000
of causal relations.
00:29:20.000 --> 00:29:23.000
We will actually think about what kind
00:29:23.000 --> 00:29:26.000
of econometric methods can we use in order
00:29:26.000 --> 00:29:31.000
to establish causality rather than correlation.
00:29:31.000 --> 00:29:35.000
Typically, econometrics establishes just correlation.
00:29:35.000 --> 00:29:38.000
But the questions being asked by economists
00:29:38.000 --> 00:29:41.000
are mostly questions of causality and not
00:29:41.000 --> 00:29:44.000
questions of correlation.
00:29:44.000 --> 00:29:48.000
So the economic questions we ask are often,
00:29:48.000 --> 00:29:52.000
what is the effect of x on y?
00:29:52.000 --> 00:29:54.000
What, for instance, is the effect
00:29:54.000 --> 00:29:59.000
of increasing government expenditures on income?
00:29:59.000 --> 00:30:05.000
Or what is the effect of increasing wages on employment?
00:30:05.000 --> 00:30:10.000
Or what is the effect of education
00:30:10.000 --> 00:30:13.000
on human capital formation?
00:30:13.000 --> 00:30:16.000
These are economic questions which
00:30:16.000 --> 00:30:20.000
aim at causal relationships.
00:30:20.000 --> 00:30:22.000
We are not interested in measuring
00:30:22.000 --> 00:30:25.000
certain correlations, which often are in our data
00:30:25.000 --> 00:30:30.000
just because some variables tend to grow over time,
00:30:30.000 --> 00:30:34.000
even though the growth is not related
00:30:34.000 --> 00:30:36.000
to each other in some type of causal way
00:30:36.000 --> 00:30:39.000
or may perhaps be attributed to some third variable.
00:30:39.000 --> 00:30:42.000
So we're not at all interested in correlations in most cases,
00:30:42.000 --> 00:30:45.000
but we are interested in causality.
00:30:46.000 --> 00:30:49.000
The conditional distribution or the conditional density
00:30:49.000 --> 00:30:54.000
function opens the door for causal analysis
00:30:54.000 --> 00:31:03.000
because such a conditional PDF of y given x in some way
00:31:03.000 --> 00:31:07.000
is helpful in determining how y
00:31:07.000 --> 00:31:12.000
would change if x changes.
00:31:12.000 --> 00:31:17.000
So the conditional PDF tells us how a change in x
00:31:17.000 --> 00:31:23.000
affects the likelihood of observing y.
00:31:23.000 --> 00:31:27.000
This is not yet a foolproof way to establish causality.
00:31:27.000 --> 00:31:32.000
And actually, in some cases, it may just
00:31:32.000 --> 00:31:36.000
be due to correlations or even reverse causality.
00:31:36.000 --> 00:31:40.000
So we cannot take it for granted that the conditional PDF
00:31:40.000 --> 00:31:44.000
already tells us how to do causal analysis.
00:31:44.000 --> 00:31:46.000
We will have to do much more work on that,
00:31:46.000 --> 00:31:48.000
and we'll do it later on.
00:31:48.000 --> 00:31:52.000
But it is a key building block for answering causal questions,
00:31:52.000 --> 00:31:56.000
the concept of a conditional distribution
00:31:56.000 --> 00:31:58.000
or of a conditional probability.
00:31:58.000 --> 00:32:00.000
And as I said, we will make much use of it
00:32:00.000 --> 00:32:03.000
in later parts of this lecture.
00:32:03.000 --> 00:32:07.000
So please make sure that when we come
00:32:07.000 --> 00:32:10.000
to these parts of the lecture, actually
00:32:10.000 --> 00:32:11.000
we start with this type of analysis
00:32:11.000 --> 00:32:14.000
in chapter 4 of this lecture.
00:32:14.000 --> 00:32:17.000
So after finishing with the review of probability
00:32:17.000 --> 00:32:20.000
where we are at right now, and then the review of statistics,
00:32:20.000 --> 00:32:23.000
which comes next, and then the very lengthy review
00:32:23.000 --> 00:32:27.000
of basic econometrics, which is chapter 3, only then
00:32:27.000 --> 00:32:30.000
will we return to the concept of conditional distributions.
00:32:30.000 --> 00:32:34.000
Or actually, we'll make some use of it already
00:32:34.000 --> 00:32:37.000
in the basic econometrics part where
00:32:37.000 --> 00:32:40.000
we talk about expectations, unbiased estimators
00:32:40.000 --> 00:32:42.000
in these kind of things.
00:32:42.000 --> 00:32:46.000
But mostly we'll talk about this concept
00:32:46.000 --> 00:32:49.000
in the causal analysis, which starts in chapter 4
00:32:49.000 --> 00:32:54.000
and proceeds then through much of the rest of the lecture.
00:32:54.000 --> 00:32:57.000
So please familiarize yourself again with this concept
00:32:57.000 --> 00:33:00.000
if you are no longer familiar with it.
00:33:00.000 --> 00:33:02.000
And make sure that you have it ready in your mind
00:33:02.000 --> 00:33:06.000
when we come to using this concept in particular
00:33:06.000 --> 00:33:10.000
in chapter 4 and the following chapters of this lecture.
00:33:14.000 --> 00:33:18.000
Now, one more thing to note, I have not yet
00:33:18.000 --> 00:33:21.000
introduced the concept of two random variables being
00:33:21.000 --> 00:33:22.000
independent.
00:33:22.000 --> 00:33:25.000
But I assume you know that such a concept exists,
00:33:25.000 --> 00:33:27.000
and probably you also know what it means.
00:33:27.000 --> 00:33:31.000
So here I just suppose that you are
00:33:31.000 --> 00:33:32.000
familiar with this concept.
00:33:32.000 --> 00:33:37.000
And a little later, I will actually formally define it.
00:33:37.000 --> 00:33:43.000
Suppose we have x and y as independent random variables.
00:33:43.000 --> 00:33:45.000
Then the knowledge of the value of x,
00:33:45.000 --> 00:33:51.000
so the knowledge which value random variable x has taken,
00:33:51.000 --> 00:33:53.000
tells us nothing about the probability
00:33:53.000 --> 00:33:58.000
that y takes on certain values, various values.
00:33:58.000 --> 00:34:02.000
And the reverse case is also true.
00:34:02.000 --> 00:34:06.000
So in the case of x and y being independent random variables,
00:34:06.000 --> 00:34:09.000
we would have that the conditional density f of y
00:34:09.000 --> 00:34:16.000
given x at some point y given x is equal to the marginal
00:34:16.000 --> 00:34:20.000
density f y at a point small y.
00:34:20.000 --> 00:34:25.000
And similarly, the conditional density for x given some value
00:34:25.000 --> 00:34:29.000
of y would be equal to the marginal density of x,
00:34:29.000 --> 00:34:33.000
at this particular value small x, as we have it here.
00:34:33.000 --> 00:34:37.000
Because we would know since the random variables are
00:34:37.000 --> 00:34:40.000
independent, that this value y, for instance,
00:34:40.000 --> 00:34:44.000
is completely uninformative about the question which
00:34:44.000 --> 00:34:48.000
value of x will be taken by the random variable capital
00:34:48.000 --> 00:34:52.000
x with what kind of likelihood, with what kind of probability,
00:34:52.000 --> 00:34:53.000
what kind of density.
00:34:53.000 --> 00:34:56.000
So y is completely uninformative.
00:34:56.000 --> 00:34:57.000
We just don't need it.
00:34:57.000 --> 00:35:00.000
And we can equally well consider immediately
00:35:00.000 --> 00:35:02.000
the marginal density.
00:35:02.000 --> 00:35:07.000
So it's true both for f of x given y and for f of y given x.
00:35:07.000 --> 00:35:11.000
It's completely arbitrary which of these concepts
00:35:11.000 --> 00:35:13.000
you look at here.
00:35:15.000 --> 00:35:21.000
This in turn implies then that if x and y are independent,
00:35:21.000 --> 00:35:26.000
that the joint density of x and y,
00:35:26.000 --> 00:35:29.000
which we have denoted by this notation here,
00:35:29.000 --> 00:35:34.000
is equal to the product of the marginal densities.
00:35:34.000 --> 00:35:36.000
And it's easy to show that.
00:35:36.000 --> 00:35:41.000
You just have to go back a couple of slides to see this.
00:35:41.000 --> 00:35:48.000
The joint density we had defined here,
00:35:48.000 --> 00:35:51.000
the joint density we had set in relation
00:35:51.000 --> 00:36:01.000
with the conditional density by saying that f of x and y,
00:36:01.000 --> 00:36:05.000
so the joint density, is equal to the conditional density
00:36:05.000 --> 00:36:07.000
times the marginal density of x.
00:36:07.000 --> 00:36:12.000
So conditional density given x times the marginal density
00:36:12.000 --> 00:36:13.000
of x.
00:36:13.000 --> 00:36:16.000
So keep this in mind: the joint density
00:36:16.000 --> 00:36:19.000
is equal to the product of the conditional density times
00:36:19.000 --> 00:36:21.000
the marginal density of x.
00:36:21.000 --> 00:36:24.000
Then going back here, you just see
00:36:24.000 --> 00:36:27.000
that the conditional density in the case of independent random
00:36:27.000 --> 00:36:30.000
variables is equal to the marginal density of y.
00:36:30.000 --> 00:36:37.000
So instead of multiplying the conditional density
00:36:37.000 --> 00:36:41.000
with the marginal density of x, we can equivalently multiply
00:36:41.000 --> 00:36:45.000
the marginal density of y with the marginal density of x.
00:36:45.000 --> 00:36:47.000
And this gives us this relationship here
00:36:47.000 --> 00:36:51.000
in the case of independent random variables x and y.
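As a sketch of this independence result, again with made-up marginals: if the joint PMF is built as the product of the marginals, then the conditional density of y given x collapses to the marginal density of y.

```python
# Hypothetical marginal PMFs for illustration:
px = {0: 0.3, 1: 0.7}   # f_X
py = {0: 0.6, 1: 0.4}   # f_Y

# Under independence, the joint PMF is the product of the marginals:
joint = {(x, y): px[x] * py[y] for x in px for y in py}

def cond_y_given_x(y, x):
    # f_{Y|X}(y | x) = f(x, y) / f_X(x), with f_X recovered from the joint
    return joint[(x, y)] / sum(joint[(x, yv)] for yv in py)

# Knowing x tells us nothing: the conditional equals the marginal f_Y(y).
for x in px:
    for y in py:
        assert abs(cond_y_given_x(y, x) - py[y]) < 1e-12
```

The same check works with the roles of x and y reversed, which is the arbitrariness mentioned above.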
00:36:54.000 --> 00:37:00.000
Are there any questions for this stuff I have presented here?
00:37:00.000 --> 00:37:03.000
This was certainly already a little more difficult
00:37:03.000 --> 00:37:05.000
than what I have presented last time.
00:37:05.000 --> 00:37:12.000
So we are increasing the level of instruction here
00:37:12.000 --> 00:37:16.000
and come to more advanced levels later.
00:37:16.000 --> 00:37:18.000
But we are perhaps on some type of medium scale now.
00:37:18.000 --> 00:37:21.000
If there are any questions, please send me
00:37:21.000 --> 00:37:25.000
a quick yes by means of the chat function,
00:37:25.000 --> 00:37:28.000
or raise your hand, though I
00:37:28.000 --> 00:37:31.000
don't know whether you can do that here.
00:37:31.000 --> 00:37:35.000
But if there is nothing and I don't see anything now,
00:37:35.000 --> 00:37:40.000
then I will just continue with section 1.2
00:37:40.000 --> 00:37:43.000
of this set of slides.
00:37:43.000 --> 00:37:48.000
In section 1.2, we will talk about features
00:37:48.000 --> 00:37:51.000
of probability distributions.
00:37:51.000 --> 00:37:53.000
Because very often, we are not really interested
00:37:53.000 --> 00:37:57.000
in the complete probability distribution
00:37:57.000 --> 00:38:00.000
of a random variable or a pair of random variables
00:38:00.000 --> 00:38:03.000
or even more than just a pair of random variables.
00:38:03.000 --> 00:38:05.000
We are very often only interested
00:38:05.000 --> 00:38:08.000
in selected aspects of the distributions
00:38:08.000 --> 00:38:12.000
of random variables, like, for instance, the expected value,
00:38:12.000 --> 00:38:18.000
which is one narrow aspect of the distribution of a random variable.
00:38:18.000 --> 00:38:22.000
Or we are interested in the variance of a random variable.
00:38:22.000 --> 00:38:25.000
The variance is basically equivalent to the standard
00:38:25.000 --> 00:38:25.000
deviation.
00:38:25.000 --> 00:38:27.000
As you know, the standard deviation
00:38:27.000 --> 00:38:29.000
is just the square root of the variance.
00:38:29.000 --> 00:38:33.000
So we may be interested in the degree of variability
00:38:33.000 --> 00:38:36.000
of a random variable.
00:38:36.000 --> 00:38:38.000
Or we may be interested in correlations,
00:38:38.000 --> 00:38:44.000
so in the covariance between two variables over time.
00:38:44.000 --> 00:38:46.000
All of which, the expected value and the variance
00:38:46.000 --> 00:38:49.000
and the correlations are just elements
00:38:49.000 --> 00:38:54.000
of the distribution of a random variable.
00:38:54.000 --> 00:38:59.000
But they do not capture the whole information embodied
00:38:59.000 --> 00:39:03.000
in the probability distribution.
00:39:03.000 --> 00:39:06.000
In some sense, this makes life easier for us.
00:39:06.000 --> 00:39:09.000
Because if we are just interested in these three
00:39:09.000 --> 00:39:11.000
things, for instance, and very often we
00:39:11.000 --> 00:39:15.000
are just interested in expected value of variances
00:39:15.000 --> 00:39:19.000
and correlations or covariances, then we
00:39:19.000 --> 00:39:24.000
can deal with the distribution of the random variable
00:39:24.000 --> 00:39:27.000
in a simplified way.
00:39:27.000 --> 00:39:31.000
And do not need to handle the whole distribution
00:39:31.000 --> 00:39:33.000
of the random variable.
00:39:33.000 --> 00:39:37.000
But just compute what we are interested in using
00:39:37.000 --> 00:39:42.000
certain features of the probability distribution.
00:39:42.000 --> 00:39:44.000
Now, while I think that you probably all
00:39:44.000 --> 00:39:48.000
know what an expected value is, for the sake of completeness,
00:39:48.000 --> 00:39:52.000
let me define the concept of an expected value
00:39:52.000 --> 00:39:55.000
in a formal way.
00:39:55.000 --> 00:39:57.000
The expected value of a random variable x
00:39:57.000 --> 00:40:01.000
is a weighted average of all possible values,
00:40:01.000 --> 00:40:04.000
which x may take, where the weights are
00:40:04.000 --> 00:40:08.000
determined by the PDF.
00:40:08.000 --> 00:40:11.000
So this verbal statement
00:40:11.000 --> 00:40:14.000
is actually already the definition,
00:40:15.000 --> 00:40:19.000
the complete definition.
00:40:19.000 --> 00:40:21.000
It's a verbal form of the definition.
00:40:21.000 --> 00:40:24.000
But we can also define this formally.
00:40:24.000 --> 00:40:27.000
And here we distinguish between the discrete case
00:40:27.000 --> 00:40:29.000
and the continuous case.
00:40:29.000 --> 00:40:35.000
So if x is a discrete random variable with k values, x1
00:40:35.000 --> 00:40:40.000
to xk, then the expected value of capital X
00:40:40.000 --> 00:40:45.000
is the sum over all the values which capital X can take.
00:40:45.000 --> 00:40:51.000
So in this case, x1, x2, and so forth, all the way up to xk,
00:40:51.000 --> 00:40:56.000
multiplied by the density, or since we are in the discrete
00:40:56.000 --> 00:41:02.000
case, by the probability of the random variable taking
00:41:02.000 --> 00:41:07.000
on the particular value x1 or x2 or xk.
00:41:07.000 --> 00:41:10.000
So it's just the sum of all possible outcomes
00:41:10.000 --> 00:41:14.000
of the random variable multiplied
00:41:14.000 --> 00:41:18.000
by the appropriate probability of this outcome.
00:41:18.000 --> 00:41:21.000
This gives us the expectation of random variable x
00:41:21.000 --> 00:41:23.000
if x is discrete.
00:41:23.000 --> 00:41:26.000
And then obviously, in the case where x is continuous,
00:41:26.000 --> 00:41:29.000
the same thing, the same idea is applied,
00:41:29.000 --> 00:41:35.000
just replacing the summation sign by an integral sign.
00:41:35.000 --> 00:41:40.000
So here we integrate over all possible values of x
00:41:40.000 --> 00:41:44.000
from minus infinity to plus infinity
00:41:44.000 --> 00:41:49.000
and integrate the product of x times the density of x.
00:41:49.000 --> 00:41:51.000
So to speak, intuitively, you may
00:41:51.000 --> 00:41:54.000
think of the probability of x being expressed
00:41:54.000 --> 00:41:55.000
by the density, even though that is not
00:41:55.000 --> 00:41:58.000
exact, as I pointed out, since the probability of x
00:41:58.000 --> 00:42:01.000
is always 0 and the density is not.
00:42:01.000 --> 00:42:06.000
But the density here replaces the concept of a probability.
00:42:06.000 --> 00:42:08.000
So this is x times f of x.
00:42:08.000 --> 00:42:10.000
And this is being summed over all x
00:42:10.000 --> 00:42:13.000
from minus infinity to plus infinity.
00:42:13.000 --> 00:42:16.000
That's an integral, which is the expected value
00:42:16.000 --> 00:42:18.000
of random variable x.
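Both formulas can be sketched in a few lines of Python. The discrete PMF below is hypothetical, and the continuous case uses a crude Riemann sum in place of the integral.

```python
import math

# Discrete case: E[X] = sum_j x_j * f(x_j), for a made-up PMF.
values = [1, 2, 3]
probs  = [0.2, 0.5, 0.3]
e_discrete = sum(x * p for x, p in zip(values, probs))
# 1*0.2 + 2*0.5 + 3*0.3 = 2.1

# Continuous case: E[X] = integral of x * f(x) dx, approximated by a
# Riemann sum for the standard normal density (whose true mean is 0).
def phi(x):
    # standard normal PDF
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

dx = 0.001
e_continuous = sum(x * phi(x) * dx for x in
                   (k * dx for k in range(-8000, 8001)))
# close to 0, up to discretization and floating-point error
```

The continuous formula is literally the discrete sum with the summation sign replaced by an integral, which the Riemann sum makes visible.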
00:42:23.000 --> 00:42:25.000
We have used here a certain notation,
00:42:25.000 --> 00:42:31.000
namely e of something, so e and then a term in parentheses.
00:42:31.000 --> 00:42:35.000
This e has a special mathematical structure, which
00:42:35.000 --> 00:42:37.000
I will not elaborate on.
00:42:37.000 --> 00:42:40.000
It is a so-called linear operator.
00:42:40.000 --> 00:42:43.000
And you don't really need to know what a linear operator is.
00:42:43.000 --> 00:42:47.000
All you need to know is that you can use the e
00:42:47.000 --> 00:42:50.000
within linear transformations.
00:42:50.000 --> 00:42:56.000
For instance, if you have the expectation of some scalar
00:42:56.000 --> 00:43:00.000
a times x plus some scalar b times y,
00:43:00.000 --> 00:43:04.000
then this is the same thing as a times the expectation of x
00:43:04.000 --> 00:43:07.000
plus b times the expectation of y.
00:43:07.000 --> 00:43:10.000
And the reverse is, of course, also true.
00:43:10.000 --> 00:43:13.000
So if you have some expectation x and some expectation y
00:43:13.000 --> 00:43:16.000
and you add them, and perhaps you have certain weights
00:43:16.000 --> 00:43:20.000
by which you multiply e of x and e of y,
00:43:20.000 --> 00:43:23.000
then you can write this as just one expectation.
00:43:23.000 --> 00:43:28.000
So the expectation operator E is a linear operator,
00:43:28.000 --> 00:43:32.000
which means that you can use it in all types of linear
00:43:32.000 --> 00:43:36.000
transformations and push it backwards or forwards
00:43:36.000 --> 00:43:40.000
in ways which you may find suitable.
00:43:40.000 --> 00:43:42.000
But that does not hold true anymore
00:43:42.000 --> 00:43:45.000
when there are nonlinear transformations,
00:43:45.000 --> 00:43:48.000
like for instance, multiplications or divisions
00:43:48.000 --> 00:43:52.000
or something like that.
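A minimal sketch of this linearity property, using a small hypothetical joint PMF so that all expectations can be computed exactly:

```python
# Made-up joint PMF for discrete X and Y (probabilities sum to 1).
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2}

def E(g):
    # expectation of g(X, Y) under the joint PMF
    return sum(g(x, y) * p for (x, y), p in joint.items())

a, b = 2.0, -3.0
ex, ey = E(lambda x, y: x), E(lambda x, y: y)   # E[X] = 0.6, E[Y] = 0.5
lhs = E(lambda x, y: a * x + b * y)

# Linearity: E[aX + bY] = a E[X] + b E[Y]
assert abs(lhs - (a * ex + b * ey)) < 1e-12

# Nonlinear transformations do not commute with E in general:
exy = E(lambda x, y: x * y)
# here E[XY] = 0.2, while E[X] * E[Y] = 0.6 * 0.5 = 0.3
```

The last two lines show why a product, a nonlinear transformation, cannot simply be pushed through the expectation operator.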
00:43:52.000 --> 00:43:55.000
As a matter of convention, the expected value
00:43:55.000 --> 00:43:58.000
is often denoted as mu with an appropriate index.
00:43:58.000 --> 00:44:01.000
So for instance, for variable x, you would have the index x.
00:44:01.000 --> 00:44:04.000
Mu x regularly denotes
00:44:04.000 --> 00:44:08.000
the expected value of x.
00:44:08.000 --> 00:44:11.000
And this expected value of this mu x
00:44:11.000 --> 00:44:16.000
is sometimes called the population mean.
00:44:16.000 --> 00:44:19.000
So it is the mean in the population
00:44:19.000 --> 00:44:25.000
of all possible realizations of the random variable x.
00:44:25.000 --> 00:44:27.000
You have to distinguish in statistics
00:44:27.000 --> 00:44:29.000
between the population mean, which
00:44:29.000 --> 00:44:33.000
is the true expected value of a random variable,
00:44:33.000 --> 00:44:36.000
and the sample mean, which is sometimes
00:44:36.000 --> 00:44:40.000
called also the arithmetic mean.
00:44:40.000 --> 00:44:43.000
So the sample mean is the mean which
00:44:43.000 --> 00:44:47.000
you compute in a particular sample drawn
00:44:47.000 --> 00:44:49.000
from the underlying population, drawn
00:44:49.000 --> 00:44:54.000
from the underlying distribution of the random variable x.
00:44:54.000 --> 00:44:57.000
So for instance, if you have n observations
00:44:57.000 --> 00:45:02.000
on a random variable, then you have a sample of size n.
00:45:02.000 --> 00:45:05.000
And you can, of course, compute the sample mean.
00:45:05.000 --> 00:45:09.000
But the sample mean is not the expected value,
00:45:09.000 --> 00:45:11.000
not the population mean.
00:45:11.000 --> 00:45:15.000
It's not the expected value of random variable x,
00:45:15.000 --> 00:45:17.000
but it's just the sample mean.
00:45:17.000 --> 00:45:20.000
For instance, if you have a random variable which
00:45:20.000 --> 00:45:25.000
has an expected value of 0, say a standard normal distribution,
00:45:25.000 --> 00:45:31.000
and you draw n times from the standard normal distribution,
00:45:31.000 --> 00:45:35.000
so that you have a sample of size n observations drawn
00:45:35.000 --> 00:45:38.000
independently from the same distribution
00:45:38.000 --> 00:45:42.000
where the population mean, the expected value is 0,
00:45:42.000 --> 00:45:48.000
then you can be sure that the sample mean is not 0.
00:45:48.000 --> 00:45:50.000
Actually, the probability of the sample mean,
00:45:50.000 --> 00:45:52.000
regardless of what the sample size is,
00:45:52.000 --> 00:45:55.000
as long as it is finite, the probability
00:45:55.000 --> 00:46:00.000
of the sample mean being 0 for a variable with expected value 0,
00:46:00.000 --> 00:46:02.000
this probability is always 0.
00:46:02.000 --> 00:46:04.000
This never happens.
00:46:04.000 --> 00:46:06.000
The sample mean may be quite small.
00:46:06.000 --> 00:46:11.000
It may be close to 0 with increasing number of observations.
00:46:11.000 --> 00:46:13.000
Actually, we have good reasons.
00:46:13.000 --> 00:46:15.000
And that's actually a mathematical theorem, the law
00:46:15.000 --> 00:46:19.000
of large numbers, which implies that the sample mean gets
00:46:19.000 --> 00:46:23.000
closer and closer to zero, the larger the sample is.
00:46:23.000 --> 00:46:26.000
But you never have a sample mean of 0,
00:46:26.000 --> 00:46:30.000
even though the true mean of the random variables
00:46:30.000 --> 00:46:36.000
or the population mean, its expected value, is 0.
00:46:36.000 --> 00:46:39.000
Therefore, it is important to distinguish
00:46:39.000 --> 00:46:43.000
between population mean and sample mean.
00:46:43.000 --> 00:46:46.000
And this, what I have just said, not only refers to the mean,
00:46:46.000 --> 00:46:50.000
but it also refers to other characteristics
00:46:50.000 --> 00:46:52.000
of a distribution.
00:46:52.000 --> 00:46:56.000
For instance, to the variance or the standard deviation.
00:46:56.000 --> 00:46:59.000
You may have, say, a standard normal distribution, which,
00:46:59.000 --> 00:47:03.000
as you know, has a variance of 1.
00:47:03.000 --> 00:47:09.000
Then you may compute the variance within your sample.
00:47:09.000 --> 00:47:12.000
You'll find that the variance in your sample is never 1,
00:47:12.000 --> 00:47:15.000
never exactly 1.
00:47:15.000 --> 00:47:18.000
But if the sample is large enough,
00:47:18.000 --> 00:47:22.000
you'll find values for the sample variance, which
00:47:22.000 --> 00:47:24.000
are probably close to 1.
00:47:24.000 --> 00:47:26.000
But they will never be exactly 1,
00:47:26.000 --> 00:47:31.000
because this event has probability 0.
00:47:31.000 --> 00:47:33.000
If you know how to generate random numbers on your computer
00:47:33.000 --> 00:47:37.000
with Excel or with MATLAB or some other mathematical
00:47:37.000 --> 00:47:40.000
programming software, just give it a try.
00:47:40.000 --> 00:47:44.000
Generate some random numbers
00:47:44.000 --> 00:47:47.000
from a standard normal distribution.
00:47:47.000 --> 00:47:48.000
That's easy to do.
00:47:48.000 --> 00:47:52.000
And compute the sample mean and compute the sample variance.
00:47:52.000 --> 00:47:56.000
And you'll see that the sample mean is not exactly 0,
00:47:56.000 --> 00:47:59.000
and that the sample variance is not exactly 1.
00:47:59.000 --> 00:48:01.000
Even though for the underlying distribution,
00:48:01.000 --> 00:48:06.000
the population moments, as they are called, are 0 and 1.
00:48:06.000 --> 00:48:11.000
The sample variance should be close to 1,
00:48:11.000 --> 00:48:14.000
but it will not be exactly 1, and the sample mean
00:48:14.000 --> 00:48:17.000
should be 0, but it will not be exactly 0.
00:48:17.000 --> 00:48:21.000
Hopefully, sample mean will be close to 0,
00:48:21.000 --> 00:48:24.000
and the sample variance or standard deviation
00:48:24.000 --> 00:48:25.000
will be close to 1.
00:48:25.000 --> 00:48:28.000
But they will never be exactly where
00:48:28.000 --> 00:48:32.000
their theoretical moment lies, or where this population mean
00:48:32.000 --> 00:48:34.000
or the population variance lies.
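The suggested experiment, sketched in Python rather than Excel or MATLAB; the seed and the sample size are arbitrary choices made for reproducibility.

```python
import random
import statistics

# Draw n observations independently from a standard normal distribution
# (population mean 0, population variance 1) and compute sample moments.
random.seed(42)          # fixed seed so the sketch is reproducible
n = 100_000
sample = [random.gauss(0.0, 1.0) for _ in range(n)]

m = statistics.fmean(sample)     # sample mean: close to 0, never exactly 0
v = statistics.variance(sample)  # sample variance: close to 1, never exactly 1
```

Rerunning with a different seed gives different sample moments every time, while the population moments stay fixed at 0 and 1, which is precisely the distinction being made above.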
00:48:37.000 --> 00:48:40.000
Take a coin tossing experiment again.
00:48:40.000 --> 00:48:42.000
The same coin tossing experiment which we already
00:48:42.000 --> 00:48:45.000
had in the last lecture.
00:48:45.000 --> 00:48:48.000
And recall that the PDF of the coin tossing experiment
00:48:48.000 --> 00:48:52.000
was of this particular form here.
00:48:52.000 --> 00:48:58.000
So we basically had 1, 2, 3, 4, 5 values,
00:48:58.000 --> 00:49:01.000
which the PDF may take on.
00:49:01.000 --> 00:49:09.000
Four of them are positive for the four possible events which
00:49:09.000 --> 00:49:11.000
can happen in the coin tossing experiment.
00:49:11.000 --> 00:49:14.000
Remember, we counted the number of heads
00:49:14.000 --> 00:49:17.000
after throwing the coin three times.
00:49:17.000 --> 00:49:20.000
So there can be either 0 heads or 1 head or 2 heads or 3 heads.
00:49:20.000 --> 00:49:23.000
Four different outcomes, they have different probabilities,
00:49:23.000 --> 00:49:28.000
and any other outcome has just probability 0.
00:49:28.000 --> 00:49:33.000
So the expected value is not hard to compute. 0
00:49:33.000 --> 00:49:35.000
is one possible outcome multiplied
00:49:35.000 --> 00:49:40.000
by the probability of this outcome, which is 0.125.
00:49:40.000 --> 00:49:44.000
Another possible outcome is 1, with a probability of 0.375.
00:49:44.000 --> 00:49:47.000
So take the product of the two.
00:49:47.000 --> 00:49:54.000
The possible event that there are two heads in tossing the coin
00:49:54.000 --> 00:49:59.000
has also probability of 0.375.
00:49:59.000 --> 00:50:02.000
So you multiply again the event with the probability
00:50:02.000 --> 00:50:03.000
of this event.
00:50:03.000 --> 00:50:05.000
And then, finally, you multiply the event
00:50:05.000 --> 00:50:09.000
that you toss three heads in a row.
00:50:09.000 --> 00:50:13.000
When tossing three times, well, this may also happen.
00:50:13.000 --> 00:50:16.000
It has a probability of 12.5%.
00:50:16.000 --> 00:50:23.000
So 0.125 again, multiply 3 by this probability and compute it.
00:50:23.000 --> 00:50:27.000
And you would see that the expected value of your coin
00:50:27.000 --> 00:50:32.000
tossing experiment is 1.5, which is not so surprising
00:50:32.000 --> 00:50:37.000
because it's just sort of in the middle of the probability
00:50:37.000 --> 00:50:38.000
distribution.
00:50:38.000 --> 00:50:40.000
You see, the distribution is symmetric.
00:50:40.000 --> 00:50:44.000
The extreme events 0 and 3 have probability 0.125 each.
00:50:44.000 --> 00:50:47.000
And the other two events have a much greater probability
00:50:47.000 --> 00:50:50.000
of 0.375 each.
00:50:50.000 --> 00:50:54.000
So it's not so surprising that basically the mean
00:50:54.000 --> 00:50:57.000
of these four numbers here, 1.5, is also
00:50:57.000 --> 00:51:01.000
the expected value of this coin tossing experiment,
00:51:01.000 --> 00:51:02.000
random variable.
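The calculation just described, spelled out as a short sketch:

```python
# PMF from the lecture: number of heads in three fair coin tosses.
pmf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

# E[X] = sum of each outcome times its probability:
expected = sum(x * p for x, p in pmf.items())
# 0*0.125 + 1*0.375 + 2*0.375 + 3*0.125 = 1.5
```

The symmetry of the PMF around 1.5 is why the result matches the midpoint of the four outcomes.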
00:51:02.000 --> 00:51:11.000
You also know what a standard deviation or what a variance is,
00:51:11.000 --> 00:51:11.000
I suppose.
00:51:11.000 --> 00:51:13.000
But I repeat it with the same argument
00:51:13.000 --> 00:51:17.000
with which I repeated what the expected value is.
00:51:17.000 --> 00:51:19.000
Both variance and standard deviation
00:51:19.000 --> 00:51:26.000
measure the variability of the outcomes of a random variable.
00:51:26.000 --> 00:51:28.000
The difference just being that the standard deviation
00:51:28.000 --> 00:51:30.000
is the square root of the variance.
00:51:30.000 --> 00:51:33.000
So actually, variance and standard deviation
00:51:33.000 --> 00:51:37.000
contain exactly the same information.
00:51:37.000 --> 00:51:41.000
They are completely equivalent to each other.
00:51:41.000 --> 00:51:43.000
The standard deviation is always positive.
00:51:43.000 --> 00:51:46.000
The variance is always positive with the one special case
00:51:46.000 --> 00:51:49.000
that variance and standard deviation are exactly 0.
00:51:49.000 --> 00:51:51.000
But in this case, we wouldn't really
00:51:51.000 --> 00:51:55.000
have a true random variable, but a degenerate case
00:51:55.000 --> 00:51:56.000
of a random variable.
00:51:56.000 --> 00:52:00.000
So in almost all instances, we will consider,
00:52:00.000 --> 00:52:02.000
we can safely assume that variance and standard
00:52:02.000 --> 00:52:04.000
deviation are positive.
00:52:04.000 --> 00:52:07.000
Just take the square root of the variance
00:52:07.000 --> 00:52:10.000
to get the standard deviation or square the standard deviation
00:52:10.000 --> 00:52:12.000
to compute the variance.
00:52:12.000 --> 00:52:15.000
Usually, you report only one of the two,
00:52:15.000 --> 00:52:17.000
either the variance or the standard deviation
00:52:17.000 --> 00:52:20.000
because the information of the other
00:52:20.000 --> 00:52:23.000
is then included in the information of the first.
00:52:26.000 --> 00:52:30.000
You probably also know how two variables look or how
00:52:30.000 --> 00:52:33.000
their PDFs look if they have the same mean
00:52:33.000 --> 00:52:37.000
and different variability.
00:52:37.000 --> 00:52:41.000
But I showed this to you again here for completeness
00:52:41.000 --> 00:52:45.000
by showing you a figure from Wooldridge's textbook again.
00:52:45.000 --> 00:52:54.000
Here we have two variables, x with PDF f of x like this.
00:52:54.000 --> 00:53:02.000
And we have then another random variable y with PDF f of y.
00:53:02.000 --> 00:53:05.000
And can you tell me just, let's see
00:53:05.000 --> 00:53:08.000
if it works with raising the hand.
00:53:08.000 --> 00:53:11.000
Are these the PDFs of discrete random variables
00:53:11.000 --> 00:53:13.000
or continuous random variables?
00:53:13.000 --> 00:53:25.000
Does anybody know this?
00:53:32.000 --> 00:53:35.000
OK, two of you raised your hand.
00:53:35.000 --> 00:53:36.000
Very good.
00:53:36.000 --> 00:53:37.000
Perhaps one of you can, or both of you,
00:53:37.000 --> 00:53:42.000
can just send in an answer, continuous or discrete.
00:53:52.000 --> 00:53:54.000
I don't see anything yet.
00:53:54.000 --> 00:53:55.000
Let me check.
00:53:55.000 --> 00:53:57.000
Why don't I see this on the chat?
00:53:57.000 --> 00:53:57.000
Let's see.
00:54:02.000 --> 00:54:04.000
Yeah, both of them say continuous, correct?
00:54:04.000 --> 00:54:09.000
That's continuous because there are no jumps here.
00:54:09.000 --> 00:54:15.000
I mean, in the discrete case, we would have some probability mass
00:54:15.000 --> 00:54:17.000
at selected points.
00:54:17.000 --> 00:54:20.000
And we would have zeros between these selected points.
00:54:20.000 --> 00:54:21.000
But here we have smooth functions.
00:54:21.000 --> 00:54:25.000
So obviously, these are continuous random variables
00:54:25.000 --> 00:54:29.000
underlying these PDFs here.
00:54:29.000 --> 00:54:33.000
Note that I have, or Wooldridge has illustrated this here
00:54:33.000 --> 00:54:36.000
with the case of two random variables which
00:54:36.000 --> 00:54:38.000
are centered on the same mean.
00:54:38.000 --> 00:54:44.000
So the same population mean mu for both x and y.
00:54:44.000 --> 00:54:48.000
And also, he has used symmetric distribution
00:54:48.000 --> 00:54:49.000
for both random variables.
00:54:49.000 --> 00:54:54.000
Obviously, random variables need not have symmetric distributions.
00:54:54.000 --> 00:55:01.000
But it is easier to understand what variance or standard deviation
00:55:01.000 --> 00:55:05.000
expresses when we have symmetric distributions.
00:55:05.000 --> 00:55:10.000
Here, you can easily see that the variability of the x random variable
00:55:10.000 --> 00:55:14.000
is less than the variability of the y variable
00:55:14.000 --> 00:55:19.000
because the probability of high values for y
00:55:19.000 --> 00:55:21.000
is greater than for x.
00:55:21.000 --> 00:55:25.000
Essentially, for x, we hardly observe any values
00:55:25.000 --> 00:55:27.000
greater than this particular value, which
00:55:27.000 --> 00:55:30.000
I here mark with my little hand.
00:55:31.000 --> 00:55:35.000
There's still a positive density for y.
00:55:35.000 --> 00:55:41.000
So it may well be that there are more extreme observations on y
00:55:41.000 --> 00:55:45.000
than this particular value for which we can assume
00:55:45.000 --> 00:55:48.000
that there are almost no observations which
00:55:48.000 --> 00:55:50.000
are more extreme than this particular value
00:55:50.000 --> 00:55:56.000
under the random variable x.
00:55:56.000 --> 00:56:00.000
So the variability of random variable x
00:56:00.000 --> 00:56:04.000
is clearly less than the variability of random variable y.
00:56:08.000 --> 00:56:10.000
Now we define the variance.
00:56:10.000 --> 00:56:16.000
Formally, for any random variable x, denote its expected value by mu x.
00:56:16.000 --> 00:56:19.000
So that's the expectation of x.
00:56:19.000 --> 00:56:23.000
Then we denote the variance by v of x.
00:56:23.000 --> 00:56:31.000
And this is simply defined as the expectation of x minus its expected
00:56:31.000 --> 00:56:34.000
value, squared.
00:56:34.000 --> 00:56:38.000
So you take the difference of x and its expected value.
00:56:38.000 --> 00:56:45.000
Basically, you take the deviation of x from its expected value,
00:56:45.000 --> 00:56:48.000
and you square this deviation, and then you
00:56:48.000 --> 00:56:54.000
take the expected value of the squared deviation.
00:56:54.000 --> 00:56:58.000
Probably all of you know why we square the deviation here.
00:56:58.000 --> 00:57:01.000
If we did not do that, then taking the expectation
00:57:01.000 --> 00:57:05.000
would, of course, result in a 0, because positive deviations
00:57:05.000 --> 00:57:08.000
would be matched by negative deviations, at least
00:57:08.000 --> 00:57:10.000
in the case of symmetric random variables.
00:57:10.000 --> 00:57:12.000
So that doesn't make any sense.
00:57:12.000 --> 00:57:17.000
We want to have a measure which is always positive for the deviation,
00:57:17.000 --> 00:57:23.000
regardless of whether x deviates from its mean in a positive or a negative way.
00:57:23.000 --> 00:57:26.000
This is why we squared here.
00:57:26.000 --> 00:57:31.000
And we call that the expectation of the squared deviation
00:57:31.000 --> 00:57:37.000
from the mean, the mean squared deviation.
00:57:37.000 --> 00:57:40.000
This we call the variance v of x.
00:57:40.000 --> 00:57:43.000
And that is often denoted as sigma squared,
00:57:43.000 --> 00:57:46.000
with index referring to the respective random variables,
00:57:46.000 --> 00:57:50.000
or sigma squared x in this case, is equal to v of x.
00:57:53.000 --> 00:57:56.000
Now, you probably also know that v of x can
00:57:56.000 --> 00:58:01.000
be written in a slightly easier form than this form here.
00:58:01.000 --> 00:58:07.000
Namely, v of x is also equal to the expectation of x squared minus
00:58:07.000 --> 00:58:10.000
the square of the expectation of x.
00:58:10.000 --> 00:58:14.000
So the expectation of the squared random variable
00:58:14.000 --> 00:58:20.000
minus the expectation of the random variable squared is the variance.
00:58:20.000 --> 00:58:25.000
As you can easily see from these few steps here,
00:58:25.000 --> 00:58:31.000
the variance is defined to be the expectation of the squared deviation
00:58:31.000 --> 00:58:33.000
between random variable and its mean.
00:58:34.000 --> 00:58:38.000
So we can multiply the square here out.
00:58:38.000 --> 00:58:43.000
This gives us x squared minus 2x mu of x,
00:58:43.000 --> 00:58:46.000
so 2 times x times the expectation of x,
00:58:47.000 --> 00:58:51.000
plus the squared expectation of x, so mu x squared.
00:58:53.000 --> 00:58:56.000
Which is then, since e is a linear operator,
00:58:56.000 --> 00:58:58.000
the expectation of x squared.
00:58:59.000 --> 00:59:01.000
2 is just a real number.
00:59:01.000 --> 00:59:04.000
So we can pull this out of the expectation operator.
00:59:04.000 --> 00:59:06.000
Mu x is also just a real number.
00:59:06.000 --> 00:59:07.000
It's not a random variable.
00:59:07.000 --> 00:59:10.000
We can also pull this out of the expectation operator.
00:59:10.000 --> 00:59:15.000
So it's minus 2 times the expectation of x times mu x,
00:59:15.000 --> 00:59:19.000
and obviously the expectation of x is just equal to mu x.
00:59:21.000 --> 00:59:25.000
And here now I see that something happened.
00:59:27.000 --> 00:59:33.000
I have some trouble currently with converting my files from VRT to PDF,
00:59:33.000 --> 00:59:38.000
and there is a, not a parenthesis, what is it?
00:59:41.000 --> 00:59:44.000
There's this parenthesis like symbol missing here,
00:59:44.000 --> 00:59:47.000
which tells you that e of x is just equal to mu of x.
00:59:47.000 --> 00:59:50.000
I will have to correct this later on.
00:59:50.000 --> 00:59:50.000
Sorry for that.
00:59:51.000 --> 00:59:53.000
It was actually in the previous set of slides,
00:59:53.000 --> 00:59:55.000
but now it disappeared again.
00:59:55.000 --> 01:00:00.000
So e of x is equal to mu of x, which gives minus 2 mu x squared, plus mu of x squared,
01:00:01.000 --> 01:00:05.000
because we have to take the expectation of mu of x squared.
01:00:05.000 --> 01:00:07.000
And since this is a real number,
01:00:08.000 --> 01:00:12.000
we know that the expectation is just equal to mu x squared.
01:00:12.000 --> 01:00:19.000
So we see we subtract mu x squared two times and add it once,
01:00:20.000 --> 01:00:23.000
which gives us minus mu x squared.
01:00:23.000 --> 01:00:28.000
And then we have left over e of x squared minus e of x squared.
01:00:28.000 --> 01:00:31.000
As an expression for the variance.
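The shortcut formula just derived can be checked numerically. Below is a minimal sketch in Python; the discrete distribution, with its values and probabilities, is made up purely for illustration and is not from the lecture:

```python
# Check the identity V(X) = E[X^2] - (E[X])^2 for a small discrete
# distribution (values and probabilities chosen only for illustration).
values = [1.0, 2.0, 5.0]
probs = [0.2, 0.5, 0.3]  # must sum to 1

mu = sum(p * x for p, x in zip(probs, values))                       # E[X]
var_def = sum(p * (x - mu) ** 2 for p, x in zip(probs, values))      # E[(X - mu)^2]
var_short = sum(p * x * x for p, x in zip(probs, values)) - mu ** 2  # E[X^2] - (E[X])^2

assert abs(var_def - var_short) < 1e-12
```

Both routes give the same number, which is exactly what the derivation above shows in general.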
01:00:37.000 --> 01:00:42.000
The variance of a, I should perhaps better write non-degenerate, random variable is always positive.
01:00:42.000 --> 01:00:47.000
There are certain extreme cases of random variables,
01:00:47.000 --> 01:00:53.000
which may have a zero variance, but we won't cover them in this lecture.
01:00:53.000 --> 01:00:54.000
So don't bother about that.
01:00:54.000 --> 01:01:00.000
You can safely assume that for all the random variables in this lecture,
01:01:00.000 --> 01:01:04.000
the variance of a random variable is always positive,
01:01:04.000 --> 01:01:09.000
because a square is always either zero or positive.
01:01:09.000 --> 01:01:14.000
The case of zero deviations we just exclude here and don't consider.
01:01:17.000 --> 01:01:24.000
A constant, so just a real number, may be viewed as a degenerate random variable.
01:01:24.000 --> 01:01:29.000
So we may think of a constant as a random variable without any variation.
01:01:32.000 --> 01:01:36.000
The variance of such a constant is then, of course, zero.
01:01:37.000 --> 01:01:40.000
Apparently, this is another thing which is missing here in the slides.
01:01:40.000 --> 01:01:43.000
I want to show this formally to you, but it's not really important.
01:01:43.000 --> 01:01:52.000
So we just skip it and rather note that, in the case of a constant,
01:01:53.000 --> 01:02:02.000
if and only if there is a constant C such that the probability of X is equal to C is equal to 1,
01:02:02.000 --> 01:02:05.000
we have V of X equal to zero.
01:02:05.000 --> 01:02:09.000
So this is the degenerate case of a random variable which has variance zero.
01:02:11.000 --> 01:02:20.000
This is possible for a random variable if the random variable takes on values which are
01:02:22.000 --> 01:02:32.000
different from the constant C only on a set of measure zero, if you know what a set of measure zero is.
01:02:32.000 --> 01:02:34.000
But if you don't, then don't bother about it as I said.
01:02:34.000 --> 01:02:40.000
This is a particular case which is not relevant for anything which follows in this lecture.
01:02:44.000 --> 01:02:48.000
The standard deviation then, as you know and as I have already said twice,
01:02:48.000 --> 01:02:51.000
is just the square root of the variance.
01:02:51.000 --> 01:02:54.000
And we denote the standard deviation by sigma index X,
01:02:55.000 --> 01:02:58.000
defined as the square root of the variance of X.
01:03:01.000 --> 01:03:05.000
Now let's apply all this to our coin tossing experiment example.
01:03:05.000 --> 01:03:13.000
We had already computed that the expected value of X is equal to 1.5.
01:03:13.000 --> 01:03:16.000
So mu X is equal to 1.5.
01:03:16.000 --> 01:03:21.000
So now we can set up a little table here.
01:03:21.000 --> 01:03:25.000
We have the four possible outcomes for the coin tossing experiment,
01:03:26.000 --> 01:03:31.000
which were zero heads, one head, two heads,
01:03:31.000 --> 01:03:35.000
or three heads in three tosses of the coin.
01:03:35.000 --> 01:03:38.000
And we know the associated probabilities for these events,
01:03:39.000 --> 01:03:41.000
which you find in this line here.
01:03:41.000 --> 01:03:43.000
I don't have to read them again.
01:03:43.000 --> 01:03:45.000
You recognize them from what I've said earlier.
01:03:46.000 --> 01:03:53.000
And now we can compute what is Xj minus mu X in each of these four cases.
01:03:53.000 --> 01:04:00.000
Obviously Xj minus mu X in the case that Xj is zero is zero minus 1.5.
01:04:00.000 --> 01:04:01.000
So it's minus 1.5.
01:04:03.000 --> 01:04:08.000
And here when Xj assumes the value one, it would be one minus 1.5.
01:04:09.000 --> 01:04:12.000
So Xj minus mu X is minus 0.5.
01:04:13.000 --> 01:04:15.000
And here it would be two minus 1.5.
01:04:15.000 --> 01:04:16.000
So it's 0.5.
01:04:17.000 --> 01:04:19.000
And here it would be three minus 1.5.
01:04:19.000 --> 01:04:20.000
So it's 1.5.
01:04:22.000 --> 01:04:26.000
What we have done here is that we have centered the random variable by just
01:04:26.000 --> 01:04:30.000
subtracting its mean from all possible outcomes.
01:04:31.000 --> 01:04:36.000
So we take the square of this deviation from the population mean.
01:04:36.000 --> 01:04:42.000
So Xj minus mu X squared would be in this case, minus 1.5.
01:04:42.000 --> 01:04:43.000
Square is 2.25.
01:04:44.000 --> 01:04:47.000
Minus 0.5 squared is 0.25.
01:04:49.000 --> 01:04:52.000
Plus 0.5 squared is plus 0.25.
01:04:53.000 --> 01:04:57.000
And 1.5 squared is of course 2.25.
01:04:57.000 --> 01:04:58.000
So again, this is symmetric here.
01:04:58.000 --> 01:05:04.000
We just have positive numbers now because we have squared the initial deviations.
01:05:05.000 --> 01:05:10.000
And if we now multiply the square deviations with the probabilities,
01:05:10.000 --> 01:05:16.000
and we compute this expression here, Pj, the probability of the event Xj,
01:05:17.000 --> 01:05:23.000
the probability of the event Xj multiplied by Xj minus mu X squared is then,
01:05:23.000 --> 01:05:26.000
well, whatever you find in this line here.
01:05:28.000 --> 01:05:38.000
Therefore, the variance of X or the sigma square X is just the sum of all these terms here.
01:05:38.000 --> 01:05:41.000
And if you add them up, you'll see that this is 0.75.
01:05:43.000 --> 01:05:48.000
From which it follows that the standard deviation is the square root of that.
01:05:48.000 --> 01:05:57.000
This is one half of the square root of 3, as you can easily see since 0.75 is 3 over 4.
01:05:57.000 --> 01:06:00.000
So the square root of 4 is 2, and then take the square root of 3.
01:06:00.000 --> 01:06:10.000
Check that what I showed you here,
01:06:10.000 --> 01:06:16.000
V of X is equal to E of X squared minus E of X squared, that this is true.
01:06:17.000 --> 01:06:21.000
So we take the expectation of the squared random variable
01:06:22.000 --> 01:06:27.000
minus the squared value of the expectation of the random variable.
01:06:27.000 --> 01:06:30.000
Well, what is the expectation of the squared random variable?
01:06:30.000 --> 01:06:35.000
This is, of course, well, squaring 0 gives 0,
00:70:00.000 --> 01:06:40.000
and times the probability it remains 0, so we can forget the first column here.
01:06:41.000 --> 01:06:45.000
Squaring 1 gives 1, so we have to multiply this by 0.375.
01:06:46.000 --> 01:06:49.000
This is the first term here, 0.375 times 1.
01:06:51.000 --> 01:06:53.000
Squaring 2 gives 4.
01:06:53.000 --> 01:06:58.000
We have to multiply 4 by the associated probability, which is also 0.375.
01:06:59.000 --> 01:07:02.000
And squaring 3 gives 9.
01:07:02.000 --> 01:07:09.000
And we have to multiply the 9 with the associated probability, 0.125 here.
01:07:10.000 --> 01:07:17.000
And then we subtract the squared expected value, so 1.5 squared.
01:07:17.000 --> 01:07:20.000
If you compute this, you'll see that this is just 0.75.
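This whole computation for the coin tossing example can be reproduced in a few lines of Python, a sketch using exactly the probabilities from the table above:

```python
import math

# Number of heads in three fair coin tosses and the associated probabilities.
values = [0, 1, 2, 3]
probs = [0.125, 0.375, 0.375, 0.125]

mu = sum(p * x for p, x in zip(probs, values))        # E[X] = 1.5
e_x2 = sum(p * x * x for p, x in zip(probs, values))  # E[X^2]
variance = e_x2 - mu ** 2                             # shortcut formula
sd = math.sqrt(variance)                              # standard deviation

assert abs(mu - 1.5) < 1e-12
assert abs(variance - 0.75) < 1e-12
assert abs(sd - math.sqrt(3) / 2) < 1e-12
```

The variance comes out as 0.75 and the standard deviation as half the square root of 3, just as computed by hand.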
01:07:20.000 --> 01:07:32.000
Now, we had the property of the linear operator for the expected value.
01:07:33.000 --> 01:07:39.000
And therefore, we may also ask if there is an analogous property for the variance.
01:07:40.000 --> 01:07:48.000
And indeed, there is, as I will show you now, if X is a random variable and A and B are constants,
01:07:48.000 --> 01:07:54.000
such that we have a new random variable Z defined as A plus B times X,
01:07:55.000 --> 01:08:01.000
then the variance of this new random variable Z is just B squared times the variance of X.
01:08:02.000 --> 01:08:06.000
So the A here drops out.
01:08:06.000 --> 01:08:12.000
The A does not affect the variance of Z because this is just a constant added to Z.
01:08:12.000 --> 01:08:15.000
It does not increase the variability of Z.
01:08:15.000 --> 01:08:23.000
But B, of course, increases or decreases the variability of X when B is being multiplied
01:08:23.000 --> 01:08:28.000
by X, depending on whether the absolute value of B is greater than 1 or smaller than 1.
01:08:29.000 --> 01:08:35.000
Since the variance is the expectation of the squared variable minus its mean,
01:08:37.000 --> 01:08:44.000
the B enters the formula here with a square so that the variance of Z is actually B squared
01:08:45.000 --> 01:08:50.000
times the variance of X. And here's the proof of this little property.
01:08:51.000 --> 01:09:00.000
The variance of Z is by definition equal to the expectation of Z minus E of Z squared.
01:09:03.000 --> 01:09:10.000
I put the square here outside the last parentheses. I could, of course, have spent another pair of
01:09:10.000 --> 01:09:17.000
parentheses to make sure that the expectation or that you see that the expectation refers to
01:09:17.000 --> 01:09:24.000
the squared term here. That does not mean that we square the expectation, though perhaps you could
01:09:24.000 --> 01:09:31.000
think that this notation means squaring the expectation. I will often do this, that I do not
01:09:31.000 --> 01:09:39.000
set this next pair of parentheses. But you just please remember that when there is a square back
01:09:39.000 --> 01:09:46.000
here, it means I take the expectation of a squared random variable. Whereas if I want
01:09:46.000 --> 01:09:53.000
to square the expectation, then I would write E squared of this thing here. So I would put
01:09:53.000 --> 01:09:59.000
the superscript with a 2 up here, the square up here, to indicate that this is a squared expectation.
01:10:00.000 --> 01:10:08.000
So this here means taking the expectation of a squared deviation, namely the deviation of
01:10:08.000 --> 01:10:15.000
Z from its expected value. And Z is a plus bX by definition. And the expected value
01:10:16.000 --> 01:10:23.000
is also taken from Z. So expectation of a plus bX and the thing squared.
01:10:25.000 --> 01:10:33.000
Well, the expectation of a plus b times X is, of course, just a, because this is a constant,
01:10:33.000 --> 01:10:39.000
plus b times the expectation of X, since the expectation operator is a linear operator.
01:10:39.000 --> 01:10:44.000
So we can write here minus a minus b times the expectation of X.
01:10:46.000 --> 01:10:52.000
And here we can just drop the parentheses. a plus bX is not changed. So we have a
01:10:53.000 --> 01:11:00.000
minus a, this drops out, and then we have b times X minus b times the expectation of X,
01:11:01.000 --> 01:11:09.000
which we are at now here. Take the expectation of b times X minus b times the expectation of X
01:11:09.000 --> 01:11:17.000
squared. But b is a common factor now to X and the expectation of X. So we can pull this common
01:11:17.000 --> 01:11:24.000
factor out of the square here. And this means that we have to square the b when we put it in front of
01:11:24.000 --> 01:11:30.000
the expectation operator. Therefore, we get this is b squared times the expectation of X minus E
01:11:30.000 --> 01:11:36.000
of X squared. And the expectation of X minus E of X squared is, of course, the variance of X.
01:11:37.000 --> 01:11:42.000
So we arrive at the result that the variance of Z is equal to b squared times the variance of X,
01:11:42.000 --> 01:11:47.000
which was exactly what I have stated here. So the proof is completed.
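The property just proved, V(a + bX) = b squared times V(X), can also be verified numerically. Here is a sketch using the coin tossing distribution with arbitrarily chosen constants a and b (the constants are illustrative, not from the slides):

```python
# Verify V(a + b*X) = b^2 * V(X) on the coin tossing distribution.
values = [0, 1, 2, 3]
probs = [0.125, 0.375, 0.375, 0.125]
a, b = 4.0, -3.0  # arbitrary constants for illustration

def var(vals, ps):
    """Population variance of a discrete distribution via E[X^2] - (E[X])^2."""
    mu = sum(p * v for p, v in zip(ps, vals))
    return sum(p * v * v for p, v in zip(ps, vals)) - mu ** 2

# Z = a + b*X takes the transformed values with the same probabilities.
z_values = [a + b * x for x in values]
assert abs(var(z_values, probs) - b ** 2 * var(values, probs)) < 1e-9
```

Note that the additive constant a indeed drops out: changing a leaves var(z_values, probs) unchanged.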
01:11:50.000 --> 01:11:57.000
Clearly, this implies that the standard deviation, sigma of Z, which is the square root of the
01:11:57.000 --> 01:12:08.000
variance of Z, is equal to the absolute value of b times sigma X. Be careful here, because this is
01:12:08.000 --> 01:12:16.000
sort of a trap in which you may easily step in if you are inexperienced. Do not write that the
01:12:16.000 --> 01:12:24.000
standard deviation of Z is b times sigma X. That would be wrong because b could be a negative number.
01:12:25.000 --> 01:12:31.000
We have not restricted b to be a positive number here. Z can very easily be a negative transform
01:12:31.000 --> 01:12:37.000
of X. And this doesn't play a role as long as we square the b, whether b is positive or negative,
01:12:37.000 --> 01:12:44.000
because the square is always positive. But here it would play a role. If you erroneously wrote that
01:12:44.000 --> 01:12:51.000
sigma Z is equal to b sigma X, b times sigma X, then this would imply that the standard deviation
01:12:51.000 --> 01:12:56.000
of the variable Z is negative in case b is negative. And that is, of course, wrong because
01:12:56.000 --> 01:13:01.000
the standard deviation is always positive. And the square root is always defined as the positive,
01:13:02.000 --> 01:13:09.000
as something positive. And so you need to write here, or if you program something, if you program
01:13:09.000 --> 01:13:16.000
a computer routine, be careful to really write the absolute value of b here and not b times sigma X.
01:13:16.000 --> 01:13:23.000
So in the case of the standard deviation, otherwise, however, it is clear that we just
01:13:23.000 --> 01:13:29.000
multiply the spread of X by the size of b in order to get the spread of Z.
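The warning about the absolute value can be made concrete: with a negative b, the naive formula b times sigma X yields a negative number, while the correct one uses the absolute value of b. A sketch, again assuming the coin tossing distribution:

```python
import math

values = [0, 1, 2, 3]
probs = [0.125, 0.375, 0.375, 0.125]
b = -2.0  # a negative slope, allowed since b was not restricted to be positive

mu = sum(p * x for p, x in zip(probs, values))
sd_x = math.sqrt(sum(p * (x - mu) ** 2 for p, x in zip(probs, values)))

# Z = b*X takes the transformed values with the same probabilities.
z_values = [b * x for x in values]
mu_z = sum(p * z for p, z in zip(probs, z_values))
sd_z = math.sqrt(sum(p * (z - mu_z) ** 2 for p, z in zip(probs, z_values)))

assert abs(sd_z - abs(b) * sd_x) < 1e-9  # correct: |b| * sigma_x
assert sd_z != b * sd_x                  # naive b * sigma_x would be negative
```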
01:13:33.000 --> 01:13:41.000
We proceed to define what a covariance is. And here, again, we consider two random variables,
01:13:41.000 --> 01:13:46.000
like we've done when we considered a joint distribution. For any pair of random variables,
01:13:46.000 --> 01:13:53.000
X and Y with expected values mu X and mu Y, the covariance, which I denote as
01:13:53.000 --> 01:14:03.000
cov of X and Y, is defined as the expectation of the product of the deviations of the two
01:14:03.000 --> 01:14:12.000
random variables from their expected value. So we multiply X minus mu X with Y minus mu Y.
01:14:12.000 --> 01:14:17.000
So these are two deviations multiplied with each other. And then we take the expectation
01:14:17.000 --> 01:14:26.000
of that product here. The covariance is a very important property in econometrics,
01:14:26.000 --> 01:14:32.000
because the covariance measures the degree by which two variables
01:14:33.000 --> 01:14:42.000
covary, the degree by which they are related. So it is a measure for the strength
01:14:42.000 --> 01:14:53.000
of the, mostly linear, association between X and Y. Clearly, the variance of X is a special case of
01:14:53.000 --> 01:15:00.000
a covariance. So the covariance just generalizes the idea of a variance, because if you compute
01:15:00.000 --> 01:15:08.000
the covariance of X with itself, then you just get the variance. From this definition here,
01:15:08.000 --> 01:15:16.000
if you replace the Ys here by Xs, then we have X minus mu X times X minus mu X, which is X minus
01:15:16.000 --> 01:15:23.000
mu X parentheses squared. And that taking the expectation of that square is just the variance.
01:15:24.000 --> 01:15:28.000
So the variance is simply a special case of the covariance.
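Both the definition of the covariance and the special case Cov(X, X) = V(X) can be illustrated with a small joint distribution; the joint probabilities below are made up for illustration only:

```python
# Joint pmf over pairs (x, y); the probabilities are illustrative only.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

mu_x = sum(p * x for (x, y), p in joint.items())  # E[X]
mu_y = sum(p * y for (x, y), p in joint.items())  # E[Y]

# Covariance by definition: E[(X - mu_x)(Y - mu_y)].
cov_def = sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in joint.items())
# Shortcut form: E[XY] - mu_x * mu_y.
cov_short = sum(p * x * y for (x, y), p in joint.items()) - mu_x * mu_y
assert abs(cov_def - cov_short) < 1e-12

# Cov(X, X) is just the variance of X.
var_x = sum(p * (x - mu_x) ** 2 for (x, y), p in joint.items())
cov_xx = sum(p * (x - mu_x) * (x - mu_x) for (x, y), p in joint.items())
assert abs(cov_xx - var_x) < 1e-12
```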
01:15:30.000 --> 01:15:38.000
And therefore, it is not surprising that a similar property holds for the covariance,
01:15:38.000 --> 01:15:45.000
as we have established it already for the variance. Namely, we can simplify the computation of the
01:15:45.000 --> 01:15:52.000
covariance. We do not need to take the expectation of this product here, which would involve an
01:15:52.000 --> 01:15:59.000
expectation of four different terms. Rather, we can also write the covariance is the expectation of
01:15:59.000 --> 01:16:07.000
X times Y minus the product of the expected values of X and Y. And I could have given that
01:16:07.000 --> 01:16:11.000
to you as a first exercise, but here I still go through the details, even though I don't doubt
01:16:11.000 --> 01:16:17.000
that you have seen that in statistics one or two. If you take the definition of the covariance in
01:16:17.000 --> 01:16:26.000
this form here, and then you just multiply out the two terms here, the two terms in parentheses,
01:16:26.000 --> 01:16:36.000
then you get that this is the expectation of X times Y minus mu X times Y minus mu Y times X
01:16:36.000 --> 01:16:45.000
plus the product of the two expected values mu X and mu Y. And since mu X is just a number and mu
01:16:45.000 --> 01:16:53.000
Y is just a number, we can then apply the expectations operator component wise to this
01:16:53.000 --> 01:17:01.000
expression and we get this is E of the expected value of the product X and Y, which we have here.
01:17:02.000 --> 01:17:09.000
And then the expectation of a constant mu X times Y. So we can pull the constant out of the
01:17:09.000 --> 01:17:16.000
expectation and this gives mu X times the expectation of Y. Similar here, mu Y times
01:17:16.000 --> 01:17:24.000
the expectation of X plus mu X mu Y. Now, obviously, E of Y is just the expectation of Y
01:17:25.000 --> 01:17:35.000
and E of X, sorry, E of Y is just mu Y and E of X is just mu X. So we have minus mu X mu Y
01:17:36.000 --> 01:17:42.000
minus mu Y mu X plus mu X mu Y, this simplifies to minus mu X mu Y.
01:17:42.000 --> 01:17:49.000
So this is the expectation of X and Y minus the product of the two expectations, as I have
01:17:49.000 --> 01:17:59.000
claimed. Is there a similar property for the variance? Yes, of course, we have discussed
01:17:59.000 --> 01:18:06.000
that already since the variance is just a special case of the covariance. So this property we know
01:18:06.000 --> 01:18:14.000
already, but we may be interested also in knowing what is the variance of X plus Y if we add two
01:18:14.000 --> 01:18:22.000
random variables. Well, then we can compute this using the concept of covariance. That is the
01:18:22.000 --> 01:18:35.000
expectation of X plus Y squared minus the expected values squared using the same rule that we have
01:18:35.000 --> 01:18:43.000
established previously. So the expectation of X plus Y squared minus the square of the sum
01:18:43.000 --> 01:18:51.000
of the two expectations. X plus Y squared is X squared plus two X Y plus
01:18:51.000 --> 01:18:58.000
Y squared, and we have to take the expectation of this minus, now we can just multiply this thing
01:18:58.000 --> 01:19:08.000
here as minus mu X squared minus two mu X mu Y minus mu Y squared. So this here is just minus
01:19:09.000 --> 01:19:16.000
this term in parentheses squared. Well, then we see, of course, here is E of X squared
01:19:18.000 --> 01:19:24.000
minus mu X squared, so minus the expected value squared. So this is the variance of X.
01:19:24.000 --> 01:19:32.000
Then we have the expectation of Y squared minus the expectation, minus the squared value of the
01:19:32.000 --> 01:19:40.000
expectation, mu Y squared. This is the variance of Y. And then we have two times the expectation of X,
01:19:40.000 --> 01:19:47.000
Y minus the product of the expectation of Y with the expectation of X. So that is two times
01:19:47.000 --> 01:19:54.000
the covariance. So the variance of the sum of two random variables is V of X plus two times the
01:19:54.000 --> 01:20:08.000
covariance of X and Y plus V of Y. Okay. The next thing I have to present to you is the
01:20:09.000 --> 01:20:16.000
well-known Cauchy-Schwarz inequality. And since we are running out of time, I will just state this
01:20:16.000 --> 01:20:24.000
Cauchy-Schwarz inequality here to more or less sum up what we've done so far, but I will present
01:20:24.000 --> 01:20:32.000
the proof only in the next lecture. So the Cauchy-Schwarz inequality relates the size of the covariance
01:20:32.000 --> 01:20:41.000
between two random variables to the variances of the two random variables. This is a very general
01:20:41.000 --> 01:20:48.000
inequality. Let X and Y be any random variables. Then it is always true that the covariance of
01:20:48.000 --> 01:20:58.000
X and Y squared is less than or equal to the product of the variance of X with the variance of Y.
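Both the variance-of-a-sum formula and the Cauchy-Schwarz bound can be checked on a small joint distribution. This is a numerical sketch; the joint probabilities are illustrative, not from the lecture:

```python
# Joint pmf over pairs (x, y); probabilities chosen only for illustration.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

mu_x = sum(p * x for (x, y), p in joint.items())
mu_y = sum(p * y for (x, y), p in joint.items())
var_x = sum(p * (x - mu_x) ** 2 for (x, y), p in joint.items())
var_y = sum(p * (y - mu_y) ** 2 for (x, y), p in joint.items())
cov = sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in joint.items())

# V(X + Y) computed directly from the distribution of the sum.
mu_s = sum(p * (x + y) for (x, y), p in joint.items())
var_s = sum(p * (x + y - mu_s) ** 2 for (x, y), p in joint.items())
assert abs(var_s - (var_x + 2 * cov + var_y)) < 1e-12

# Cauchy-Schwarz: Cov(X, Y)^2 <= V(X) * V(Y).
assert cov ** 2 <= var_x * var_y
```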
01:20:59.000 --> 01:21:06.000
So we always have an upper bound for the covariance of two variables. You may equally well take this
01:21:06.000 --> 01:21:14.000
inequality here and take the square root of both sides. You would find that the absolute value of the covariance of X and Y
01:21:14.000 --> 01:21:22.000
is always less than or equal to the product of the two standard deviations of X and Y.
01:21:23.000 --> 01:21:28.000
And the proof, as I said, I will present in the next lecture because this is slightly more
01:21:28.000 --> 01:21:34.000
complicated than the proofs we've gone through so far. For the last two or three minutes,
01:21:34.000 --> 01:21:41.000
which we have, I just would like to ask you if there are any more questions concerning the
01:21:41.000 --> 01:21:47.000
material I have covered in the lecture so far. If so, please wave to me. Raise your hand.
01:21:48.000 --> 01:21:56.000
And if somebody raises his or her hand in the next seconds, then please send me a question.
01:21:57.000 --> 01:22:06.000
If I do not see anybody waving, as seems to be the case, then I infer that no questions are
01:22:07.000 --> 01:22:14.000
pending. Yes, it seems to be like this. So I hope that everything,
01:22:14.000 --> 01:22:16.000
somebody's coming here on the chat.
01:22:16.000 --> 01:22:26.000
Oh, very good. Yes, I would have forgotten. Thank you very much. You wanted to remind me to do a
01:22:26.000 --> 01:22:33.000
vote on the 15 minutes earlier start or finish. That's true. Actually, I'm not sure if my idea
01:22:33.000 --> 01:22:38.000
of vote was a very good vote because I received I think at least two messages which said that there
01:22:38.000 --> 01:22:45.000
is another lecture which finishes just at four o'clock sharp. And this implies actually that we
01:22:45.000 --> 01:22:53.000
cannot start at four o'clock sharp. These students need to have at least some minutes of rest.
01:22:54.000 --> 01:23:01.000
So we cannot actually by majority vote decide that they won't have their rest or would have to come
01:23:01.000 --> 01:23:08.000
late to the lecture. And given that some of you are affected by a lecture which lasts till
01:23:09.000 --> 01:23:15.000
four o'clock sharp, we can only start at a quarter past. Sorry, we cannot have a vote on that.
01:23:16.000 --> 01:23:22.000
Second, is there also a chance that we could vote on an interactive lecture like our other lectures?
01:23:24.000 --> 01:23:27.000
I'm not quite sure how you mean this, but let me read the rest of that.
01:23:27.000 --> 01:23:31.000
Without all the other students, it feels so empty in here. I would definitely prefer to have a more
01:23:31.000 --> 01:23:38.000
qualitative interactive lecture with a community feeling than a recorded one. Yes, indeed, I would
01:23:38.000 --> 01:23:44.000
very much appreciate that, but I don't really see how we can do this. I am required to use
01:23:44.000 --> 01:23:53.000
the webinar format and the university seems to have restricted the use of Zoom webinars such
01:23:53.000 --> 01:24:00.000
that no pictures of students are shown to anybody else. I don't see you. You don't see each other.
01:24:00.000 --> 01:24:06.000
You just see me, but I have no idea how you look like. I would very much like to have direct
01:24:06.000 --> 01:24:12.000
communication with you, but that is not permitted by the university. And I'm quite surprised that
01:24:12.000 --> 01:24:22.000
you say that there are other interactive lectures which apparently use more communication and
01:24:22.000 --> 01:24:26.000
interaction between students and professor. Perhaps you could inform me how this works in
01:24:26.000 --> 01:24:34.000
other lectures and what colleagues do there. And then there is somebody asking if the Zoom link
01:24:34.000 --> 01:24:39.000
is permanent. Yes, indeed. The Zoom link should be permanent. I hope that it works that way. Sorry,
01:24:41.000 --> 01:24:48.000
this is the first time I'm doing it this way. And last semester, I was asked to do it in asynchronous
01:24:49.000 --> 01:24:55.000
video format. I suppose the Zoom link is permanent. If it does not work for future
01:24:55.000 --> 01:25:05.000
sessions, then we probably realize that. Oh, yes, good. The one participant here
01:25:07.000 --> 01:25:13.000
says that other professors don't record the lecture. And then this allows you to turn,
01:25:13.000 --> 01:25:21.000
or would allow me to turn on cameras and microphones. And this is true, but I'm not sure if this is
01:25:21.000 --> 01:25:31.000
wanted. Yes, the same comment again. Other professors don't record. Perhaps you can also
01:25:31.000 --> 01:25:40.000
tell me your opinion on that. I think actually it is an advantage to record the lecture
01:25:42.000 --> 01:25:49.000
because you can then re-listen to what I have said, or can catch up on the lecture if it takes place at
01:23:49.000 --> 01:23:59.000
a point in time when you don't find the time to come, either once or always. So that was why
01:25:59.000 --> 01:26:04.000
I think that given that the lecture is digital, it is perhaps better to have the recording.
01:26:04.000 --> 01:26:11.000
But if there is a broad consensus that you would prefer to have the lecture not recorded and rather
01:26:11.000 --> 01:26:17.000
see each other in more active format, then I would not insist on the recording. I mean,
01:26:17.000 --> 01:26:22.000
this is service to you, the recording, and nothing which I particularly need to have.
01:26:26.000 --> 01:26:35.000
Now somebody else comes up with the idea of a middle part where we may interact. I'm not sure if that is good.
01:26:39.000 --> 01:26:43.000
Let me think about that, but this costs extra time, and I'm short on time in this lecture anyway.
01:26:43.000 --> 01:26:49.000
So I would not spontaneously subscribe to this idea.
01:26:53.000 --> 01:26:57.000
Now somebody else comes up and says, no, please keep recording your lectures.
01:27:03.000 --> 01:27:11.000
Perhaps I will think about the suggestion of having some interactive part at the end of
01:27:12.000 --> 01:27:22.000
lecture. If you perhaps could also think how valuable you find the recording of the lecture,
01:27:22.000 --> 01:27:29.000
and perhaps also discuss this with other participants, because part of the problem is
01:27:29.000 --> 01:27:37.000
that currently there's only a share of the registered students in this lecture.
01:27:38.000 --> 01:27:44.000
If I recall correctly, yes, we have 35 participants currently, and there are 15 participants
01:27:44.000 --> 01:27:50.000
which are not attending the lecture. You may say that's their own fault, but perhaps they also
01:27:50.000 --> 01:27:54.000
have reasons why they cannot attend this lecture. Perhaps they attend some other lecture, which is
01:27:54.000 --> 01:28:01.000
at the same time or something like this. So it is a little difficult to decide by some type of vote
01:28:01.000 --> 01:28:11.000
now among only those people who have the time to attend the lecture Thursdays 4 to 6, without having
01:28:11.000 --> 01:28:23.000
the opinion of others. I think what I will do is that I send to all participants an email in which
01:28:23.000 --> 01:28:30.000
I ask them whether they would prefer to have a completely interactive, not recorded lecture.
01:28:31.000 --> 01:28:39.000
And then I would like everybody of you to reply to this email, please via the
01:28:39.000 --> 01:28:47.000
steener reply function so that I have all the replies in the same folder. And we see how the
01:28:47.000 --> 01:28:56.000
feeling is among you and those participants who are not currently in the lecture. And I will think
01:28:56.000 --> 01:29:04.000
then again on whether we could, if a sizable part of the audience prefers to have recorded lectures,
01:29:07.000 --> 01:29:14.000
if we could have some interactive time slot within the lecture, an idea to which I'm not
01:29:14.000 --> 01:29:20.000
too sympathetic currently, but I will give it a thought. And so for the time being, please
01:29:20.000 --> 01:29:26.000
just reply to the email which I will send you soon via the steener on whether the recording
01:29:26.000 --> 01:29:35.000
is of purpose to you or whether it is not. And then we can talk about this in the next lecture
01:29:35.000 --> 01:29:47.000
on Tuesday. There are a number of comments coming in. Let me just scan through some of them.
01:29:49.000 --> 01:29:53.000
Professor Kifman uses such a procedure as described above. Now I don't know who that
01:29:53.000 --> 01:29:59.000
is who described something above. What he does is pause, ask questions with the poll option,
01:29:59.000 --> 01:30:07.000
then returns to the recording. Another comment, regarding recording: the lecture can be interactive
01:30:07.000 --> 01:30:13.000
with the recording. Yes, in principle it can, but then again I have to make sure after each lecture
01:30:13.000 --> 01:30:19.000
that all participants agree with having this thing recorded. If just one says he does not agree,
01:30:20.000 --> 01:30:24.000
then I cannot use the recording. So this is basically impractical for
01:30:26.000 --> 01:30:34.000
data protection reasons, unfortunately. Here is again the suggestion at the end of the lecture,
01:30:34.000 --> 01:30:41.000
perhaps some discussion. This I will think over. Somebody would like to have recordings;
01:30:42.000 --> 01:30:46.000
somebody asked whether I have decided when to start. Yes, I said this already.
01:30:46.000 --> 01:30:54.000
We start at 4:15 now since some people have difficulties with other lectures. Somebody else
01:30:54.000 --> 01:31:00.000
who says that recordings are important and some other suggestion that perhaps we have
01:31:00.000 --> 01:31:05.000
just a tiny little bit of non-recording and interaction. And I will think about that.
01:31:05.000 --> 01:31:13.000
Okay, I think this is currently all that I have received in terms of messages. Thank you very much
01:31:13.000 --> 01:31:21.000
for your attention so far, and hopefully we see each other, or at least you see me and I perhaps
01:31:21.000 --> 01:31:37.000
read from you on Tuesday. I wish you a good rest of the week and a good weekend and then until Tuesday.