Before we understand this concept, we need to explain what a transformation is.
So, a transformation is a way in which we can alter every element of a distribution
to get a new distribution with similar characteristics.
For Normal Distributions we can add or subtract a constant, or multiply or divide by a non-zero
constant, without changing the type of the distribution.
For instance, if we add a constant to every element of a Normal distribution, the new
distribution would still be Normal.
Let’s discuss the four algebraic operations and see how each one affects the graph.
If we add a constant, like 3, to every element of the distribution, then we simply need to move
the graph 3 units to the right.
Similarly, if we subtract a number from every element, we would simply move our current
graph to the left to get the new one.
If we multiply every element by a constant greater than 1, the graph will widen by that factor,
and if we divide every element by such a number, the graph will shrink.
However, if we multiply or divide by a number between 0 and 1, the opposite effects will
occur.
For example, dividing by a half is the same as multiplying by 2, so the graph would expand,
even though we are dividing.
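To make the four operations concrete, here is a small sketch using `NormalDist` from Python's standard `statistics` module. The starting mean of 10 and standard deviation of 2 are made-up numbers chosen only for illustration. Notice that multiplying or dividing rescales the mean as well as the spread, so the centre of the graph moves too.

```python
from statistics import NormalDist

# Hypothetical starting distribution: mean 10, standard deviation 2.
original = NormalDist(mu=10, sigma=2)

shifted_right = original + 3      # adding a constant moves the graph right
shifted_left = original - 3       # subtracting moves it left
widened = original * 2            # multiplying by 2 doubles the spread
narrowed = original / 2           # dividing by 2 halves the spread
divided_by_half = original / 0.5  # dividing by 0.5 acts like multiplying by 2

results = {
    "original": original,
    "add 3": shifted_right,
    "subtract 3": shifted_left,
    "multiply by 2": widened,
    "divide by 2": narrowed,
    "divide by 0.5": divided_by_half,
}

for label, dist in results.items():
    print(f"{label:>14}: mean={dist.mean}, stdev={dist.stdev}")
```

Every result is still a `NormalDist` object, which mirrors the point that these operations do not change the type of the distribution.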
Alright!
Now that you know what a transformation is, we can explain standardizing.
Standardizing is a special kind of transformation in which we make the expected value equal
to 0 and the variance equal to 1.
The benefit of doing so is that we can then use the cumulative distribution table from
last lecture on any element in the set.
The distribution we get after standardizing any Normal distribution is called a “Standard
Normal Distribution”.
In addition to the “68, 95, 99.7” rule, there exists a table which summarizes the
most commonly used values for the CDF of a Standard Normal Distribution.
This table is known as the Standard Normal Distribution table or the “Z-score” table.
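If we do not have the printed table at hand, the same values can be computed directly. The sketch below assumes Python's standard `statistics.NormalDist`, whose default parameters are mean 0 and standard deviation 1. The helper name `share_within` is our own and is not part of any library.

```python
from statistics import NormalDist

# With no arguments, NormalDist() is the Standard Normal Distribution.
standard_normal = NormalDist()


def share_within(k):
    """Probability that a standard normal value lies within k standard deviations of 0."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# The "68, 95, 99.7" rule, recomputed from the CDF.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")

# One table-style lookup: the probability that Z is at most 1.96.
lookup = standard_normal.cdf(1.96)
print(f"P(Z <= 1.96) = {lookup:.4f}")
```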
Okay!
So far, we have learned what standardizing is and why it is convenient.
What we haven’t talked about is how to do it.
First, we wish to move the graph either to the left, or to the right until its mean equals
0.
The way we would do that is by subtracting the mean “mu” from every element.
After this, to make the standardization complete, we need to make sure the standard deviation
is 1.
To do so, we would have to divide every element of the newly obtained distribution by the
value of the standard deviation, sigma.
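The two steps can be sketched on a small data set. The five values below are invented for illustration, and we treat them as the whole population, which is why the sketch uses `pstdev`. With a sample, the sample standard deviation `stdev` might be the better choice.

```python
from statistics import fmean, pstdev

# Hypothetical data, made up for this example.
values = [62.0, 70.0, 71.0, 75.0, 82.0]

mu = fmean(values)          # the mean, "mu"
sigma = pstdev(values, mu)  # the population standard deviation, "sigma"

# Step 1: subtract the mean from every element, so the new mean is 0.
centered = [v - mu for v in values]

# Step 2: divide every element by sigma, so the new standard deviation is 1.
standardized = [c / sigma for c in centered]

print("mean after step 1:", fmean(centered))
print("mean after step 2:", fmean(standardized))
print("standard deviation after step 2:", pstdev(standardized))
```

Because of floating-point rounding, the printed mean and standard deviation may differ from 0 and 1 by a tiny amount.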
If we denote the standardized variable with Z, then for any normally distributed
variable Y with mean mu and standard deviation sigma, “Z equals Y minus mu, over sigma”.
This equation expresses the transformation we use when standardizing.
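Written out in symbols, the transformation and a short check that it gives the mean and variance we want look like this. The notation N(mu, sigma squared) for a Normal distribution is our own shorthand here.

```latex
\[
Z = \frac{Y - \mu}{\sigma}, \qquad Y \sim N(\mu, \sigma^2)
\]
\[
E[Z] = \frac{E[Y] - \mu}{\sigma} = \frac{\mu - \mu}{\sigma} = 0,
\qquad
\mathrm{Var}(Z) = \frac{\mathrm{Var}(Y)}{\sigma^2} = \frac{\sigma^2}{\sigma^2} = 1
\]
```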
Amazing!
Applying this single transformation for any Normal Distribution would result in a Standard
Normal Distribution, which is convenient.
Essentially, every element of the non-standardized distribution is represented in the new distribution
by the number of standard deviations it is away from the mean.
For instance, if some value y is 2.3 standard deviations above the mean, its equivalent
value “Z” would be equal to 2.3, and a value 2.3 standard deviations below the mean would get minus 2.3.
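Here is a minimal sketch of that idea for a single value. The mean of 100, the standard deviation of 15 and the value 134.5 are hypothetical numbers picked so that the result is easy to read. The sketch also checks that the cumulative probability of y in the original distribution matches the cumulative probability of its Z value in the Standard Normal Distribution, which is what lets us use one table for every Normal distribution.

```python
from statistics import NormalDist

# Hypothetical parameters and value, chosen for illustration only.
mu, sigma = 100.0, 15.0
y = 134.5

# Number of standard deviations that y lies above the mean.
z = (y - mu) / sigma

# Cumulative probability computed in the original distribution...
p_original = NormalDist(mu, sigma).cdf(y)
# ...and the same probability looked up in the standard normal.
p_standard = NormalDist().cdf(z)

print("z =", z)
print("P(Y <= y) =", p_original)
print("P(Z <= z) =", p_standard)
```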
Standardizing is incredibly useful when we have a Normal Distribution. However, we cannot
always assume that the data is distributed that way.
A crucial fact to remember about the Normal distribution is that assuming it requires a lot of
data.
If our sample is limited, we run the risk of outliers drastically affecting our analysis.
In cases where we have fewer than 30 entries, we usually avoid assuming a Normal distribution.
However, there exists a small sample size approximation of a Normal distribution called
the Student’s t distribution, and we are going to focus on it in our next lecture.
Thanks for watching.