Let's say you want to find out
if the beverage that people drink
affects their reaction time.
So you set up an experiment with three groups of people.
The first group gets water to drink.
The second group gets some sugary fruit juice,
and the third group gets coffee.
Now you test everyone's reaction time.
And you want to know if there's any difference
in reaction time between the groups.
The null hypothesis says that the mean reaction time
for all three groups is the same.
If there were only two groups,
you could use a t-test to find out
if there's a difference between them.
But when you have three groups or more,
you need to use a different approach--
the analysis of variance.
When you do the experiment,
the scores won't all be the same.
The total variation of all the scores
is made up of two parts:
The variation within each group,
because the people in each group
have different reaction times,
and the variation between the groups,
because the drinks you gave each group are different.
Here's an example.
Look at this set of scores.
They've been sorted into order
to make it easier to see the patterns.
You can see that there's a lot of variation
within each group;
some people are faster and some are much slower.
But all the groups look pretty much alike;
there's not much variation between the groups.
In this case, you'd say that most of the difference
is due to the people, and the drink
didn't make much of a difference.
You can't reject the null hypothesis,
which is that the type of drink doesn't have
any effect on reaction time.
Now let's look at a different set of numbers.
In this case, all the scores within each group
are very close to one another.
There's not a lot of variance within each group,
but the groups are very different from one another.
There's a lot of difference between the groups.
Here you would reject the null hypothesis;
the type of drink makes a big difference.
So here's the idea behind analysis of variance:
Figure out how much of the total variance comes from between-groups variance
and how much comes from within-groups variance.
Take the ratio of between-groups
to within-groups variance,
and the larger this number is,
the more likely it is
that the means of the groups really *are* different,
and that you should reject the null hypothesis.
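As a minimal sketch of that ratio, here is one way it could be computed. The reaction times below are made up for illustration (they are not the numbers shown in the video), and the name f_ratio is hypothetical. Each "variance" is a sum of squared deviations divided by its degrees of freedom.

```python
from statistics import mean

def f_ratio(groups):
    """Return the ratio of between-groups to within-groups variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)            # number of groups
    n_total = len(all_scores)  # total number of scores

    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far each score sits from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    # Divide each sum of squares by its degrees of freedom to get a variance.
    var_between = ss_between / (k - 1)
    var_within = ss_within / (n_total - k)
    return var_between / var_within

# Hypothetical reaction times in milliseconds, five people per drink.
water = [300, 310, 320, 330, 340]
juice = [310, 320, 330, 340, 350]
coffee = [280, 290, 300, 310, 320]

f = f_ratio([water, juice, coffee])
print(round(f, 2))
```

A larger value means the group means are spread out more than the scatter inside each group would explain.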
In the examples, it was obvious where the variance was.
Now look at these numbers.
You probably can't tell
if there's a significant effect
because it's not clear whether there's
more variance within groups or between groups,
or how much.
The calculations show that the ratio is 4.27,
which would happen with a probability of only .04 if the null hypothesis were true,
so in this case, you can reject the null hypothesis.
With these numbers, the drink you give the people
does have an effect on their reaction time.
What's that "2,12" doing there?
Those are the degrees of freedom
for variance between groups
and variance within groups.
And here's how you calculate the degrees of freedom
when you report results for analysis of variance.
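The usual rule can be sketched as follows: the between-groups value is the number of groups minus one, and the within-groups value is the total number of scores minus the number of groups. The total of 15 scores is an assumption, inferred from "2,12" with three groups; the transcript does not state the group sizes. The probability check uses a closed form that holds only when the between-groups degrees of freedom equal 2, and both function names are hypothetical.

```python
def degrees_of_freedom(num_groups, num_scores):
    """Return (between-groups df, within-groups df) for a one-way analysis."""
    df_between = num_groups - 1
    df_within = num_scores - num_groups
    return df_between, df_within

def p_value_for_two_between_df(f, df_within):
    """Probability of a ratio at least this large under the null hypothesis.

    Closed form for the F distribution, valid ONLY when df_between == 2.
    """
    return (1 + 2 * f / df_within) ** (-df_within / 2)

# Assumed: 3 groups and 15 scores in total (5 per group).
df_between, df_within = degrees_of_freedom(3, 15)
p = p_value_for_two_between_df(4.27, df_within)
print(df_between, df_within, round(p, 2))  # prints: 2 12 0.04
```

For other numbers of groups, the probability would normally come from a statistics library or an F table rather than this shortcut.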
This trick of separating the variance
works not only when you have three or more groups;
it also works when you have multiple variables.
For example, if you test three groups
for reaction time in the morning,
and you test another three groups in the evening,
an analysis of variance can tell you
if there's a significant effect
for the type of drink,
or if the time of day makes a difference,
or if there's some interaction.
For example, coffee might be more effective in the morning than in the evening.
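A rough sketch of how the variance could be split three ways (drink, time of day, and their interaction) is shown here. It assumes a balanced design, meaning every drink-and-time combination has the same number of people. The scores are invented for illustration and the function name is hypothetical.

```python
from statistics import mean

def two_way_f_ratios(cells):
    """cells maps (drink, time) -> list of scores; all lists must be the same length."""
    drinks = sorted({drink for drink, _ in cells})
    times = sorted({time for _, time in cells})
    n = len(next(iter(cells.values())))  # scores per cell (balanced design assumed)

    grand = mean(x for scores in cells.values() for x in scores)
    drink_mean = {d: mean(x for t in times for x in cells[(d, t)]) for d in drinks}
    time_mean = {t: mean(x for d in drinks for x in cells[(d, t)]) for t in times}
    cell_mean = {key: mean(scores) for key, scores in cells.items()}

    # Sums of squares for each source of variation.
    ss_drink = n * len(times) * sum((drink_mean[d] - grand) ** 2 for d in drinks)
    ss_time = n * len(drinks) * sum((time_mean[t] - grand) ** 2 for t in times)
    ss_inter = n * sum(
        (cell_mean[(d, t)] - drink_mean[d] - time_mean[t] + grand) ** 2
        for d in drinks for t in times
    )
    ss_within = sum(
        (x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores
    )

    # Degrees of freedom for each source.
    df_drink = len(drinks) - 1
    df_time = len(times) - 1
    df_inter = df_drink * df_time
    df_within = len(drinks) * len(times) * (n - 1)

    var_within = ss_within / df_within
    return {
        "drink": (ss_drink / df_drink) / var_within,
        "time": (ss_time / df_time) / var_within,
        "interaction": (ss_inter / df_inter) / var_within,
    }

# Hypothetical reaction times in milliseconds, two people per cell.
scores = {
    ("water", "morning"): [320, 330],
    ("water", "evening"): [320, 330],
    ("juice", "morning"): [330, 340],
    ("juice", "evening"): [330, 340],
    ("coffee", "morning"): [280, 290],
    ("coffee", "evening"): [310, 320],
}

ratios = two_way_f_ratios(scores)
print(ratios)
```

Each ratio compares one source of variation against the same within-groups variance, so the three questions (drink, time of day, interaction) can be judged separately.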
So to recap, here's the main idea of analysis of variance:
You figure how much of the total variance
comes from between the groups,
and how much comes from within the groups.
If most of the variation is between groups,
there's probably a significant effect;
if most of the variation is within groups,
there's probably not a significant effect.