<ANOVA TEST>
ANOVA is an abbreviation of 'analysis of variance.' I can explain it most easily by comparing it with the t-test. We use a t-test to decide whether to reject the null hypothesis in favor of the alternative hypothesis, and normally it deals with two groups that are the subject of comparison. When there are more than two groups, running several consecutive t-tests is inappropriate: every additional test adds another chance of a false positive, so the overall type 1 error rate rises above the chosen significance level. So what if we intend to compare more than two groups? That is the point where the ANOVA test comes into its own.
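To get a feel for how fast the error rate grows, the sketch below computes the chance of at least one false positive when every pair of groups is tested at the 5% level. It assumes the tests are independent, which pairwise tests on shared groups are not, so the numbers are only a rough illustration. `familywise_error` is a hypothetical helper name, not a built-in R function.

```r
# Family-wise error rate for m tests at level alpha,
# under the simplifying assumption that the tests are independent.
familywise_error <- function(alpha, m) {
  1 - (1 - alpha)^m
}

groups  <- 2:6
n_tests <- choose(groups, 2)              # number of pairwise comparisons
fwer    <- familywise_error(0.05, n_tests)

data.frame(groups = groups, pairwise_tests = n_tests, fwer = round(fwer, 3))
```

With three groups there are already three pairwise tests, and under this assumption the error rate is roughly 0.14 instead of 0.05.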
| | T-TEST | ANOVA TEST |
|---|---|---|
| Objective | Reject the null hypothesis (accept the alternative hypothesis). | Reject the null hypothesis (accept the alternative hypothesis). |
| Compares | The means of two groups. | The means of more than two groups (multiple groups). |
| H0 | M1 = M2 | M1 = M2 = M3 = ... = Mn |
| H1 | M1 != M2 | At least two of the population means are unequal. |
| Assumptions | Sampling distributions are normally distributed. Equal population variances (Student's t-test) or different population variances (Welch's t-test). Independence of observations. | The dependent variable is normally distributed within each group. Equal population variances. Independence of observations. |
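The two tests compared above coincide when there are only two groups: the F value of a one-way ANOVA equals the square of the Student's t statistic, and the p-values match. The sketch below checks this on simulated data. The data frame `scores` and its values are made up for illustration.

```r
set.seed(1)  # simulated data, for illustration only
scores <- data.frame(
  group = factor(rep(c("g1", "g2"), each = 20)),
  value = c(rnorm(20, mean = 50, sd = 5), rnorm(20, mean = 54, sd = 5))
)

# Student's t-test (equal variances assumed) and one-way ANOVA on the same data
t_res <- t.test(value ~ group, data = scores, var.equal = TRUE)
f_res <- summary(aov(value ~ group, data = scores))[[1]]

t_stat <- unname(t_res$statistic)
f_stat <- f_res[["F value"]][1]

c(t_squared = t_stat^2, F = f_stat)                  # same statistic
c(p_t = t_res$p.value, p_F = f_res[["Pr(>F)"]][1])   # same p-value
```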
<Types of ANOVA TEST>
There are three types of ANOVA test, divided by 'the number of independent factors.' An independent factor here means a factor that affects the value to be measured, and the measured value itself is the dependent factor (the dependent variable). That is, if we want to compare groups divided by a single factor, we use ONE-WAY ANOVA. For example, suppose we want to compare the scores of multiple groups of students divided by studying time. In this case, the score of the students is the dependent variable and the studying time is the independent factor. When we are dealing with two independent factors, such as studying time and studying place, we use TWO-WAY ANOVA, and for more independent factors we use N-WAY ANOVA.
| | ONE-WAY ANOVA | TWO-WAY ANOVA | N-WAY ANOVA |
|---|---|---|---|
| Independent factors | One independent factor with more than two factor levels. | Two independent factors. | Multiple independent factors. |
| Example independent factors | Studying time, with factor levels 3 hours < t < 5 hours, 5 hours < t < 10 hours, 10 hours < t < 15 hours. | Studying time, sleeping time. | Studying time, place, sleeping time. |
| Example dependent variable | GPA | GPA | |

Another one-way example: to compare the studying time of undergraduate students across 3 different universities, the university is the independent factor (three levels) and the studying time is the dependent variable.
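In R the difference between the one-way and two-way cases shows up only in the model formula passed to `aov()`. The sketch below uses simulated data: the data frame `students`, its column names and the GPA values are hypothetical and only mirror the examples above.

```r
set.seed(42)  # simulated, hypothetical data
students <- data.frame(
  study_time = factor(rep(c("3-5h", "5-10h", "10-15h"), each = 20)),
  sleep_time = factor(rep(c("under_6h", "over_6h"), times = 30))
)
students$gpa <- round(rnorm(nrow(students), mean = 3.0, sd = 0.4), 2)

# One-way: a single independent factor
one_way <- aov(gpa ~ study_time, data = students)

# Two-way: two independent factors (main effects only);
# study_time * sleep_time would also fit their interaction
two_way <- aov(gpa ~ study_time + sleep_time, data = students)

summary(one_way)
summary(two_way)
```

An N-way model follows the same pattern, with one more term in the formula per additional factor.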
<TEST DESIGN>
1. Independent individuals --> randomly split them into smaller groups --> put each group under a different condition (independent factor) --> measure the outcome.
2. Independent individuals --> divide them by attribute independent variables --> measure the outcome.
* Attribute independent variables: attributes that the individuals already possess, which are not assigned artificially.
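The random split in design 1 can be sketched with `sample()`. The subject ids and condition names below are hypothetical placeholders.

```r
set.seed(123)  # fixed seed so the split is reproducible
individuals <- paste0("id", 1:12)                  # hypothetical subjects
conditions  <- c("condition_A", "condition_B", "condition_C")

# Build equal-sized condition labels, then shuffle them across individuals
labels <- rep(conditions, length.out = length(individuals))
assignment <- data.frame(
  id        = individuals,
  condition = sample(labels)
)

table(assignment$condition)   # group sizes after the random split
```

In design 2 no such step exists: the grouping column comes from an attribute the individuals already have.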
<ANOVA MODEL>
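A common textbook way to write the one-way ANOVA model is shown below. The symbols are assumptions chosen for this sketch, not notation taken from these notes: Y_ij is the j-th observation in group i, mu is the overall mean, alpha_i is the effect of group i, and epsilon_ij is the random error.

```latex
Y_{ij} = \mu + \alpha_i + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^2),
\qquad i = 1, \dots, n, \quad j = 1, \dots, n_i
```

In this notation the group mean is M_i = mu + alpha_i, so the null hypothesis M1 = M2 = ... = Mn is the same as alpha_1 = ... = alpha_n = 0. The normal errors with a single variance sigma^2 encode the normality and equal-variance assumptions.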
<R>
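> library(MASS)  # the painters data set ships with the MASS package
> table(painters$School)

 A  B  C  D  E  F  G  H
10  6  6 10  7  4  7  4
> # keep only schools A-D and the two columns needed for the test
> painters2 <- painters[painters$School %in% c("A","B","C","D"), c("Drawing","School")]
> painters2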
                Drawing School
Da Udine              8      A
Da Vinci             16      A
Del Piombo           13      A
Del Sarto            16      A
Fr. Penni            15      A
Guilio Romano        16      A
Michelangelo         17      A
Perino del Vaga      16      A
Perugino             12      A
Raphael              18      A
F. Zucarro           13      B
Fr. Salviata         15      B
Parmigiano           15      B
Primaticcio          14      B
T. Zucarro           14      B
Volterra             15      B
Barocci              15      C
Cortona              14      C
Josepin              10      C
L. Jordaens          12      C
Testa                15      C
Vanius               15      C
Bassano               8      D
Bellini               6      D
Giorgione             9      D
Murillo               8      D
Palma Giovane         9      D
Palma Vecchio         6      D
Pordenone            14      D
Tintoretto           14      D
Titian               15      D
Veronese             10      D
> aov(Drawing~School,data=painters2)
Call:
   aov(formula = Drawing ~ School, data = painters2)

Terms:
                  School Residuals
Sum of Squares  136.8854  201.8333
Deg. of Freedom        3        28

Residual standard error: 2.684834
Estimated effects may be unbalanced
> summary(aov(Drawing~School,data=painters2))
            Df Sum Sq Mean Sq F value  Pr(>F)
School       3  136.9   45.63    6.33 0.00205 **
Residuals   28  201.8    7.21
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
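The F value and p-value in the summary table can be recomputed by hand from the sums of squares and degrees of freedom that `aov()` printed. This is a sketch for checking the arithmetic. The variable names are my own and the input numbers are copied from the output above.

```r
# Values copied from the aov() output
ss_school <- 136.8854; df_school <- 3
ss_resid  <- 201.8333; df_resid  <- 28

ms_school <- ss_school / df_school     # mean square between schools
ms_resid  <- ss_resid / df_resid       # mean square within schools
f_value   <- ms_school / ms_resid      # F = MS between / MS within

# Upper tail of the F distribution gives Pr(>F)
p_value <- pf(f_value, df_school, df_resid, lower.tail = FALSE)

round(c(ms_school = ms_school, ms_resid = ms_resid, F = f_value, p = p_value), 5)
sqrt(ms_resid)                         # residual standard error
```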
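> bartlett.test(Drawing~School,data=painters2)

	Bartlett test of homogeneity of variances

data:  Drawing by School
Bartlett's K-squared = 8.4701, df = 3, p-value = 0.03723
--> p-value < 0.05: reject the null hypothesis of equal variances across schools (accept the alternative hypothesis), so the equal-variance assumption of the ANOVA is doubtful for these data.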
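When the equal-variance assumption is in doubt, one option is Welch's one-way test, which base R provides as `oneway.test()` with `var.equal = FALSE`. This is a follow-up suggestion rather than part of the original analysis. The sketch rebuilds `painters2` so it runs on its own.

```r
library(MASS)  # painters data set

# Same subset as before; droplevels() removes the unused schools E-H
painters2 <- droplevels(
  painters[painters$School %in% c("A", "B", "C", "D"), c("Drawing", "School")]
)

# Welch's one-way test does not assume equal variances across schools
welch <- oneway.test(Drawing ~ School, data = painters2, var.equal = FALSE)
welch
```

If this test also gives a small p-value, the conclusion that the schools differ in mean Drawing score does not depend on the equal-variance assumption.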