(R) ANOVA TEST

by jangpiano 2020. 11. 7.

<ANOVA TEST> 


ANOVA is short for 'analysis of variance.' It is easiest to explain by comparing it with the t-test. A t-test is used to decide whether to reject the null hypothesis in favor of the alternative hypothesis, and it normally deals with two groups. When more than two groups are compared through several consecutive pairwise t-tests, the overall chance of a Type I error (at least one false rejection) grows well beyond the significance level of a single test. A series of t-tests is therefore inappropriate for comparing more than two groups. So what if we want to compare more than two groups? That is where the ANOVA test comes into its own.
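To make the inflation concrete, here is a minimal sketch. It assumes every pairwise t-test is run at the same significance level and that the tests are independent, which is a simplification (pairwise tests that share groups are correlated). The helper name `familywise_error` is made up for this illustration.

```r
# Chance of at least one false rejection across m tests,
# assuming (as a simplification) that the tests are independent.
familywise_error <- function(m, alpha = 0.05) {
  1 - (1 - alpha)^m
}

# Comparing k groups pairwise needs choose(k, 2) t-tests.
k <- 3:6
n_tests <- choose(k, 2)
data.frame(groups = k,
           t_tests = n_tests,
           familywise = round(familywise_error(n_tests), 3))
```

Under these assumptions, three groups already need three pairwise tests, and the chance of at least one false rejection is about 0.143 instead of 0.05.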

T-TEST

  Objective: reject the null hypothesis (accept the alternative hypothesis).

  Compares the means of two groups.

  H0: M1 = M2
  H1: M1 != M2

  Assumptions: - sampling distributions are normally distributed.
               - same population variances (Student's t-test) or
                 different population variances (Welch's t-test).
               - independence of observations.

ANOVA TEST

  Objective: reject the null hypothesis (accept the alternative hypothesis).

  Compares the means of more than two groups (multiple groups).

  H0: M1 = M2 = M3 = ... = Mn
  H1: at least two of the population means are unequal.

  Assumptions: - the dependent factor is normally distributed within each group.
               - same population variances.
               - independence of observations.
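When there are exactly two groups, a one-way ANOVA and Student's t-test (equal variances) are the same test: the F statistic equals the squared t statistic and the p-values match. The sketch below checks this on simulated data; the scores and group labels are hypothetical and exist only for this illustration.

```r
set.seed(1)  # simulated data, for illustration only
score <- c(rnorm(15, mean = 70, sd = 8), rnorm(15, mean = 76, sd = 8))
group <- factor(rep(c("g1", "g2"), each = 15))

# Student's t-test (equal variances assumed)
t_res <- t.test(score ~ group, var.equal = TRUE)

# One-way ANOVA on the same two groups; [[1]] is the ANOVA table
a_tab <- summary(aov(score ~ group))[[1]]

t_stat <- unname(t_res$statistic)
f_stat <- a_tab[["F value"]][1]

c(t_squared = t_stat^2, F = f_stat)                      # the two values agree
c(t_p = t_res$p.value, anova_p = a_tab[["Pr(>F)"]][1])   # so do the p-values
```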


<Types of ANOVA TEST> 

There are three types of ANOVA test, distinguished by the number of independent factors. An independent factor is a factor that affects the value being measured; the measured value itself is the dependent factor. If the groups being compared are defined by a single factor, we use ONE-WAY ANOVA. For example, suppose we want to compare the scores of several groups of students divided by studying time. In this case the students' score is the dependent factor and the studying time is the independent factor. When we deal with two independent factors, such as studying time and studying place, we use TWO-WAY ANOVA, and for multiple independent factors in general we use N-WAY ANOVA.


ONE-WAY ANOVA

  One independent factor, with more than two factor levels.

  ex) independent factor: studying time.
        factor levels - 3 hours < t < 5 hours
                      - 5 hours < t < 10 hours
                      - 10 hours < t < 15 hours
      dependent factor: GPA

  Another one-way example: comparing the studying time of undergraduate students across 3 different universities (here the university is the independent factor).

TWO-WAY ANOVA

  Two independent factors.

  ex) independent factors: studying time, sleeping time.
      dependent factor: GPA

N-WAY ANOVA

  Multiple independent factors.

  ex) studying time, place, sleeping time.
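In R the three types differ only in how many factors appear on the right-hand side of the model formula. The sketch below uses simulated data: the data frame `students`, its column names, and the factor levels are hypothetical and serve only as an illustration, so the resulting p-values carry no meaning.

```r
set.seed(42)  # hypothetical data, simulated only for illustration
n <- 60
students <- data.frame(
  # sample(rep(...)) shuffles a balanced vector, so every level is present
  study = factor(sample(rep(c("3-5h", "5-10h", "10-15h"), each = 20))),
  sleep = factor(sample(rep(c("under 7h", "7h or more"), each = 30))),
  place = factor(sample(rep(c("home", "library", "cafe"), each = 20)))
)
students$GPA <- round(rnorm(n, mean = 3.0, sd = 0.4), 2)

one_way <- aov(GPA ~ study, data = students)                  # one factor
two_way <- aov(GPA ~ study + sleep, data = students)          # two factors
n_way   <- aov(GPA ~ study + sleep + place, data = students)  # three factors

summary(one_way)
summary(two_way)
summary(n_way)
```

These formulas contain main effects only. Writing `study * sleep` instead of `study + sleep` would also include the interaction between the two factors.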


<TEST DESIGN> 

1. Independent individuals --> randomly split them into smaller groups --> put each group under a different condition (a level of the independent factor) --> measure the outcome.

2. Independent individuals --> divide them by an attribute independent variable --> measure the outcome.

    * attribute independent variable: an attribute that the individuals already possess, rather than a condition assigned artificially by the experimenter.
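The random split in design 1 can be sketched as follows. The subject IDs and condition names are hypothetical, and a balanced split of 10 individuals per condition is assumed.

```r
set.seed(7)  # fixed seed so the split is reproducible
individuals <- paste0("id", 1:30)           # hypothetical subject IDs
conditions <- c("cond1", "cond2", "cond3")  # hypothetical levels of the factor

# Shuffle a balanced vector of labels: 10 individuals per condition
assignment <- data.frame(
  id = individuals,
  condition = factor(sample(rep(conditions, each = 10)))
)

table(assignment$condition)  # every condition receives 10 individuals
```

In design 2 there is no such step: the grouping column already exists in the data (for example the School column of a data set), so the groups are usually unbalanced.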


<ANOVA MODEL> 
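For reference, the one-way ANOVA model is commonly written in the following textbook form. The notation (Y_ij, mu, alpha_i, epsilon_ij, k groups, sigma^2) is a standard choice assumed for this sketch.

```latex
Y_{ij} = \mu + \alpha_i + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^2) \ \text{independently},
\qquad i = 1, \dots, k, \quad j = 1, \dots, n_i
```

Here Y_ij is the j-th observation in group i, mu is the overall mean, alpha_i is the effect of group i, and epsilon_ij is random error. In this notation the mean of group i is M_i = mu + alpha_i, so the null hypothesis H0: M1 = M2 = ... = Mn is the same as saying that all alpha_i are zero. The normal errors, the common sigma^2, and the independence correspond to the three ANOVA assumptions.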







<R>

> library(MASS)   # the painters data set ships with the MASS package
> table(painters$School)

 A  B  C  D  E  F  G  H 
10  6  6 10  7  4  7  4 


> # keep only schools A to D and the columns Drawing and School
> painters2 <- painters[painters$School %in% c("A","B","C","D"), c("Drawing","School")]
> painters2
                Drawing School
Da Udine              8      A
Da Vinci             16      A
Del Piombo           13      A
Del Sarto            16      A
Fr. Penni            15      A
Guilio Romano        16      A
Michelangelo         17      A
Perino del Vaga      16      A
Perugino             12      A
Raphael              18      A
F. Zucarro           13      B
Fr. Salviata         15      B
Parmigiano           15      B
Primaticcio          14      B
T. Zucarro           14      B
Volterra             15      B
Barocci              15      C
Cortona              14      C
Josepin              10      C
L. Jordaens          12      C
Testa                15      C
Vanius               15      C
Bassano               8      D
Bellini               6      D
Giorgione             9      D
Murillo               8      D
Palma Giovane         9      D
Palma Vecchio         6      D
Pordenone            14      D
Tintoretto           14      D
Titian               15      D
Veronese             10      D



> aov(Drawing~School,data=painters2)
Call:
   aov(formula = Drawing ~ School, data = painters2)

Terms:
                  School Residuals
Sum of Squares  136.8854  201.8333
Deg. of Freedom        3        28

Residual standard error: 2.684834
Estimated effects may be unbalanced


> summary(aov(Drawing~School,data=painters2))
            Df Sum Sq Mean Sq F value  Pr(>F)   
School       3  136.9   45.63    6.33 0.00205 **
Residuals   28  201.8    7.21                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
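The numbers in this table can be reproduced by hand, which shows what aov() computes: the total variation is split into a between-group part (School) and a within-group part (Residuals), and F is the ratio of their mean squares. This is a minimal sketch; it retypes the Drawing scores of schools A to D from the painters2 listing so that it runs without MASS, and the variable names are chosen for this example.

```r
# Drawing scores of schools A to D, retyped from the painters2 listing
drawing <- c(8, 16, 13, 16, 15, 16, 17, 16, 12, 18,   # School A
             13, 15, 15, 14, 14, 15,                  # School B
             15, 14, 10, 12, 15, 15,                  # School C
             8, 6, 9, 8, 9, 6, 14, 14, 15, 10)        # School D
school <- factor(rep(c("A", "B", "C", "D"), times = c(10, 6, 6, 10)))

grand_mean <- mean(drawing)
group_mean <- tapply(drawing, school, mean)
group_n    <- tapply(drawing, school, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_n * (group_mean - grand_mean)^2)   # 136.8854
ss_within  <- sum((drawing - ave(drawing, school))^2)      # 201.8333

df_between <- nlevels(school) - 1                 # 3
df_within  <- length(drawing) - nlevels(school)   # 28

# F = mean square between / mean square within
f_value <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

round(c(F = f_value, p = p_value), 5)
```

Since the p-value (about 0.002) is below 0.05, the null hypothesis that the four schools have the same mean Drawing score is rejected.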


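> bartlett.test(Drawing~School,data=painters2)

Bartlett test of homogeneity of variances

data:  Drawing by School
Bartlett's K-squared = 8.4701, df = 3, p-value = 0.03723

-------> The p-value is below 0.05, so we reject the null hypothesis that the four schools have equal variances (accept the alternative hypothesis). The equal-variance assumption of ANOVA is therefore questionable for this data.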
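Because Bartlett's test rejects equal variances here, one commonly used alternative is Welch's one-way test, which does not assume equal variances. It is offered as a possible follow-up under that reading of the result, not as a required step. The sketch uses oneway.test() from R's stats package and retypes the scores of schools A to D so that it runs without MASS; the data frame name `d` is chosen for this example.

```r
# Drawing scores of schools A to D, retyped from the painters2 listing
d <- data.frame(
  drawing = c(8, 16, 13, 16, 15, 16, 17, 16, 12, 18,   # School A
              13, 15, 15, 14, 14, 15,                  # School B
              15, 14, 10, 12, 15, 15,                  # School C
              8, 6, 9, 8, 9, 6, 14, 14, 15, 10),       # School D
  school = factor(rep(c("A", "B", "C", "D"), times = c(10, 6, 6, 10)))
)

# Bartlett's test on the retyped data (the same test as above)
bt <- bartlett.test(drawing ~ school, data = d)

# Welch's one-way test: var.equal = FALSE drops the equal-variance assumption
welch <- oneway.test(drawing ~ school, data = d, var.equal = FALSE)

bt$statistic   # Bartlett's K-squared
welch          # F statistic with adjusted denominator degrees of freedom
```

Setting `var.equal = TRUE` in oneway.test() gives the classical one-way ANOVA instead, so comparing the two calls shows how much the conclusion depends on the variance assumption.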








