(R) Two-sample t-test (Student's t, Welch's t)

<Two Sample T-test>

A two-sample t-test checks whether the means of two populations are equal.

The null hypothesis is H0: the two population means are equal (M(A) = M(B)).

The alternative hypothesis is H1: the two population means are different (M(A) != M(B)).

Both tests share one assumption: "the sampling distributions are normally distributed."

That is, the sample mean of each group follows a normal distribution. This holds exactly when each population is normal. Thanks to the central limit theorem, it also holds approximately for non-normal populations when the sample sizes are large enough, so both tests can still be used in that case.
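The effect of the central limit theorem can be seen in a small simulation. This is only a sketch: the exponential distribution, the sample sizes 5 and 100, and the helper names `sample_means` and `skewness` are assumptions chosen for illustration, not part of the analysis in this post.

```r
set.seed(1)

# Draw `reps` samples of size n from a right-skewed exponential
# distribution and keep the mean of each sample.
sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

# Simple moment-based skewness (0 for a symmetric distribution).
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

means_small <- sample_means(5)    # sample means with n = 5
means_large <- sample_means(100)  # sample means with n = 100

# The sample means are less skewed (closer to normal) for the larger n.
skewness(means_small)
skewness(means_large)
```

Plotting `hist(means_small)` next to `hist(means_large)` shows the same thing visually.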

There are two versions of the two-sample t-test: Student's t-test and Welch's t-test.

The difference between them is the assumption about the variances of the two populations. Student's t-test assumes the two population variances are equal. Welch's t-test does not make this assumption, so it remains valid when the variances differ.

So if the two populations have the same variance, we prefer Student's t-test; if the variances differ (or we cannot tell), we use Welch's t-test.
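To see when the choice matters, here is a minimal sketch on simulated data. The vectors `x` and `y`, their sizes, and their standard deviations are hypothetical values picked so that the variances and sample sizes clearly differ.

```r
set.seed(42)

# Hypothetical data: a small, high-variance group and a large,
# low-variance group, both with true mean 0.
x <- rnorm(10, mean = 0, sd = 5)
y <- rnorm(40, mean = 0, sd = 1)

student <- t.test(x, y, var.equal = TRUE)  # pooled variance
welch   <- t.test(x, y)                    # var.equal = FALSE is the default

# Student's test always uses n1 + n2 - 2 degrees of freedom;
# Welch's test uses a smaller, data-dependent value here.
c(student = unname(student$parameter), welch = unname(welch$parameter))
c(student = student$p.value, welch = welch$p.value)
```

Note that `t.test()` runs Welch's test unless `var.equal=TRUE` is given explicitly.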

<R>

> library(MASS)

> head(painters)

> View(painters)

> set.seed(3)

> B=which(painters$School=="B")

> B

[1] 11 12 13 14 15 16

> C=which(painters$School=="C")

> C

[1] 17 18 19 20 21 22

> Group_B=painters[B,]

> Group_B

> Group_C=painters[C,]

> Group_C

> Com=c(Group_B$Composition,Group_C$Composition)

> Com

[1] 10 13 10 15 13 12 14 16 10 13 11 15

> School=c(Group_B$School,Group_C$School)

> School

[1] 2 2 2 2 2 2 3 3 3 3 3 3

> d_1=as.data.frame(cbind(Com,School))

> d_1

Com School

1 10 2

2 13 2

3 10 2

4 15 2

5 13 2

6 12 2

7 14 3

8 16 3

9 10 3

10 13 3

11 11 3

12 15 3
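The same data frame can be built in fewer steps. The sketch below is an alternative, not the method used above; the name `d_2` is made up for illustration. It keeps `School` as a factor with labels B and C instead of the integer codes 2 and 3. This also avoids relying on how `c()` treats factors: in R 4.1.0 and later, `c()` on factors returns a factor rather than integer codes.

```r
library(MASS)  # provides the painters data set

# Keep only schools B and C, and only the two columns used in the test.
d_2 <- subset(painters, School %in% c("B", "C"),
              select = c(Composition, School))

# Drop the unused factor levels so that only B and C remain.
d_2$School <- droplevels(d_2$School)

table(d_2$School)
t.test(Composition ~ School, data = d_2, var.equal = TRUE)
```

With this version the output reports "mean in group B" and "mean in group C", which is easier to read than group 2 and group 3.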

<Student's T-TEST>

> t.test(Com~School,data=d_1,var.equal=TRUE)

Two Sample t-test

data: Com by School

t = -0.81051, df = 10, p-value = 0.4365

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-3.749041 1.749041

sample estimates:

mean in group 2 mean in group 3

12.16667 13.16667
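The numbers in this output can be reproduced by hand from the pooled-variance formula. This is a sketch for checking the result; the variable names are chosen here, and the scores are copied from the `Com` vector printed earlier.

```r
# Composition scores copied from the Com vector above.
comp_b <- c(10, 13, 10, 15, 13, 12)  # school B (group 2)
comp_c <- c(14, 16, 10, 13, 11, 15)  # school C (group 3)

n_b <- length(comp_b)
n_c <- length(comp_c)
df  <- n_b + n_c - 2

# Pooled variance: weighted average of the two sample variances.
pooled_var <- ((n_b - 1) * var(comp_b) + (n_c - 1) * var(comp_c)) / df
se         <- sqrt(pooled_var * (1 / n_b + 1 / n_c))

diff_means <- mean(comp_b) - mean(comp_c)
t_stat     <- diff_means / se
p_value    <- 2 * pt(-abs(t_stat), df)               # two-sided p-value
ci         <- diff_means + c(-1, 1) * qt(0.975, df) * se

t_stat; df; p_value; ci
```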

<Welch's T-TEST>

> t.test(Com~School,data=d_1,var.equal=FALSE)

Welch Two Sample t-test

data: Com by School

t = -0.81051, df = 9.7022, p-value = 0.4371

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-3.76052 1.76052

sample estimates:

mean in group 2 mean in group 3

12.16667 13.16667
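The non-integer degrees of freedom come from the Welch-Satterthwaite approximation. The sketch below recomputes them by hand; the variable names are chosen here, and the scores are copied from the `Com` vector printed earlier.

```r
# Composition scores copied from the Com vector above.
comp_b <- c(10, 13, 10, 15, 13, 12)  # school B (group 2)
comp_c <- c(14, 16, 10, 13, 11, 15)  # school C (group 3)

n_b <- length(comp_b)
n_c <- length(comp_c)

# Each group contributes its own variance of the sample mean (no pooling).
v_b <- var(comp_b) / n_b
v_c <- var(comp_c) / n_c
se  <- sqrt(v_b + v_c)

# Welch-Satterthwaite approximation for the degrees of freedom.
df <- (v_b + v_c)^2 / (v_b^2 / (n_b - 1) + v_c^2 / (n_c - 1))

diff_means <- mean(comp_b) - mean(comp_c)
t_stat     <- diff_means / se
p_value    <- 2 * pt(-abs(t_stat), df)               # two-sided p-value
ci         <- diff_means + c(-1, 1) * qt(0.975, df) * se

t_stat; df; p_value; ci
```

Because both groups have six observations, the standard error is the same as the pooled one, so only the degrees of freedom differ from Student's test.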

<Interpretation>

<Student's T-TEST>

> t.test(Com~School,data=d_1,var.equal=TRUE) -> var.equal=TRUE requests Student's t-test, which assumes equal variances.

Two Sample t-test

data: Com by School

t = -0.81051, df = 10, p-value = 0.4365 -> The p-value is well above the conventional 0.05 significance level, so we do not have enough evidence to reject the null hypothesis.

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval: -> This is the 95% confidence interval for the difference between the two population means (group 2 minus group 3), here [-3.749041, 1.749041]. If we repeated the sampling many times, about 95% of the intervals computed this way would contain the true difference. The interval contains 0, which agrees with not rejecting the null hypothesis. As the sample size increases, the interval gets narrower, which means a more precise estimate than with a small sample.

-3.749041 1.749041

sample estimates:

mean in group 2 mean in group 3 -> Groups 2 and 3 are the integer codes of the School factor, that is, schools B and C.

12.16667 13.16667

<Welch's T-TEST>

> t.test(Com~School,data=d_1,var.equal=FALSE) -> var.equal=FALSE requests Welch's t-test, which does not assume equal variances.

Welch Two Sample t-test

data: Com by School

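t = -0.81051, df = 9.7022, p-value = 0.4371 -> Again the p-value is well above the conventional 0.05 significance level, so we do not have enough evidence to reject the null hypothesis. The t statistic equals the one from Student's test because both groups have the same size (6 each); only the degrees of freedom (9.7022 instead of 10) and hence the p-value change slightly.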

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval: -> This is the 95% confidence interval for the difference between the two population means (group 2 minus group 3), here [-3.76052, 1.76052]. About 95% of intervals computed this way over repeated samples would contain the true difference. It is slightly wider than the Student interval because Welch's test uses fewer degrees of freedom. As the sample size increases, the interval gets narrower, which means a more precise estimate than with a small sample.

-3.76052 1.76052

sample estimates:

mean in group 2 mean in group 3

12.16667 13.16667

Jangpiano Science
