
Ways to make a boxplot (one variable, two variables on the x-axis) / boxplot(), qplot(), ggplot()

by jangpiano 2020. 8. 11.

<Three ways to make boxplots>


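When making boxplots, there is one critical point to keep in mind. 


A boxplot draws one box per category, so the x-axis variable should be categorical rather than numeric. In ggplot2, a numeric x is treated as a continuous axis, and the data are not split into one box per value unless you also set a group. 

So the x-axis variable should be of factor type or character type when making boxplots. 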


> str(msleep$vore)

 chr [1:83] "carni" "omni" "herbi" "omni" "herbi" "herbi" ...


> table(msleep$vore)


  carni   herbi insecti    omni 

     19      32       5      20


The variable 'vore' in msleep data is a character type. So we can make a boxplot with 'vore' as the variable of the x-axis. 
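
The point above can be sketched with a numeric variable. This is a minimal example, assuming the ggplot2 package is installed; it uses the mpg data that ships with ggplot2, where cyl is stored as a number, and wraps it in factor() so that each distinct value gets its own box. The variable name p is only for illustration.

```r
library(ggplot2)

# cyl is numeric (integer) in mpg, so it is not yet a categorical x variable
str(mpg$cyl)

# factor() converts it to a categorical variable: one box per distinct value
p <- ggplot(data = mpg, aes(x = factor(cyl), y = hwy)) + geom_boxplot()
p
```

The same conversion works for character columns too, although those are already accepted as categorical.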


1. boxplot() ---- boxplot(data = , y-axis variable ~ x-axis variable)


> boxplot(data=msleep,sleep_total~vore)
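
If you want to see the numbers behind the boxes, base boxplot() also returns them. The sketch below assumes ggplot2 is installed (only for the msleep data) and uses plot = FALSE so that nothing is drawn; the name b is just an illustrative choice.

```r
library(ggplot2)  # only needed here for the msleep data set

# plot = FALSE computes the statistics without drawing anything
b <- boxplot(sleep_total ~ vore, data = msleep, plot = FALSE)

b$names  # group labels on the x-axis
b$n      # number of observations in each box
b$stats  # one column per group: lower whisker, lower hinge, median,
         # upper hinge, upper whisker
```

Rows whose vore is NA do not form a box here, which is why base boxplot() shows only the four diet groups.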



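2. qplot()

Note: qplot() has been deprecated since ggplot2 3.4.0. It still runs, but it may print a deprecation warning, so ggplot() is the safer choice for new code.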


> qplot(msleep$vore,msleep$sleep_total,geom="boxplot")



2. qplot() without NA (using filter function)


> table(is.na(msleep$vore))


FALSE  TRUE 

   76     7 


> msleep2<-msleep%>%filter(!is.na(vore))

> table(is.na(msleep2$vore))


FALSE 

   76 

> qplot(msleep2$vore,msleep2$sleep_total,geom="boxplot")



3. ggplot()+geom_boxplot()


> ggplot(data=msleep,aes(x=vore,y=sleep_total))+geom_boxplot()


3. ggplot() without NA (using filter function)


> table(is.na(msleep$vore))


FALSE  TRUE 

   76     7 


> msleep2<-msleep%>%filter(!is.na(vore))

> table(is.na(msleep2$vore))


FALSE 

   76



> ggplot(data=msleep2,aes(x=vore,y=sleep_total))+geom_boxplot()
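
The console lines above assume that the packages are already loaded. A self-contained version might look like the sketch below, assuming both ggplot2 and dplyr are installed (dplyr supplies %>% and filter()).

```r
library(ggplot2)  # ggplot(), geom_boxplot(), and the msleep data
library(dplyr)    # %>% and filter()

# Keep only the rows where vore is not missing
msleep2 <- msleep %>% filter(!is.na(vore))

# Boxplot without the NA category on the x-axis
p <- ggplot(data = msleep2, aes(x = vore, y = sleep_total)) + geom_boxplot()
p
```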



<Boxplots with more than one variable on the x-axis>


1. boxplot()


> boxplot(sleep_total~vore+conservation,data=msleep)

2. qplot()

> qplot(interaction(msleep$vore,msleep$conservation),msleep$sleep_total,geom="boxplot")

3. ggplot()


> ggplot(data=msleep,aes(x=interaction(vore,conservation),y=sleep_total))+geom_boxplot()
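
To see what interaction() actually puts on the x-axis, you can inspect the factor it builds. This is a small sketch, assuming ggplot2 is installed for the msleep data; all_combos and used_combos are illustrative names. The drop and sep arguments belong to base R's interaction().

```r
library(ggplot2)  # only needed here for the msleep data set

# Default: every vore x conservation combination becomes a level, joined by "."
all_combos <- interaction(msleep$vore, msleep$conservation)

# drop = TRUE keeps only combinations that occur in the data;
# sep changes the separator used in the labels
used_combos <- interaction(msleep$vore, msleep$conservation,
                           drop = TRUE, sep = " / ")

nlevels(all_combos)
nlevels(used_combos)
head(levels(used_combos))
```

Passing used_combos (or the same interaction() call) as the x variable gives shorter, more readable axis labels.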



<Adjust the size and the shape of outliers>


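Outlier points are drawn with a default size and shape. Older ggplot2 releases used size 2 and shape 16; more recent releases document 1.5 and 19, so check ?geom_boxplot for your installed version. 

You can adjust the size and shape of outliers by using geom_boxplot(outlier.size=, outlier.shape=)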


> ggplot(data=msleep,aes(x=vore,y=sleep_total))+geom_boxplot(outlier.size=1.5)

> ggplot(data=msleep,aes(x=vore,y=sleep_total))+geom_boxplot(outlier.size=1.5,outlier.shape=1)
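
It can help to know which points count as outliers before styling them. By the usual convention, a point is an outlier when it lies more than 1.5 times the box height beyond the box. The sketch below uses base R's boxplot.stats() to list them, assuming ggplot2 is installed for the msleep data. ggplot2 computes the quartiles slightly differently from base R's hinges, so the two can disagree for borderline points.

```r
library(ggplot2)  # only needed here for the msleep data set

# Toy example: 100 lies far above the upper hinge, so it is flagged
boxplot.stats(c(1, 2, 3, 4, 5, 100))$out

# The same check per diet group in msleep (rows with NA vore are skipped)
outliers_by_vore <- tapply(msleep$sleep_total, msleep$vore,
                           function(v) boxplot.stats(v)$out)
outliers_by_vore
```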

The box itself can also be styled with alpha (transparency), colour or color (outline), and fill:

> ggplot(data=mpg, aes(x=drv, y=hwy))+geom_boxplot(alpha=0.3, color='blue')
> ggplot(data=mpg, aes(x=drv, y=hwy))+geom_boxplot(colour='blue', fill='red')


<How to make a boxplot with a single variable>


> ggplot(data=msleep,aes(x=1,y=sleep_total))+geom_boxplot()

> ggplot(data=msleep,aes(x=sleep_total,y=1))+geom_boxplot()


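Mapping colour to another variable (here fl) splits each box by that variable:

> ggplot(data=mpg, aes(x=drv, y=hwy))+geom_boxplot(aes(colour=fl))

xlim() and ylim() restrict the axes. On a categorical axis, xlim('4','f') keeps only those two drv categories:

> ggplot(data=mpg, aes(x=drv, y=hwy))+geom_boxplot(aes(colour=fl))+xlim('4','f')+ylim(15,35)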
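
One caveat with ylim() is that it discards observations outside the range before the box statistics are computed, so the boxes themselves can change. If you only want to zoom in, coord_cartesian() is the usual alternative. The sketch below assumes ggplot2 is installed; p_full, p_lim, and p_zoom are illustrative names.

```r
library(ggplot2)

p_full <- ggplot(data = mpg, aes(x = drv, y = hwy)) + geom_boxplot()

# ylim() drops observations outside 15..35 BEFORE the statistics are computed
p_lim <- p_full + ylim(15, 35)

# coord_cartesian() only zooms the view; the statistics still use all rows
p_zoom <- p_full + coord_cartesian(ylim = c(15, 35))

# How many rows would ylim(15, 35) discard?
sum(mpg$hwy < 15 | mpg$hwy > 35)
```

Printing p_lim also produces a warning about removed rows, which is a hint that the data behind the boxes has changed.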


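<How to hide outliers>

> ggplot(data=var1,aes(x=1,y=var1))+geom_boxplot(outlier.shape=NA)

Here var1 stands in for your own data and column. Note that the aesthetic must be a lowercase x. outlier.shape=NA only hides the outlier points in the plot; it does not remove them from the data.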
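
A runnable version of the same idea, using msleep instead of a placeholder data set, might look like this sketch (assuming ggplot2 is installed). It also shows that the outliers are still present in the computed layer data, even though they are not drawn.

```r
library(ggplot2)

# outlier.shape = NA suppresses drawing of the outlier points only
p <- ggplot(data = msleep, aes(x = vore, y = sleep_total)) +
  geom_boxplot(outlier.shape = NA)
p

# The computed layer data still carries the outliers for each box
layer_data(p)$outliers
```

To exclude the values themselves, filter the data before plotting.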

