<Three ways to make boxplots>
When making boxplots, there is one critical point to keep in mind.
Boxplots come out as expected when the x-axis variable is categorical rather than numeric.
So the x-axis variable should be of factor type or character type when making boxplots.
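> library(ggplot2)   # provides the msleep and mpg data, qplot() and ggplot()
> library(dplyr)     # provides %>% and filter()
> str(msleep$vore)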
chr [1:83] "carni" "omni" "herbi" "omni" "herbi" "herbi" ...
> table(msleep$vore)
  carni   herbi insecti    omni
     19      32       5      20
The variable 'vore' in the msleep data is of character type, so we can make a boxplot with 'vore' on the x-axis.
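If the grouping variable happens to be stored as a number, it can be converted with factor() before plotting. A minimal sketch, assuming the built-in mtcars data (not used elsewhere in this post), where cyl is numeric; the object names are only for illustration:

```r
library(ggplot2)

# mtcars ships with R; cyl (number of cylinders) is stored as a number
str(mtcars$cyl)

# Wrapping it in factor() turns each distinct value into its own group
cyl_groups <- factor(mtcars$cyl)
levels(cyl_groups)

# One box per cylinder count
p_cyl <- ggplot(data = mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
p_cyl
```

Without factor(), ggplot2 treats cyl as a continuous variable and does not split the data into one box per value.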
1. boxplot() ---- boxplot(y-axis variable ~ x-axis variable, data= )
> boxplot(sleep_total~vore,data=msleep)
2. qplot() (deprecated since ggplot2 3.4.0, but still available)
> qplot(msleep$vore,msleep$sleep_total,geom="boxplot")
2. qplot() without NA (using filter function)
> table(is.na(msleep$vore))
FALSE  TRUE
   76     7
> msleep2<-msleep%>%filter(!is.na(vore))
> table(is.na(msleep2$vore))
FALSE
76
> qplot(msleep2$vore,msleep2$sleep_total,geom="boxplot")
3. ggplot()+geom_boxplot()
> ggplot(data=msleep,aes(x=vore,y=sleep_total))+geom_boxplot()
3. ggplot() without NA (using filter function)
> table(is.na(msleep$vore))
FALSE  TRUE
   76     7
> msleep2<-msleep%>%filter(!is.na(vore))
> table(is.na(msleep2$vore))
FALSE
76
> ggplot(data=msleep2,aes(x=vore,y=sleep_total))+geom_boxplot()
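The same filtering can also be done without dplyr. A small sketch using base R's subset(); the names msleep3 and p_no_na are only for illustration:

```r
library(ggplot2)  # provides the msleep data

# Keep only the rows where vore is known (base R, no dplyr needed)
msleep3 <- subset(msleep, !is.na(vore))

# Same check as before: only FALSE should remain
table(is.na(msleep3$vore))

# Boxplot without the NA group on the x-axis
p_no_na <- ggplot(data = msleep3, aes(x = vore, y = sleep_total)) + geom_boxplot()
p_no_na
```

Either way, the plot is drawn from the 76 rows that have a known vore value.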
<Boxplots with more than one variable on the x-axis>
1. boxplot()
> boxplot(sleep_total~vore+conservation,data=msleep)
2. qplot()
> qplot(interaction(msleep$vore,msleep$conservation),msleep$sleep_total,geom="boxplot")
3. ggplot()
> ggplot(data=msleep,aes(x=interaction(vore,conservation),y=sleep_total))+geom_boxplot()
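interaction() builds one combined factor whose levels are every pairing of the two variables, which is why the x-axis gets crowded. A rough sketch of what it produces, plus one possible alternative that keeps vore on the x-axis and maps the second variable to fill instead (the object names here are illustrative):

```r
library(ggplot2)

# Combined factor: one level per vore/conservation pair, joined by "."
combo <- interaction(msleep$vore, msleep$conservation)
head(levels(combo))
nlevels(combo)

# Alternative: keep vore on the x-axis and distinguish conservation by fill
p_fill <- ggplot(data = msleep, aes(x = vore, y = sleep_total, fill = conservation)) +
  geom_boxplot()
p_fill
```

Which layout reads better depends on how many combinations actually contain data.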
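<Adjusting the size and shape of outliers>
The default size and shape of outliers depend on the ggplot2 version: older releases use 2 and 16 respectively, while more recent releases document 1.5 and 19 (check ?geom_boxplot for your version).
You can adjust the size and shape of outliers with geom_boxplot(outlier.size=, outlier.shape=).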
> ggplot(data=msleep,aes(x=vore,y=sleep_total))+geom_boxplot(outlier.size=1.5)
> ggplot(data=msleep,aes(x=vore,y=sleep_total))+geom_boxplot(outlier.size=1.5,outlier.shape=1)
The transparency, outline color, and fill color of the boxes can be set in the same way.
> ggplot(data=mpg, aes(x=drv, y=hwy))+geom_boxplot(alpha=0.3, colour='blue')
> ggplot(data=mpg, aes(x=drv, y=hwy))+geom_boxplot(colour='blue', fill='red')
<How to make a boxplot with a single variable>
> ggplot(data=msleep,aes(x=1,y=sleep_total))+geom_boxplot()
> ggplot(data=msleep,aes(x=sleep_total,y=1))+geom_boxplot()
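The boxes can also be split by a further variable through the colour aesthetic, and the axes restricted with xlim() and ylim().
> ggplot(data=mpg, aes(x=drv, y=hwy))+geom_boxplot(aes(colour=fl))
> ggplot(data=mpg, aes(x=drv, y=hwy))+geom_boxplot(aes(colour=fl))+xlim('4','f')+ylim(15,35)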
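One thing to be aware of: xlim() and ylim() drop the observations outside the limits before the box statistics are computed, so the boxes themselves can change. If you only want to zoom in, coord_cartesian() is the usual alternative. A sketch comparing the two on the same mpg data (object names are illustrative):

```r
library(ggplot2)

base_plot <- ggplot(data = mpg, aes(x = drv, y = hwy)) + geom_boxplot()

# ylim() removes rows outside 15..35 first (ggplot2 warns about them),
# then computes the boxes from what is left
p_ylim <- base_plot + ylim(15, 35)

# coord_cartesian() computes the boxes from all rows and only zooms the view
p_zoom <- base_plot + coord_cartesian(ylim = c(15, 35))

# Compare the lower whisker ends computed for each drv group
suppressWarnings(layer_data(p_ylim)$ymin)
layer_data(p_zoom)$ymin
```

If the two vectors differ, the limits have changed the statistics and not just the visible range.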
<How to remove outliers from the plot>
> ggplot(data=var1,aes(x=1,y=var1))+geom_boxplot(outlier.shape=NA)
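Note that outlier.shape=NA only stops the outlier points from being drawn; the values are still in the data. A minimal sketch with the mpg data (standing in for the var1 placeholder above), using base R's boxplot.stats() to see which values count as outliers and coord_cartesian() to zoom to the whisker range. The object names are illustrative, and boxplot.stats() may compute the hinges slightly differently from ggplot2:

```r
library(ggplot2)

# Values beyond 1.5 * IQR from the hinges, as base R computes them
hwy_stats <- boxplot.stats(mpg$hwy)
hwy_stats$out

# Hide the outlier points and zoom the y-axis to the whisker range
p_hidden <- ggplot(data = mpg, aes(x = 1, y = hwy)) +
  geom_boxplot(outlier.shape = NA) +
  coord_cartesian(ylim = hwy_stats$stats[c(1, 5)])
p_hidden

# The data itself is unchanged
nrow(mpg)
```

To actually exclude the outlying rows from the analysis, they would have to be filtered out of the data before plotting.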