R

Ways to draw a histogram and adjust the bin width: hist(), qplot(), ggplot()

by jangpiano 2020. 8. 11.

<Three ways to make a histogram>


I will make histograms from the 'airquality' data set, which is built into R (it ships with the datasets package, not with ggplot2).

A histogram is a graphical display of the distribution of data, made up of non-overlapping bins.

Generally, the y-axis of a histogram, that is, the height of each bin, is proportional to the number of observations whose value of the x-axis variable falls in that bin.
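To see the numbers behind the bars, here is a minimal sketch that asks hist() to compute the bins without drawing anything. The variable name h is only illustrative.

```r
# Compute the histogram without drawing it (plot = FALSE).
h <- hist(airquality$Temp, plot = FALSE)

h$breaks   # bin edges on the x-axis
h$counts   # number of observations in each bin (the bar heights)

# Every observation falls in exactly one bin,
# so the counts add up to the sample size.
sum(h$counts) == length(airquality$Temp)
```

There is always one more break point than there are bins, because each bin sits between two consecutive edges.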


The graphs below show the distribution of daily temperature (Temp, in degrees Fahrenheit) in New York from May to September 1973. The table counts the number of days recorded in each month.


> table(airquality$Month)

 5  6  7  8  9 
31 30 31 31 30 


1. hist()


> hist(airquality$Temp)

> hist(airquality$Temp, breaks=5)
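One thing worth knowing about breaks: when it is a single number, hist() treats it as a suggestion and rounds the break points to "pretty" values, so you may not get exactly that many bins. A rough sketch of the difference follows. The edges vector is my own illustrative choice, not something from the original plots.

```r
# A single number is only a suggestion: hist() passes it to pretty(),
# which picks "round" break points, so the bin count may differ from 5.
h_suggested <- hist(airquality$Temp, breaks = 5, plot = FALSE)
length(h_suggested$counts)

# A vector of break points is used exactly as given.
# seq(55, 100, by = 5) is an illustrative choice that covers the data range.
edges <- seq(55, 100, by = 5)
h_exact <- hist(airquality$Temp, breaks = edges, plot = FALSE)
length(h_exact$counts)   # one bin per interval between consecutive edges
```

If you need full control over where the bins start and end, passing the vector of edges is the safer option.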

 

 




2. qplot()


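Note: qplot() has been deprecated since ggplot2 3.4.0, so recent versions print a warning. The ggplot() form in section 3 draws the same plots.

> library(ggplot2)

> qplot(airquality$Temp)

> qplot(airquality$Temp, binwidth=5)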

 

 





3. ggplot()+geom_histogram()


> library(ggplot2)

> ggplot(data=airquality, aes(x=Temp))+geom_histogram()

> ggplot(data=airquality, aes(x=Temp))+geom_histogram(binwidth=5)

 

 


<Two ways to adjust the width of bins when using geom_histogram()>

> ggplot(data=airquality,aes(x=Temp))+geom_histogram()

By default, geom_histogram() uses 30 bins, and it prints a message suggesting that you pick a better value with binwidth.
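If you want to check the bin count rather than count bars by eye, ggplot2's layer_data() returns the computed bins as a data frame. This is a small sketch, and the names p and bins_default are only illustrative.

```r
library(ggplot2)

p <- ggplot(data = airquality, aes(x = Temp)) + geom_histogram()

# layer_data() returns the data computed for the layer, one row per bin.
bins_default <- layer_data(p)

nrow(bins_default)         # number of bins
sum(bins_default$count)    # total number of observations across all bins
```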


<Two ways to make 9 bins in the graph>


> ggplot(data=airquality,aes(x=Temp))+geom_histogram(binwidth=5)

> ggplot(data=airquality,aes(x=Temp))+geom_histogram(binwidth=diff(range(airquality$Temp))/8)


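Either way, you get 9 bins in total.

When you get the bin width by dividing the range of the data, the number of bins is usually the divisor plus 1. The bin edges generally do not line up exactly with the minimum and maximum of the data, so a span of 8 bin widths needs 9 bins to cover it.

In this case, the range of airquality$Temp (56 to 97, a span of 41) is divided by 8, which gives a bin width of 5.125 and 9 bins in total.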
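The bin counts can be checked with layer_data(), as in the rough sketch below. I have also added geom_histogram(bins = 9), which is not one of the two methods in this post but asks for the number of bins directly. The object names are only illustrative.

```r
library(ggplot2)

base <- ggplot(data = airquality, aes(x = Temp))

# Bin width from the range: (97 - 56) / 8 = 5.125
w <- diff(range(airquality$Temp)) / 8

by_width5 <- layer_data(base + geom_histogram(binwidth = 5))
by_range  <- layer_data(base + geom_histogram(binwidth = w))
by_bins   <- layer_data(base + geom_histogram(bins = 9))  # request the count directly

# Number of bins produced by each call
c(binwidth_5   = nrow(by_width5),
  range_over_8 = nrow(by_range),
  bins_9       = nrow(by_bins))
```

The exact placement of the bin edges can differ between the three calls, so the bars may not look identical even when the number of bins is the same.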

 

 





