Group the variable according to its value and get a histogram
I'm trying to group the variable according to its values and get a histogram.
For example, this is my data:
r <-c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1, 3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)
I want to group r into intervals such as 1-5, 5-10, 10-100, 100-500 and more than 500, and then draw a histogram whose x axis shows those intervals (1-5, 5-10, 10-100, 100-500 and more than 500). How can I do that?
I tried the ggplot2 package with the following code:
ggplot(data=r, aes(x=r))+geom_histogram(breaks = c(1, 5, 10, 100, 500,2000,Inf))
It doesn't work; R reports "missing value where TRUE/FALSE needed". Also, how can I make all the bins the same width?
In base R
r <- c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5, 3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)
cut.vals <- cut(r, breaks = c(1, 5, 10, 100, 500, Inf), right = FALSE)
xy <- data.frame(r, cut = cut.vals)
barplot(table(xy$cut))
Note that I added the xy variable only to make it easier to check how the values were grouped. You can pass cut.vals directly to barplot(table()).
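The shortcut without the intermediate data frame could look like the sketch below. It uses only a subset of the question's values to keep the example short, and the names breaks and counts are chosen here for illustration.

```r
# A subset of the values from the question, for brevity
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 9, 4, 13, 21, 28, 682)

# Same break points as above; right = FALSE gives intervals closed on the left, e.g. [1,5)
breaks <- c(1, 5, 10, 100, 500, Inf)

# Count how many values fall into each interval, then plot the counts
counts <- table(cut(r, breaks = breaks, right = FALSE))
barplot(counts)
```

Printing counts before plotting is a quick way to confirm that each value landed in the interval you expected.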
To use ggplot2, you can pre-calculate all the bins and plot
library(ggplot2)
ggplot(xy, aes(x = cut)) + theme_bw() + geom_bar() + scale_x_discrete(drop = FALSE)
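If you would rather see axis text like "1-5" than the default "[1,5)", one option is the labels argument of cut(). This is a sketch under a few assumptions: it uses a subset of the question's values, the label strings and the names bin_labels and p are made up for illustration, and because right = FALSE the last bin includes 500 itself.

```r
library(ggplot2)

# A subset of the values from the question, for brevity
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 9, 4, 13, 21, 28, 682)

breaks <- c(1, 5, 10, 100, 500, Inf)
# Illustrative labels, one per interval (there is one fewer label than break points)
bin_labels <- c("1-5", "5-10", "10-100", "100-500", "500+")

xy <- data.frame(r, cut = cut(r, breaks = breaks, labels = bin_labels, right = FALSE))

# drop = FALSE keeps empty intervals (here 100-500) on the axis
p <- ggplot(xy, aes(x = cut)) + theme_bw() + geom_bar() + scale_x_discrete(drop = FALSE)
print(p)
```

Since the intervals are now factor levels, every bar is drawn with the same width regardless of how wide the underlying interval is.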
geom_histogram's most common parameter that controls bin size is binwidth, which is constant for all bins.
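For comparison, here is a minimal sketch of a constant-width histogram with binwidth. The width of 500 and the boundary of 0 are arbitrary choices for illustration, the data is a subset of the question's values, and the vector is wrapped in a data frame because ggplot() expects one rather than a bare vector.

```r
library(ggplot2)

# A subset of the values from the question, for brevity
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 9, 4, 13, 21, 28, 682)
df <- data.frame(r = r)

# Every bin is 500 units wide; boundary = 0 makes bin edges fall on multiples of 500
p <- ggplot(df, aes(x = r)) + geom_histogram(binwidth = 500, boundary = 0)
print(p)
```

With data as skewed as this, most values end up in the first bin, which is why pre-computing intervals with cut() is usually the better fit for this question.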