# Group the variable according to its value and get a histogram

I'm trying to group the variable according to its values and get a histogram.

For example, this is my data:

r <- c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1, 3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)

I want to group r by value into intervals such as 1-5, 5-10, 10-100, 100-500, and more than 500, and then plot a histogram whose x axis shows those intervals. How can I do that?

I tried the ggplot2 package with the following code:

ggplot(data=r, aes(x=r))+geom_histogram(breaks = c(1, 5, 10, 100, 500,2000,Inf))

It doesn't work: R reports "missing value where TRUE/FALSE needed". Also, how can I make all the bins the same width?

## Answers

In base R

r <- c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5, 3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)

cut.vals <- cut(r, breaks = c(1, 5, 10, 100, 500, Inf), right = FALSE)

xy <- data.frame(r, cut = cut.vals)

barplot(table(xy$cut))

Note that I created the xy data frame only to make it easier to check how each value was grouped. You can also pass cut.vals directly, as in barplot(table(cut.vals)).
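If you would rather have the axis read "1-5", "5-10" and so on instead of the default interval notation, cut() accepts a labels argument. The following is a minimal sketch using the same data and breaks; the label strings and the names bin.breaks, bin.labels and cut.labelled are my own choices, not anything required by R.

```r
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 1863, 695, 9, 4, 2876, 1173, 1156, 5098,
       3, 3876, 1, 1, 5, 3023, 76336, 13, 003, 9898, 1, 10, 843, 10546, 617,
       1375, 1, 1, 5679, 1, 21, 1, 13, 6, 28, 1, 14088, 682)

bin.breaks <- c(1, 5, 10, 100, 500, Inf)

# Illustrative label text (an assumption); one label per interval
bin.labels <- c("1-5", "5-10", "10-100", "100-500", ">500")

# right = FALSE makes each interval closed on the left, e.g. "1-5" means 1 <= r < 5
cut.labelled <- cut(r, breaks = bin.breaks, labels = bin.labels, right = FALSE)

# table() keeps empty levels, so "100-500" still shows up with a count of zero
counts <- table(cut.labelled)
print(counts)

barplot(counts, xlab = "r (grouped)", ylab = "Count")
```

Because the bars come from a factor, they are all drawn with the same width regardless of how wide each interval is.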

To use ggplot2, you can pre-calculate the bins and then plot the result:

ggplot(xy, aes(x = cut)) + theme_bw() + geom_bar() + scale_x_discrete(drop = FALSE)
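Another way to read "pre-calculate" is to compute the counts per bin yourself and hand ggplot2 a small summary table. This is only a sketch of that variant: bin.counts is a name I made up, and geom_col() is my substitution for geom_bar() because the heights are already computed. The plotting part is skipped if ggplot2 is not installed.

```r
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 1863, 695, 9, 4, 2876, 1173, 1156, 5098,
       3, 3876, 1, 1, 5, 3023, 76336, 13, 003, 9898, 1, 10, 843, 10546, 617,
       1375, 1, 1, 5679, 1, 21, 1, 13, 6, 28, 1, 14088, 682)

cut.vals <- cut(r, breaks = c(1, 5, 10, 100, 500, Inf), right = FALSE)

# One row per bin, with columns "cut" (the interval) and "Freq" (the count);
# empty bins are kept as rows with Freq = 0
bin.counts <- as.data.frame(table(cut = cut.vals))
print(bin.counts)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # geom_col() plots the pre-computed Freq values as bar heights
  p <- ggplot(bin.counts, aes(x = cut, y = Freq)) + theme_bw() + geom_col()
  print(p)
}
```

Since the empty [100,500) bin is already a row in bin.counts, it appears on the axis without needing scale_x_discrete(drop = FALSE).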

The geom_histogram parameter most commonly used to control bin size is binwidth, and it sets one constant width for all bins.
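To get a feel for why a constant bin width suits this data poorly, here is a rough base R sketch. I am using hist() as a stand-in for geom_histogram, and the width of 500 is an arbitrary value picked for illustration.

```r
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 1863, 695, 9, 4, 2876, 1173, 1156, 5098,
       3, 3876, 1, 1, 5, 3023, 76336, 13, 003, 9898, 1, 10, 843, 10546, 617,
       1375, 1, 1, 5679, 1, 21, 1, 13, 6, 28, 1, 14088, 682)

bin.width <- 500  # arbitrary constant width (an assumption for this sketch)

# Equally spaced breaks from 0 up to the first multiple of bin.width >= max(r)
equal.breaks <- seq(0, bin.width * ceiling(max(r) / bin.width), by = bin.width)

# plot = FALSE returns the bin counts without drawing anything
h <- hist(r, breaks = equal.breaks, plot = FALSE)

length(h$counts)     # how many bins the constant width produces
h$counts[1]          # how many values land in the very first bin
sum(h$counts == 0)   # how many bins are empty
```

With values ranging from 1 to tens of thousands, most of the small values pile into the first bin and many of the remaining bins stay empty, which is why grouping with cut() and plotting the resulting factor works better here.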