Group the variable according to its value and get a histogram

I'm trying to group the variable according to its values and get a histogram.

For example, this is my data:

r <-c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,

I want to group r by its value into intervals such as 1-5, 5-10, 10-100, 100-500, and more than 500. Then I want a histogram whose x axis shows those intervals (1-5, 5-10, 10-100, 100-500, and more than 500). How can I do that?

If I want to use the ggplot2 package, my code is as follows:

ggplot(data=r, aes(x=r))+geom_histogram(breaks = c(1, 5, 10, 100, 500,2000,Inf))

It doesn't work; R reports "missing value where TRUE/FALSE needed". Also, how can I make all the bins the same width?


In base R

r <-c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5,
cut.vals <- cut(r, breaks = c(1, 5, 10, 100, 500, Inf), right = FALSE)
xy <- data.frame(r, cut = cut.vals)

Note that I added the xy data frame to make it easier to compare how the values were grouped. You can also pass cut.vals directly to barplot(table(cut.vals)).
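The base R route can be sketched end to end as follows. This is a minimal sketch, not the original answer's code: it uses only the values of r that are visible in the question (the full vector is cut off there), and the axis labels are my own choice.

```r
# Only the values visible in the question; the full vector is truncated there.
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 1863, 695,
       9, 4, 2876, 1173, 1156, 5098, 3, 3876, 1, 1)

# Left-closed intervals: [1,5), [5,10), [10,100), [100,500), [500,Inf)
cut.vals <- cut(r, breaks = c(1, 5, 10, 100, 500, Inf), right = FALSE)

# Count per interval; cut() returns a factor, so empty intervals are kept
counts <- table(cut.vals)

# One bar per interval, all bars drawn with the same width
barplot(counts, xlab = "r (interval)", ylab = "count")
```

Because the bars come from a table of factor levels rather than from a numeric axis, every interval gets the same bar width regardless of how wide the interval is.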

To use ggplot2, you can pre-calculate all the bins and plot

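library(ggplot2)

# cut is a factor, so geom_bar() draws one equal-width bar per interval
ggplot(xy, aes(x = cut)) +
  theme_bw() +
  geom_bar() +
  scale_x_discrete(drop = FALSE)  # keep intervals that contain no values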

In geom_histogram, the parameter most commonly used to control bin size is binwidth, which is constant for all bins.
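For comparison, here is a minimal sketch of a true histogram with equal-width bins. The bin width of 500 and the boundary at 0 are illustrative assumptions, not values from the question, and the data are again only the values visible there. Note that ggplot() expects a data frame, so the vector is wrapped in one first.

```r
library(ggplot2)

# Only the values visible in the question; the full vector is truncated there.
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 1863, 695,
       9, 4, 2876, 1173, 1156, 5098, 3, 3876, 1, 1)

# ggplot() needs a data frame, not a bare numeric vector
df <- data.frame(r = r)

# Every bin spans 500 units; boundary = 0 aligns bin edges at 0, 500, 1000, ...
# (500 and 0 are illustrative choices)
p <- ggplot(df, aes(x = r)) +
  geom_histogram(binwidth = 500, boundary = 0) +
  theme_bw()

print(p)
```

With data this skewed, most small values fall into the first bin, which is why pre-computing the intervals with cut() and plotting them with geom_bar() is the better fit for the intervals asked about.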
