# Histogram with Logarithmic Scale and custom breaks

I'm trying to generate a histogram in R with a logarithmic scale for y. Currently I do:

```
hist(mydata$V3, breaks=c(0,1,2,3,4,5,25))
```

This gives me a histogram, but the density between 0 and 1 is so much greater (a difference of about a million values) that you can barely make out any of the other bars.

Then I've tried doing:

```
mydata_hist <- hist(mydata$V3, breaks=c(0,1,2,3,4,5,25), plot=FALSE)
plot(mydata_hist$counts, log="xy", pch=20, col="blue")
```

It gives me sorta what I want, but the bottom shows me the values 1-6 rather than 0, 1, 2, 3, 4, 5, 25. It's also showing the data as points rather than bars. barplot works but then I don't get any bottom axis.

A histogram is a poor-man's density estimate. Note that in your call to hist() using default arguments, you get frequencies, not probabilities -- add prob=TRUE to the call if you want probabilities.
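To make the frequency versus probability distinction concrete, here is a minimal sketch. The data is simulated as a stand-in for mydata$V3 (an assumption, not the asker's data); the breaks are the ones from the question.

```r
# Simulated stand-in for mydata$V3 (an assumption, not the asker's data)
set.seed(42)
x <- pmin(rexp(1000, rate = 2), 25)

breaks <- c(0, 1, 2, 3, 4, 5, 25)
h <- hist(x, breaks = breaks, plot = FALSE)

h$counts    # number of observations in each bin
h$density   # counts / (n * bin width), so the bar areas sum to 1

# Check that the densities integrate to 1 over the bins
sum(h$density * diff(h$breaks))
```

One thing to watch: per ?hist, freq defaults to TRUE only when the breaks are equidistant, so with unequal breaks like these a plain hist() call plots densities unless you pass freq=TRUE.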

As for the log axis problem, don't use 'x' if you do not want the x-axis transformed:

```
plot(mydata_hist$counts, log="y", type='h', lwd=10, lend=2)
```

gets you bars on a log-y scale -- the look-and-feel is still a little different but can probably be tweaked.

Lastly, you can also do hist(log(x), ...) to get a histogram of the log of your data.
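A rough sketch of that last option, using simulated data (an assumption) and log10 rather than the natural log so the axis is easier to read. Since log10(0) is -Inf, the zeros are dropped explicitly here; whether dropping or offsetting them is appropriate depends on your data.

```r
# Simulated data with a couple of zeros mixed in (not the asker's data)
set.seed(1)
x <- c(0, 0, rlnorm(1000, meanlog = 0, sdlog = 2))

# log10(0) is -Inf, so keep only strictly positive values
x_pos <- x[x > 0]

# Histogram of the log-transformed values; the x-axis is in log10 units
h_log <- hist(log10(x_pos), plot = FALSE)
plot(h_log, main = "Histogram of log10(x)", xlab = "log10(x)")
```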

Another option would be to use the package ggplot2.

```
ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()
```

It's not entirely clear from your question whether you want a logged x-axis or a logged y-axis. A logged y-axis is not a good idea when using bars because they are anchored at zero, which becomes negative infinity when logged. You can work around this problem by using a frequency polygon or density plot.
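For illustration, here is a frequency polygon with a logged y-axis. Note that this sketch uses base graphics on top of hist() rather than ggplot2, and the data is simulated (an assumption). Points replace bars, so nothing has to be anchored at zero; empty bins are skipped because a zero cannot be placed on a log axis.

```r
# Simulated stand-in for mydata$V3 (an assumption)
set.seed(1)
x <- pmin(rexp(5000, rate = 2), 25)

breaks <- c(0, 1, 2, 3, 4, 5, 25)
h <- hist(x, breaks = breaks, plot = FALSE)

# A zero count has no position on a log axis, so keep non-empty bins only
keep <- h$counts > 0

# Plot counts at the bin midpoints, joined by lines, on a log y-axis
plot(h$mids[keep], h$counts[keep], type = "b", log = "y", pch = 20,
     xlab = "V3", ylab = "Frequency (log scale)")
```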

Dirk's answer is a great one. If you want an appearance like what hist produces, you can also try this:

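```
buckets <- c(0,1,2,3,4,5,25)
mydata_hist <- hist(mydata$V3, breaks=buckets, plot=FALSE)
# barplot() needs one name per bar: 7 breaks define only 6 bins,
# so label each bar with its interval instead of passing the raw breaks
bin_labels <- paste(head(buckets, -1), tail(buckets, -1), sep="-")
bp <- barplot(mydata_hist$counts, log="y", col="white", names.arg=bin_labels)
text(bp, mydata_hist$counts, labels=mydata_hist$counts, pos=1)
```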

The last line is optional. It adds value labels just under the top of each bar, which can be useful for log-scale graphs, but it can also be omitted.

I also pass main, xlab, and ylab parameters to provide a plot title, x-axis label, and y-axis label.
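Roughly what that looks like with titles added, sketched on simulated data (an assumption). The title and label strings are placeholders, and bin_labels is a name introduced here for illustration.

```r
# Simulated stand-in for mydata$V3 (an assumption)
set.seed(1)
x <- pmin(rexp(5000, rate = 1), 25)

buckets <- c(0, 1, 2, 3, 4, 5, 25)
h <- hist(x, breaks = buckets, plot = FALSE)

# A log-scaled barplot cannot draw a bar of height zero
stopifnot(all(h$counts > 0))

# One label per bin, e.g. "0-1", "1-2", ..., "5-25"
bin_labels <- paste(head(buckets, -1), tail(buckets, -1), sep = "-")

bp <- barplot(h$counts, log = "y", col = "white", names.arg = bin_labels,
              main = "Histogram of V3",       # plot title (placeholder)
              xlab = "V3",                    # x-axis label (placeholder)
              ylab = "Frequency (log scale)") # y-axis label (placeholder)
```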

Run the hist() function without making a graph, log-transform the counts, and then draw the figure.

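```
hist.data <- hist(my.data, plot=FALSE)
# replace the counts with their base-2 logarithm
# (note: an empty bin has count 0, and log2(0) is -Inf)
hist.data$counts <- log2(hist.data$counts)
plot(hist.data)
```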

It should look just like the regular histogram, but the y-axis will show log2 of the frequency. The axis label still reads "Frequency" unless you pass your own ylab.
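If some bins are empty, their log is -Inf. One possible workaround, sketched here on simulated data (an assumption), is to add 1 to the counts before taking the log. This is a variation on the approach above, not part of it, and it changes what the axis means, so label it accordingly.

```r
# Simulated data (an assumption, not the original my.data)
set.seed(1)
my.data <- rexp(1000)

hist.data <- hist(my.data, plot = FALSE)

# log2(count + 1) keeps empty bins at 0 instead of -Inf
hist.data$counts <- log2(hist.data$counts + 1)

plot(hist.data, ylab = "log2(Frequency + 1)")
```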

I've put together a function that behaves identically to hist in the default case, but accepts the log argument. It uses several tricks from other posters, but adds a few of its own. hist(x) and myhist(x) look identical.

The original problem would be solved with:

```
myhist(mydata$V3, breaks=c(0,1,2,3,4,5,25), log="xy")
```

The function:

```
myhist <- function(x, ..., breaks="Sturges",
                   main = paste("Histogram of", xname),
                   xlab = xname,
                   ylab = "Frequency") {
  # main and xlab are evaluated lazily, so xname can be defined here
  xname <- paste(deparse(substitute(x), 500), collapse="\n")
  h <- hist(x, breaks=breaks, plot=FALSE)
  # set up the plot (extra arguments such as log go through ...)
  # and draw the bar tops as steps
  plot(h$breaks, c(NA,h$counts), type='S', main=main,
       xlab=xlab, ylab=ylab, axes=FALSE, ...)
  axis(1)
  axis(2)
  lines(h$breaks, c(h$counts,NA), type='s')
  # vertical sides of the bars
  lines(h$breaks, c(NA,h$counts), type='h')
  lines(h$breaks, c(h$counts,NA), type='h')
  # baseline at zero
  lines(h$breaks, rep(0,length(h$breaks)), type='S')
  invisible(h)
}
```

Exercise for the reader: Unfortunately, not everything that works with hist works with myhist as it stands. That should be fixable with a bit more effort, though.

Here's a pretty ggplot2 solution:

```
library(ggplot2)
library(scales)  # makes pretty labels on the x-axis

# note: log10(0) is -Inf, so the 0 break has no finite position on a log axis
breaks <- c(0,1,2,3,4,5,25)

ggplot(mydata, aes(x = V3)) +
  geom_histogram(breaks = log10(breaks)) +
  scale_x_log10(
    breaks = breaks,
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  )
```

Note that to set the breaks in geom_histogram, they had to be log10-transformed to work with scale_x_log10, because the binning happens on the transformed scale.
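Why the transformed breaks line up can be checked without ggplot2. The sketch below is a base-R analogy, not ggplot2's internals: binning log10(x) with log10(breaks) gives the same counts as binning x with the raw breaks. The data is simulated, and the lowest break is 0.1 rather than 0 so that every break stays finite after the log (both are assumptions).

```r
# Simulated positive data spanning the break range (an assumption)
set.seed(1)
x <- runif(500, min = 0.1, max = 25)

# Lowest break is 0.1 instead of 0 so log10() stays finite
breaks <- c(0.1, 1, 2, 3, 4, 5, 25)

# Bin on the original scale
h_raw <- hist(x, breaks = breaks, plot = FALSE)

# Bin on the log10 scale with log10-transformed breaks
h_log <- hist(log10(x), breaks = log10(breaks), plot = FALSE)

# Same observations land in the same bins either way
identical(h_raw$counts, h_log$counts)
```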