ggplot2 heatmaps: using different gradients for categories

This Learning R blog post shows how to make a heatmap of basketball stats using ggplot2. The finished heatmap looks like this:

My question (inspired by Jake who commented on the Learning R blog post) is: would it be possible to use different gradient colors for different categories of stats (offensive, defensive, other)?

Answers


First, recreate the graph from the post, updating it for the newer (0.9.2.1) version of ggplot2 which has a different theme system and attaches fewer packages:

nba <- read.csv("http://datasets.flowingdata.com/ppg2008.csv")
nba$Name <- with(nba, reorder(Name, PTS))

library("ggplot2")
library("plyr")
library("reshape2")
library("scales")

nba.m <- melt(nba)
nba.s <- ddply(nba.m, .(variable), transform,
               rescale = scale(value))

ggplot(nba.s, aes(variable, Name)) + 
  geom_tile(aes(fill = rescale), colour = "white") + 
  scale_fill_gradient(low = "white", high = "steelblue") + 
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank(), 
        axis.text.x = element_text(angle = 330, hjust = 0))

Using different gradient colors for different categories is not all that straightforward. The conceptual approach, to map the fill to interaction(rescale, Category) (where Category is Offensive/Defensive/Other; see below) doesn't work because interacting a factor and continuous variable gives a discrete variable which fill can not be mapped to.

The way to get around this is to artificially do this interaction, mapping rescale to non-overlapping ranges for different values of Category and then use scale_fill_gradientn to map each of these regions to different color gradients.

First create the categories. I think these map to those in the comment, but I'm not sure; changing which variable is in which category is easy.

nba.s$Category <- nba.s$variable
levels(nba.s$Category) <- 
  list("Offensive" = c("PTS", "FGM", "FGA", "X3PM", "X3PA", "AST"),
       "Defensive" = c("DRB", "ORB", "STL"),
       "Other" = c("G", "MIN", "FGP", "FTM", "FTA", "FTP", "X3PP", 
                   "TRB", "BLK", "TO", "PF"))

Since rescale is within a few (3 or 4) of 0, the different categories can be offset by a hundred to keep them separate. At the same time, determine where the endpoints of each color gradient should be, in terms of both rescaled values and colors.

nba.s$rescaleoffset <- nba.s$rescale + 100*(as.numeric(nba.s$Category)-1)
scalerange <- range(nba.s$rescale)
gradientends <- scalerange + rep(c(0,100,200), each=2)
colorends <- c("white", "red", "white", "green", "white", "blue")

Now replace the fill variable with rescaleoffset and change the fill scale to use scale_fill_gradientn (remembering to rescale the values):

ggplot(nba.s, aes(variable, Name)) + 
  geom_tile(aes(fill = rescaleoffset), colour = "white") + 
  scale_fill_gradientn(colours = colorends, values = rescale(gradientends)) + 
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank(), 
        axis.text.x = element_text(angle = 330, hjust = 0))

Reordering to get related stats together is another application of the reorder function on the various variables:

nba.s$variable2 <- reorder(nba.s$variable, as.numeric(nba.s$Category))

ggplot(nba.s, aes(variable2, Name)) + 
  geom_tile(aes(fill = rescaleoffset), colour = "white") + 
  scale_fill_gradientn(colours = colorends, values = rescale(gradientends)) + 
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank(), 
        axis.text.x = element_text(angle = 330, hjust = 0))


Here's a simpler suggestion that uses ggplot2 aesthetics to map both gradients as well as color categories. Simply use an alpha-aesthetic to generate the gradient, and the fill-aesthetic for the category.

Here is the code to do so, refactoring Brian Diggs' response:

nba <- read.csv("http://datasets.flowingdata.com/ppg2008.csv")
nba$Name <- with(nba, reorder(Name, PTS))

library("ggplot2")
library("plyr")
library("reshape2")
library("scales")

nba.m <- melt(nba)
nba.s <- ddply(nba.m, .(variable), transform,
           rescale = scale(value))

nba.s$Category <- nba.s$variable
levels(nba.s$Category) <- list("Offensive" = c("PTS", "FGM", "FGA", "X3PM", "X3PA", "AST"),
   "Defensive" = c("DRB", "ORB", "STL"),
   "Other" = c("G", "MIN", "FGP", "FTM", "FTA", "FTP", "X3PP", "TRB", "BLK", "TO", "PF"))

Then, normalise the rescale variable to between 0 and 1:

nba.s$rescale = (nba.s$rescale-min(nba.s$rescale))/(max(nba.s$rescale)-min(nba.s$rescale))

And now, do the plotting:

ggplot(nba.s, aes(variable, Name)) + 
  geom_tile(aes(alpha = rescale, fill=Category), colour = "white") + 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank(), 
        axis.text.x = element_text(angle = 330, hjust = 0)) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())

Note the use of alpha=rescale and then the scaling of the alpha range using scale_alpha(range=c(0,1)), which can be adapted to change the range appropriately for your plot.


Need Your Help

Laravel Production issue - Updating composer with Laravel 4.1.x

php laravel composer-php

I haven't encountered any issue in deploying my laravel project until now. I've been deploying for like almost a year already for this project. But some new bug came up.

Extracting text from a contentEditable div

javascript jquery html css contenteditable

I have a div set to contentEditable and styled with "white-space:pre" so it keeps things like linebreaks. In Safari, FF and IE, the div pretty much looks and works the same. All is well. What I wan...