How to loop through data sets to graph particular columns only?

The graph, I have down. The challenge is I have the _exact_same_code_ for graphing multiple data sets (rather, subsets of one LARGE data set), but I can't seem to get the looping code right to substitute the $ correctly.

Data sets, df1, df2, df3... of the form:

OBSDATE     REGION  AVG_RESP  P10  P90
2012-02-01  APAC    1.276     0.78 3.45
2012-02-01  EMEA    2.341     1.23 5.67
2012-02-02  APAC    1.343     0.89 3.21
2012-02-02  EMEA    2.473     1.37 5.98

The graph is more complex, but like this:

avgMx <- quantile(df1$P90,0.95)
ggplot(df1, aes(x=OBSDATE, y=AVG_RESP)) +
   coord_cartesian(ylim=c(0, avgMx)) +
   geom_ribbon(aes(ymin=P10, ymax=P90), fill="gray60", alpha=0.33) +
   geom_line(aes(x=OBSDATE, y=AVG_RESP), color="#007DB1", size=0.5) +
   facet_wrap(~REGION)

If I define a vector or list of the data set names (both fail with the same error messages), I can't get the loop to compute any descriptive values (like the quantile above, or even a max!).

filenames <- c("df1","df2","df3")

I would like to get something like this to work

for (i in filenames) {
   quantile(i$AVG_RESP,0.95)
   max(i$AVG_RESP)
}

But I get the error "$ operator is invalid for atomic vectors". Searching for that message hasn't turned up anything usable.

So, I can get this to work:

max(df1$AVG_RESP) or max(df1['AVG_RESP'])

Both output 2.473 for the data above. However, this doesn't fly:

for (i in filenames) max(i['AVG_RESP'])

It does nothing. Changing it to this:

for (i in filenames) print(max(i['AVG_RESP']))

Gives instances of NA.

I'm completely stuck. Any help would be tremendously appreciated!

EDIT: I fixed the data that was causing errors - should be reproducible now.

Answers


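i is a character string; you want the object whose name is held in i. That is what the get() function returns. (Untested, since what you gave was not reproducible.) Note that results are not auto-printed inside a for loop, so they need an explicit print().

for (filename in filenames) {
   i <- get(filename)   # look up the data frame by its name
   print(quantile(i$AVG_RESP,0.95))
   print(max(i$AVG_RESP))
}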

This is probably not the best way to solve your problem, though. Putting all the data frames in a list and looping over that list with lapply might be a better approach (what Tyler described in his answer). Furthermore, if these are subsets you've made of a bigger, single data frame you have, then an even better approach would be to use something from the plyr package to define how to split the big data frame up and what to do with each part.
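A minimal sketch of those two alternatives, in case it helps. The small data frames below are made-up stand-ins, not your real data, and base R's split() is swapped in for plyr here so that no extra package is needed. mget() is the multi-object counterpart of get() and returns a named list.

```r
# Made-up stand-in data frames; in practice df1, df2, df3 already exist.
df1 <- data.frame(REGION = c("APAC", "EMEA"), AVG_RESP = c(1.276, 2.341))
df2 <- data.frame(REGION = c("APAC", "EMEA"), AVG_RESP = c(1.343, 2.473))
df3 <- data.frame(REGION = c("APAC", "EMEA"), AVG_RESP = c(1.500, 2.000))

filenames <- c("df1", "df2", "df3")

# mget() looks up several objects by name and returns them as a named list.
frames <- mget(filenames)

# One maximum per data frame, labelled with the data frame's name.
maxima <- sapply(frames, function(d) max(d$AVG_RESP))
print(maxima)

# If the pieces are subsets of one big data frame, split() can create them
# directly from a grouping column instead of keeping separate objects around.
big <- do.call(rbind, frames)
by_region <- split(big, big$REGION)
region_max <- sapply(by_region, function(d) max(d$AVG_RESP))
print(region_max)
```

Because the list is named, the sapply() result carries the data set names, which makes it easy to see which value belongs to which data frame.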


Your code isn't reproducible so this is my best guess at what you want:

df1 <- df2 <- df3 <- read.table(text="OBSDATE     REGION  AVG_RESP  P10  P90
2012-02-01  APAC    1.276     0.78 3.45
2012-02-01  EMEA    2.341     1.23 5.67
2012-02-02  APAC    1.343     0.89 3.21
2012-02-02  EMEA    2.473     1.37 5.98
2012-02-01  APAC    1.276     0.78 3.45
2012-02-01  EMEA    2.341     1.23 5.67
2012-02-02  APAC    1.343     0.89 3.21
2012-02-02  EMEA    2.473     1.37 5.98
2012-02-01  APAC    1.276     0.78 3.45
2012-02-01  EMEA    2.341     1.23 5.67
2012-02-02  APAC    1.343     0.89 3.21
2012-02-02  EMEA    2.473     1.37 5.98", header=TRUE)

info <- function(dataframe){
    c(quantile(dataframe$AVG_RESP,0.95), max(dataframe$AVG_RESP))
}

LIST <- list(df1, df2, df3)
lapply(LIST, info)   
#Or you may want to use sapply if you want it to return a matrix
sapply(LIST, info) 

R can use loops, but applying a function over a list like this is the more idiomatic R way of doing things.
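The same pattern should carry over to the graph itself. The following is only a sketch, assuming ggplot2 is installed; plot_resp is a hypothetical helper name, and the data is a four-row copy of the sample from the question. The line-width argument from the question is left out because its name differs between ggplot2 versions.

```r
library(ggplot2)

# Stand-in for the question's data; df1, df2, df3 are identical copies here.
df1 <- df2 <- df3 <- read.table(text = "OBSDATE REGION AVG_RESP P10 P90
2012-02-01 APAC 1.276 0.78 3.45
2012-02-01 EMEA 2.341 1.23 5.67
2012-02-02 APAC 1.343 0.89 3.21
2012-02-02 EMEA 2.473 1.37 5.98", header = TRUE)

# plot_resp is a hypothetical helper: it builds the question's plot
# for a single data frame.
plot_resp <- function(dataframe) {
  # Treat the dates as dates rather than as text labels.
  dataframe$OBSDATE <- as.Date(dataframe$OBSDATE)
  # Cap the y axis at the 95th percentile of P90, as in the question.
  avgMx <- quantile(dataframe$P90, 0.95, names = FALSE)
  ggplot(dataframe, aes(x = OBSDATE, y = AVG_RESP)) +
    coord_cartesian(ylim = c(0, avgMx)) +
    geom_ribbon(aes(ymin = P10, ymax = P90), fill = "gray60", alpha = 0.33) +
    geom_line(color = "#007DB1") +
    facet_wrap(~REGION)
}

# A named list keeps track of which plot belongs to which data set.
plots <- lapply(list(df1 = df1, df2 = df2, df3 = df3), plot_resp)

# Plots are not drawn automatically inside a loop; print() them explicitly.
for (p in plots) print(p)
```

Keeping the plots in a list also means a single one can be redrawn later, for example with print(plots$df2).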

