Creating data frames from a list of lists in R

I have a function, readnorm, which returns a list of related data from a file identified by an integer:

readnorm <- function(n) {
 a <- read.csv(paste("/tmp/diff-a-", n, ".txt", sep=""), 
               col.names=c("raw"), header=FALSE)
 a <- list(n=n, raw=a$raw, median=median(a$raw), iqr=IQR(a$raw))
 a$shifted <- a$raw - a$median
 a$scaled <- a$raw / a$iqr
 a$normed <- a$shifted / a$iqr
 a$necdf <- ecdf(a$normed)
 a
}

I can build a list containing data from a set of files by using lapply:

> ns = c(5,6,7,8,9,10,15,20,25,30)
> data <- lapply(ns, readnorm)
> ls(data[[1]])
[1] "iqr"     "median"  "n"       "necdf"   "normed"  "raw"     "scaled" 
[8] "shifted"

Now, what I would like to do is construct from that a set of data frames, called normed, scaled, etc, which group the entries from the components in data (the names could be the values of n if integer names are allowed in R, so normed$5 contains data[[5]]$normed, etc).

Does that make sense? This way I can plot all the raw data by using the raw data frame, for example. It's kind-of turning the data structure I have "inside out".

I am new to R so may be doing something very wrong. In higher-level terms, I believe that the data in the different files are from similar distributions, shifted and scaled, and I want to explore that hypothesis. The code above is my attempt to arrange things so that I can do so in a systematic manner.

So my main question is how to generate the data frames, but I am also interested in more general guidance about how to tackle this problem (how to manage the data - I know about tools like qqplot that will help with the analysis itself).


I agree with the comment that you will be happier using lapply rather than sapply. sapply is doing some simplifying that is actually complicatifying things for you.

More generally, if it were me, I'd do less computation in my function that reads the data, and save the processing for later, once the raw data have been placed in a single structure. For instance:

fun <- function(x){
    read.table(paste("~/Desktop/stackoverflowExamples/raw/raw",x,".txt",sep = ""),
                header = TRUE,sep = ",")

#Just read the raw data and place it all in a data frame
dat <- do.call(cbind,lapply(1:2,fun))
#One way to label the columns, if you want to keep track of what came from where
colnames(dat) <- paste("X",1:2,sep = "")

#Now you can shift and scale to your heart's content, much more compactly...
dat_shifted <- scale(dat,center = apply(dat,2,median),scale = FALSE)
dat_normed <- scale(dat,center = apply(dat,2,median),
                        scale = apply(dat,2,IQR))
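
As a quick check of what the shift-and-scale step does, here is a minimal sketch on simulated data. The matrix dat below is a made-up stand-in for the files (an assumption, not the asker's data); after centering on the median and dividing by the IQR, every column should have median 0 and IQR 1.

```r
set.seed(1)
# Simulated stand-in for the files: two columns from shifted, scaled normals
dat <- cbind(X1 = rnorm(100, mean = 5, sd = 2),
             X2 = rnorm(100, mean = -3, sd = 0.5))

# Centre each column on its median and divide by its IQR
dat_normed <- scale(dat,
                    center = apply(dat, 2, median),
                    scale  = apply(dat, 2, IQR))

# Per-column summaries of the normalised data
col_medians <- apply(dat_normed, 2, median)
col_iqrs    <- apply(dat_normed, 2, IQR)
print(col_medians)
print(col_iqrs)
```

If the columns really do come from one distribution up to shift and scale, the normalised columns should then look alike, which is what a qqplot of one column against another would test.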

I'm not sure what you plan on doing with the output of ecdf, so I'll just note that ecdf() returns a function (just in case you didn't realize that).
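
To make that concrete, here is a small sketch on made-up numbers (the vector x is illustrative only). The object returned by ecdf() is called like any other function, so storing it in a list works, but it cannot become a column of a data frame.

```r
# Illustrative data; not taken from the question's files
x <- c(1, 2, 2, 3, 5)

# ecdf() returns a step function, not a vector of values
Fn <- ecdf(x)

is.function(Fn)   # TRUE: the result is a function
Fn(2)             # proportion of observations <= 2
Fn(c(0, 10))      # evaluates at several points at once
```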

Finally, see ?make.names for a description of what's allowed for names.
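
A short sketch of how this plays out for integer-like names such as "5" (the list normed and its values are invented for illustration). Such names are not syntactically valid, so they need backticks or [[ ]] for access, and data.frame() rewrites them unless told not to.

```r
# make.names() shows how R would rewrite names that are not syntactically valid
fixed <- make.names(c("5", "10"))
print(fixed)

# A list can still carry such names; access them with backticks or [[ ]]
normed <- list("5" = c(0.1, 0.2), "10" = c(0.3, 0.4))
normed$`5`
normed[["10"]]

# data.frame() applies make.names() unless check.names = FALSE
checked   <- names(data.frame(normed))
unchecked <- names(data.frame(normed, check.names = FALSE))
print(checked)
print(unchecked)
```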

A proof of concept using lapply

A dummy version of readnorm

readnorm <- function(n){
 a <- data.frame(raw = 1:10)
 a <- list(n=n, raw = a$raw, normed = runif(10))
 a
}

use lapply

.list <- lapply(1:5, readnorm)

set the names (manually)

names(.list) <- 1:5

A function to get the data from this list

get_data <- function(.list, .which){
 .x <- data.frame(do.call(cbind, lapply(.list, '[[', .which)))
 names(.x) <- names(.list)
 .x
}

get all the data stored under the name raw

raw <- get_data(.list, 'raw')

or the same for normed

normed <- get_data(.list, 'normed')
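
The same idea can be applied to several components in one pass, which is the "inside out" step the question asks about. The following is a sketch under assumptions: the dummy data list, the fields vector and the frames result are hypothetical names, and every file is assumed to contribute a vector of the same length (if the lengths differ, data.frame() will fail and a plain list per component is the safer structure).

```r
# Dummy list of per-file results, shaped like the output of lapply(ns, readnorm)
ns <- c(5, 6, 7)
data <- lapply(ns, function(n) list(n = n, raw = 1:10, normed = (1:10 - 5.5) / n))
names(data) <- ns   # names become "5", "6", "7"

# Build one data frame per component, with one column per file
fields <- c("raw", "normed")
frames <- lapply(fields, function(f) {
  data.frame(lapply(data, `[[`, f), check.names = FALSE)
})
names(frames) <- fields

names(frames$raw)     # column names are the values of n
frames$normed$`5`     # normed values from the file with n = 5
```

Function-valued components such as necdf are left out of fields here, since they do not fit in a data frame column.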

