Reading, recoding, subsetting, and reshaping sequentially labeled data.frames in R

I'm having a tough time adapting a script that I've previously used to read in and recode sequentially labeled data.tables.

I have a series of data.tables in R that are sequentially labeled: df1, df2, df3, etc. I then apply specific (and inconsistent) rules to create new variables, called status and csat, in those data.tables.

What I would like to do is:

  1. Read in the data tables
  2. Recode the csat variable into a new variable
  3. Subset the data.table so it only includes 4 variables (csat, csat_d, id, and status)
  4. Merge the data.table with previous tables using an outer join (so it can be reshaped into long form)

I am trying to address points 1-3 in the script below, and have no idea how to implement #4.

EDITED:

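library(car)  # recode() with this spec syntax is assumed to come from the car package

df_names <- list(df, df2, df3)  # List of data.tables; c() would flatten them into a single list of columns
csat_vars <- c("CustomerId", "csat", "csat_d", "status")  # The 4 variables to keep

out <- lapply(seq_along(df_names), function(idx) {
  d <- df_names[[idx]]  # [[ ]] extracts the table itself; [ ] would return a one-element list
  d$csat_d <- recode(d$csat, "1:5=0;6:7=1;NA=NA;")  # 1-5 become 0, 6-7 become 1, NA stays NA
  subset(d, select = csat_vars)  # Keep only the 4 variables and return the result
})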

I am agnostic about whether or not it's better to use data.table or data.frame (these are small datasets), so any help is welcome.

Mini-datasets here:

> dput(head(df))
structure(list(respid = c(1499L, 433L, 2600L, 2282L, 1503L, 3304L
), csat = c(4L, 6L, NA, NA, 6L, 4L), status = c("Active", "Active", 
"Active", "Active", "Active", "Active"), touch = c(2L, 3L, 2L, 
3L, 2L, 2L)), .Names = c("CustomerId", "csat", "status", "touch"), class = c("data.table", 
"data.frame"), row.names = c(NA, -6L), .internal.selfref = <pointer: 0x7f800301b778>)

> dput(head(df2_r))
structure(list(respid = c(6L, 5L, 149L, 147L, 270L, 145L), csat = c(4L, 
NA, 6L, 7L, 7L, 4L), status = c("Active", "Lapsed/Churned", "Active", 
"Active", "Active", "Active"), touch = c(3L, NA, 3L, 1L, 3L, 
1L)), .Names = c("CustomerId", "csat", "status", "touch"), class = c("data.table", 
"data.frame"), row.names = c(NA, -6L), .internal.selfref = <pointer: 0x7f800301b778>)

> dput(head(df3))
structure(list(respid = c(1713L, 1611L, 1630L, 1773L, 1391L, 
1571L), csat = c(4L, 6L, 4L, 5L, 7L, 4L), status = c("Active", 
"Active", "Active", "Active", "Active", "Active"), AGENCY_1 = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
)), .Names = c("CustomerId", "csat", "status", "AGENCY_1"), class = c("data.table", 
"data.frame"), row.names = c(NA, -6L), .internal.selfref = <pointer: 0x7f800301b778>)

Answers


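At a guess, I'd say you want something like this, where ll is a list of your data.tables (e.g. ll <- list(df, df2, df3)). Note that := adds csat_d to the original tables by reference.

out <- lapply( ll , function(x) x[ , csat_d := recode( csat , "1:5=0;6:7=1;NA=NA;" ) ][ , csat_vars , with = FALSE ] )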
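
If the car package is not available, the same recode can be sketched with data.table alone. This is only an illustration under assumptions: the two small tables (df_a, df_b) are made up, and fifelse() from recent data.table versions stands in for recode(). Values of 6 and above become 1, lower values become 0, and NA stays NA.

```r
library(data.table)

# Hypothetical stand-ins for the asker's tables (not the real data)
df_a <- data.table(CustomerId = c(1L, 2L, 3L), csat = c(4L, 6L, NA),
                   status = "Active", touch = 1:3)
df_b <- data.table(CustomerId = c(4L, 5L), csat = c(7L, 2L),
                   status = c("Active", "Lapsed/Churned"), touch = c(3L, NA))

ll <- list(df_a, df_b)
csat_vars <- c("CustomerId", "csat", "csat_d", "status")

out <- lapply(ll, function(x) {
  x <- copy(x)                                # work on a copy so the originals are untouched
  x[, csat_d := fifelse(csat >= 6L, 1L, 0L)]  # an NA in csat yields NA in csat_d
  x[, csat_vars, with = FALSE]                # keep only the four columns
})
```

The copy() call is a design choice: without it, := would add csat_d to df_a and df_b themselves.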

Here is a toy worked example:

library(data.table)
df1 <- data.table( a = 1 , b = 2 , c = 3 )
df2 <- data.table( a = 1 , b = 2 , c = 3 )
ll <- list(df1,df2) 
vars <- c( "a" , "c" )
#  Recode column 'c' to 10, and then subset data.table to only columns 'a' and 'c'
lapply( ll , function(x)  x[ , c := 10 ][ , vars , with = FALSE  ] )
#[[1]]
#   a  c
#1: 1 10

#[[2]]
#   a  c
#1: 1 10
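
Step 4 of the question (combining the tables) is not covered above. One possible sketch follows, assuming each recoded table shares the CustomerId key and that the tables w1 and w2 below, which are invented for illustration, look like the output of the lapply() call. If the goal is long form, stacking with rbindlist() gets there directly; the full outer join is shown as well, since the question asks for it.

```r
library(data.table)

# Hypothetical recoded and subsetted tables, one per survey wave
w1 <- data.table(CustomerId = c(1L, 2L), csat = c(4L, 6L),
                 csat_d = c(0L, 1L), status = "Active")
w2 <- data.table(CustomerId = c(2L, 3L), csat = c(7L, 3L),
                 csat_d = c(1L, 0L), status = "Active")
out <- list(w1, w2)

# Long form: stack the tables and record the source table in a 'wave' column
long <- rbindlist(out, idcol = "wave")

# Wide form: suffix the non-key columns per wave, then full outer join on CustomerId
wide_parts <- lapply(seq_along(out), function(i) {
  x <- copy(out[[i]])
  value_cols <- setdiff(names(x), "CustomerId")
  setnames(x, value_cols, paste0(value_cols, "_", i))
  x
})
wide <- Reduce(function(a, b) merge(a, b, by = "CustomerId", all = TRUE), wide_parts)
```

Customers missing from a wave get NA in that wave's columns in wide, while long simply has no row for them.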
