Computing contingency tables in R language

I have a data set in R as follows:

ID  Variable1  Variable2 Choice
1   1          2         1
1   2          1         0
2   2          1         1
2   2          1         1

I need to produce the following output table from it:

Id Variable1-1 Variable1-2 Variable2-1 Variable2-2
1  1           0           0           1
2  0           2           2           0

Note that only rows where Choice is 1 are counted (Choice is a binary variable; the other variables can take any integer value). The aim is to have one column per level of each variable.

Is there a way I can do this in R?

Answers


You could use melt and dcast from the reshape2 package:

mydf<-read.table(text="ID  Variable1  Variable2 Choice
1   1          2         1
1   2          1         0
2   2          1         1
2   2          1         1",header=TRUE)

library(reshape2)

First melt the data.frame, keeping only the rows where Choice == 1 and dropping the Choice column:

mydfM <- melt(mydf[mydf$Choice %in% 1, -match("Choice", names(mydf))], id = "ID")

# EDIT above: As @TylerRinker points out, using which could be avoided.
# I've replaced it with %in%

#   ID  variable value
# 1  1 Variable1     1
# 2  2 Variable1     2
# 3  2 Variable1     2
# 4  1 Variable2     2
# 5  2 Variable2     1
# 6  2 Variable2     1
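On the %in% versus == point, here is a small sketch of why %in% is a convenient filter. With ==, an NA in Choice produces an NA in the logical index, and the subset picks up an all-NA row; %in% returns FALSE for NA instead. The data frame x below is made up for illustration and is not part of the question's data.

```r
# Hypothetical data with a missing Choice value (not from the question)
x <- data.frame(ID = 1:3, Choice = c(1, NA, 0))

# == propagates the NA, so the result contains an extra all-NA row
with_eq <- x[x$Choice == 1, ]

# %in% treats NA as "no match", so only the real match is kept
with_in <- x[x$Choice %in% 1, ]

nrow(with_eq)
nrow(with_in)
```

If Choice never contains NA, as in the sample data, both filters select the same rows.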

Then cast the melted data.frame, using length as the aggregation function so that each cell counts the matching rows:

(mydfC <- dcast(mydfM, ID ~ variable + value, fun.aggregate = length))

#   ID Variable1_1 Variable1_2 Variable2_1 Variable2_2
# 1  1           1           0           0           1
# 2  2           0           2           2           0
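The question asked for names like Variable1-1, while dcast joins the two parts with an underscore. If the hyphenated form matters, one option is to rename the columns afterwards. This is only a sketch: the data frame below is typed in by hand from the output shown above so that it runs without reshape2, and the regular expression assumes the level is whatever follows the last underscore.

```r
# Hand-typed copy of the dcast() result shown above
mydfC <- data.frame(ID = 1:2,
                    Variable1_1 = c(1L, 0L), Variable1_2 = c(0L, 2L),
                    Variable2_1 = c(0L, 2L), Variable2_2 = c(1L, 0L))

# Swap the last underscore in each name for a hyphen; "ID" has none and is unchanged
names(mydfC) <- sub("_([^_]*)$", "-\\1", names(mydfC))

names(mydfC)

# Hyphenated names are non-syntactic, so they need backticks with $
mydfC$`Variable1-2`
```

Keeping the underscores is usually less hassle, since those names can be used without backticks.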

It took me a while to figure out what you were after but I got it (I think). I have done what you asked but it's convoluted at best. I think this will help others see what you're after and you'll get better answers now.

dat <- read.table(text="ID  Variable1  Variable2 Choice
1   1          2         1
1   2          1         0
2   2          1         1
2   2          1         1", header=TRUE)


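# Split Choice by every (level, ID) combination of each variable
A <- split(dat$Choice, list(dat$Variable1, dat$ID))
B <- split(dat$Choice, list(dat$Variable2, dat$ID))
C <- list(A, B)

# Sum Choice within each group; Choice is 0/1, so this counts the chosen rows
# (an empty group sums to 0)
FUN <- function(x) sapply(x, sum)

# Put the first half of the sums (first ID) on top of the second half (second ID).
# Note: this assumes there are exactly two IDs.
FUN2 <- function(x){
    len <- length(x)/2
    rbind(x[1:len], x[(len+1):length(x)])
}

dat2 <- do.call('data.frame', lapply(lapply(C, FUN), FUN2))
colnames(dat2) <- c('Variable1-1', 'Variable1-2', 'Variable2-1',
    'Variable2-2')
dat2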

This ain't your grandmother's contingency table, that's for sure. There's probably a much better way to accomplish all of this, maybe with reshape.
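For what it's worth, here is one possible base R sketch of a less convoluted route. It filters to Choice == 1 first and then builds one table() per variable, so it is not tied to two IDs. The names sel, vars, counts, wide and result are just illustrative, and an ID with no Choice == 1 rows would drop out of the result entirely.

```r
dat <- read.table(text="ID  Variable1  Variable2 Choice
1   1          2         1
1   2          1         0
2   2          1         1
2   2          1         1", header=TRUE)

# Keep only the rows that were chosen
sel <- dat[dat$Choice %in% 1, ]
vars <- c("Variable1", "Variable2")

# One ID-by-level count table per variable, with columns named "<variable>_<level>"
counts <- lapply(vars, function(v) {
    tab <- table(sel$ID, sel[[v]])
    colnames(tab) <- paste(v, colnames(tab), sep = "_")
    unclass(tab)  # plain integer matrix, so cbind() below binds columns
})

# Bind the per-variable matrices side by side and add the ID column
wide <- do.call(cbind, counts)
result <- data.frame(ID = as.integer(rownames(wide)), wide, row.names = NULL)
result
```

Every table uses the same sel$ID, so the rows line up across variables before they are bound together.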

