# R: duplicates elimination in a matrix, keeping track of multiplicities

I have a basic problem with R. I have produced the matrix

    M
         [,1] [,2]
    [1,] "a"  "1"
    [2,] "b"  "2"
    [3,] "a"  "3"
    [4,] "c"  "1"

I would like to obtain the 3x3 matrix N

         [,1] [,2] [,3]
    [1,] "a"  "1"  "3"
    [2,] "b"  "2"  NA
    [3,] "c"  "1"  NA

obtained by eliminating duplicates in M[,1] and writing in N[i,2], N[i,3] the values in M[,2] corresponding to the same element in M[,1], for all i's. The "NA"'s in N[,3] correspond to the singletons in M[,1].

I know how to eliminate duplicates from a vector in R; my problem is keeping track of the elements in M[,2] and writing them into the resulting matrix N. I tried for loops, but they do not work well in my real-world case, where the matrices are much bigger.

Any suggestions?

I thank you very much.

## Answers

You can use dcast from the reshape2 package after converting your matrix to a data.frame. To reverse the process you can use melt.

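    # Same data as the matrix M, stored as a data.frame
    df = data.frame(c("a","b","a","c"), c(1:3,1))
    colnames(df) = c("factor","obs")
    library(reshape2)
    # One row per factor, one column per distinct obs value;
    # value.var is named explicitly so dcast does not have to guess it
    df2 = dcast(df, factor ~ obs, value.var = "obs")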

now df2 is:

      factor  1  2  3
    1      a  1 NA  3
    2      b NA  2 NA
    3      c  1 NA NA

To me it makes more sense to keep it like this. But if you need it in your format:

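    # For each row, move the non-NA entries to the front and pad with NA
    res = t(apply(df2, 1, function(x) {
      newLine = as.vector(x[which(!is.na(x))], mode = "any")
      c(newLine, rep(NA, ncol(df2) - length(newLine)))
    }))
    # In this example the last column holds only NA padding, so drop it
    res = res[, -ncol(res)]

res is then:

         [,1] [,2] [,3]
    [1,] "a"  " 1" " 3"
    [2,] "b"  " 2" NA
    [3,] "c"  " 1" NA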
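If you would rather not depend on reshape2, the same result can probably be obtained in base R by grouping M[,2] by M[,1] with split and padding each group with NA. This is only a sketch on the example matrix from the question; the names groups, width and padded are mine, and I have not timed it on large matrices.

```r
# Example matrix from the question (matrix() fills column by column)
M <- matrix(c("a", "b", "a", "c",
              "1", "2", "3", "1"), ncol = 2)

# Group the values in column 2 by the key in column 1.
# The explicit factor levels keep the keys in order of first appearance.
keys   <- factor(M[, 1], levels = unique(M[, 1]))
groups <- split(M[, 2], keys)

# Pad every group with NA up to the length of the longest group
width  <- max(lengths(groups))
padded <- do.call(rbind, lapply(groups, function(v) {
  c(v, rep(NA, width - length(v)))
}))

# Put the keys back as the first column and drop the dimnames
N <- unname(cbind(names(groups), padded))
N
```

Unlike the dcast route, the values stay plain character strings here, so there are no leading spaces such as " 1" in the result.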