# Turning a frequency matrix into a binary matrix in R, dependent on binomial tests

I have a matrix such as this example where a1, a2, a3, a4 and a5 refer to individuals competing against each other. Rows of the matrix represent 'wins' against the same individuals in the columns.

So in the example below, individual a2 beat a4 12 times, whereas a4 beat a2 13 times, meaning that they had a total of 25 contests.

In this example, the diagonals are all 0, but they could easily be NA because it is impossible for each individual to compete with themselves.

The code below creates the data frame/matrix:

    a1<-c(0,13,3,33,0)
    a2<-c(1,0,22,13,1)
    a3<-c(1,0,0,2,2)
    a4<-c(1,12,22,0,12)
    a5<-c(3,1,0,0,0)
    df<-as.data.frame(cbind(a1,a2,a3,a4,a5))
    rownames(df)<-c("a1","a2","a3","a4","a5")
    df
    m<-as.matrix(df)
    m

The matrix looks like this:

       a1 a2 a3 a4 a5
    a1  0  1  1  1  3
    a2 13  0  0 12  1
    a3  3 22  0 22  0
    a4 33 13  2  0  0
    a5  0  1  2 12  0

What I want to do is turn this frequency matrix into a binary matrix. I want to enter a 1 in an individual's row if that individual has significantly more wins than expected by chance against the individual in a particular column, according to a binomial test against p = 0.5.

Therefore for pair a2 versus a4, you would run the binom.test like this

    binom.test(c(12,25), 0.5))

which says that this is not significant. Therefore in the cell for row a2, column a4 we would enter a 0. We also enter a 0 in the row a4, column a2.

However, a4 beats a1 33 times out of 34, whereas a1 beats a4 1 time out of 34. Running the binomial test for this:

    binom.test(c(33,34), 0.5))

This is obviously significant, and therefore row a4 column a1 should get a '1', but row a1 column a4 gets a '0'.

The resulting matrix should look like this:

       a1 a2 a3 a4 a5
    a1  0  0  0  0  0
    a2  1  0  0  0  0
    a3  0  1  0  1  0
    a4  1  0  0  0  0
    a5  0  0  0  1  0

I've been trying a number of approaches to this, but all have failed thus far.

Any ideas appreciated and welcomed.

## Answers

I admit, I was about to lambaste you for doing it "all wrong," then I re-read the page and how you were doing it and re-learned binom.test. You have one problem in your question in that you're missing a comma, but I'm guessing that this is just a problem typing it into SO.

SIDE POINT: please copy/paste working code. It takes much more time to infer what you meant when the code as depicted won't even run, much less give the desired output.

However, you are still calling it wrong. From ?binom.test, if you define x as a vector of two values, then it must hold the "number of successes and failures", not (as it appears you have done) the "number of successes and trials." Either do:

    binom.test(12, 12+13, 0.5)

or

    binom.test(c(12, 13), 0.5)
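As a quick check of that reading of ?binom.test, here is a small sketch using only base R's binom.test (the variable names are mine, purely for illustration). The two correct forms agree, while c(12, 25) is taken as 12 successes and 25 failures:

```r
# Both correct forms describe 12 wins in 25 contests
p_scalar <- binom.test(12, 12 + 13, 0.5)$p.value
p_vector <- binom.test(c(12, 13), 0.5)$p.value
all.equal(p_scalar, p_vector)   # TRUE

# c(12, 25) is interpreted as 12 successes and 25 failures
wrong <- binom.test(c(12, 25), 0.5)
unname(wrong$parameter)         # number of trials used: 37, not 25
```

So the call in the question tests 12 wins out of 37 contests rather than 12 out of 25.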

Second, there's nothing here to show me how you've attempted to
automate this. You say that "*row a4 column a1 should get a '1', but
row a1 column a4 gets a '0'*", but I have no clue what code you used to
get there. If you want help with the code you've tried, please include it,
even if it isn't elegant. The best way to learn efficient and elegant
coding practices is to take what you've generated and tweak it in
places.

On to some code. Try this:

    # define the function
    func <- function(mtx, p=0.5, alpha=0.05) {
      # preallocate the matrix in memory
      m2 <- mtx
      for (rr in 2:nrow(mtx)) {
        for (cc in 1:(rr-1)) {
          # these two `for` loops work on the non-diag lower triangle
          x <- mtx[rr,cc]
          y <- mtx[cc,rr]
          sig <- (binom.test(x, x+y, p)$p.value <= alpha)
          # lower-triangle entry
          m2[rr,cc] <- 1*((x>y) & sig)
          # opposing element in the upper-triangle
          m2[cc,rr] <- 1*((y>x) & sig)
        }
      }
      m2
    }

    # requisite variables
    a1 <- c(0,13,3,33,0)
    a2 <- c(1,0,22,13,1)
    a3 <- c(1,0,0,2,2)
    a4 <- c(1,12,22,0,12)
    a5 <- c(3,1,0,0,0)

    # merge them sequentially into a matrix
    m <- matrix(c(a1, a2, a3, a4, a5), byrow=FALSE, nrow=5,
                dimnames=list(paste0('a', 1:5), paste0('a', 1:5)))

    func(m)
    #    a1 a2 a3 a4 a5
    # a1  0  0  0  0  0
    # a2  1  0  0  0  0
    # a3  0  1  0  1  0
    # a4  1  0  0  0  0
    # a5  0  0  0  1  0
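One case the example data does not exercise: binom.test stops with an error when the number of trials is 0, so a pair that never competed (or an NA count) would break the loop. The sketch below is a hypothetical variant, not part of the original answer; the name func_safe and the toy matrix are made up for illustration, and it assumes a 0 is the right entry for pairs without contests.

```r
# Hypothetical variant of func(): leaves a 0 for pairs that never met
func_safe <- function(mtx, p = 0.5, alpha = 0.05) {
  m2 <- mtx
  m2[] <- 0   # all-zero result with the same dimnames (diagonal becomes 0 too)
  for (rr in 2:nrow(mtx)) {
    for (cc in 1:(rr - 1)) {
      x <- mtx[rr, cc]
      y <- mtx[cc, rr]
      # skip missing counts and pairs with no contests;
      # binom.test() stops with an error when the number of trials is 0
      if (is.na(x) || is.na(y) || x + y == 0) next
      sig <- binom.test(x, x + y, p)$p.value <= alpha
      m2[rr, cc] <- as.integer(x > y && sig)
      m2[cc, rr] <- as.integer(y > x && sig)
    }
  }
  m2
}

# toy input (made up): b1 and b3 never met, b1 beat b2 15 times to 2
toy <- matrix(c(0, 15, 0,
                2,  0, 4,
                0,  5, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("b", 1:3), paste0("b", 1:3)))
func_safe(toy)
```

With this toy input only the b1 row, b2 column entry should come out as 1. Whether a skipped pair should be recorded as 0 or NA depends on what you do with the matrix afterwards.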

Some notes:

- Looping through the lower triangle is slightly more efficient, though it's not *wrong* to do 1:nrow(m) on both rr and cc. You could check for rr == cc in the code (if binom.test were computationally expensive, for instance), but in this example it won't cost you much at all. However, if/when you use tests that take longer to calculate, you will want to save a second or two here and there in your code.

- The 1*(...) coerces a boolean into a 0 or 1. I could also have done as.integer(...) with the same effect.

- The (x>y) ensures that a "significant" binom.test result is only credited to winners, since binom.test(0, 100, 0.5) is still very significant (albeit for a loser).
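The last two notes can be checked directly in a few lines of base R (a small sketch; the names p_loser and p_winner are just for illustration):

```r
# a logical times 1 is numeric; as.integer() gives the same values as integers
1 * c(TRUE, FALSE)            # 1 0
as.integer(c(TRUE, FALSE))    # 1 0

# in a two-sided test a heavy loser is as "significant" as a heavy winner
p_loser  <- binom.test(0, 100, 0.5)$p.value
p_winner <- binom.test(100, 100, 0.5)$p.value
all.equal(p_loser, p_winner)  # TRUE
p_loser <= 0.05               # TRUE, hence the extra (x > y) check
```

The only difference between 1*(...) and as.integer(...) is the storage type (double versus integer), which does not matter for a 0/1 matrix like this one.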

Hope this helps.

*Edit*: removed the double-test of binom.test because (as @rawr correctly pointed out) it was redundant; also fixed the function, which was incorrectly accessing the m variable directly instead of its internal mtx.