# Product between three data frames based on condition in one

I have three data frames. The examples provided here are simplified and are very different from the original data I am working with.

I have defined three dataframes as follows:

    mata <- data.frame(matrix(data = c(1.5, 2.1, 3.3, 4.5, 5.1, 6.5), nrow = 3, ncol = 2, byrow = TRUE))
    matb <- data.frame(matrix(data = c(4, 5, 6, 7, 8, 9), nrow = 3, ncol = 2, byrow = TRUE))
    matc <- data.frame(matrix(data = c(8, 6, 9, 7, 4, 3), nrow = 3, ncol = 2, byrow = TRUE))

The data look like this:

    > mata
       X1  X2
    1 1.5 2.1
    2 3.3 4.5
    3 5.1 6.5
    > matb
      X1 X2
    1  4  5
    2  6  7
    3  8  9
    > matc
      X1 X2
    1  8  6
    2  9  7
    3  4  3

Now I want to calculate the element-wise product of mata, matb, and matc, subject to a condition on the values in mata.

I first want to check whether the values in mata fall between 0 and 30. Then I want to calculate new matrices Q(0), Q(1), ..., Q(30), where Q = mata * matb * matc.

For each row I want to find Q(0) to Q(30). Q(0) uses only the values of mata greater than 0, Q(1) only those greater than 1, and so on.

**My approach:**
I created a logical matrix to check whether the values in mata fall in the specified range.

For example, I want to find the values greater than 2 and then compute the product.

    > check1 <- sapply(mata, function(x) x > 2)
    > check1
            X1   X2
    [1,] FALSE TRUE
    [2,]  TRUE TRUE
    [3,]  TRUE TRUE

The matrix check1 marks exactly the positions I am interested in. Now I want to find the product by row for the values greater than 2 in mata. I may eventually need rowSums to get a single value per row, but I am not sure how to apply it here.

I used the following code:

    > mata[check1] * matb[check1] * matc[check1]
    [1] 178.2 163.2  63.0 220.5 175.5

This returns a flat vector rather than one value per row. What I want instead: where check1 is FALSE, the product should be reported as zero; everywhere else it should be calculated from the corresponding values, and the results summed by row.

**The expected output is as follows when values are greater than 2:**

63 398.7 338.7

What is an efficient way to do this for all thresholds from 0 to 30 at once? I think a for loop could work, but I am not sure how to write it. Thanks.

## Answers

Why not simply:

    matA <- mata  # Copy mata (so mata itself is not changed, just the copy)
    check1 <- sapply(mata, function(x) x > 2)
    matA[!check1] <- 0  # Replace values that fail the criterion with 0
    rowSums(matA * matb * matc)  # Compute
    [1]  63.0 398.7 338.7
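A possible shortcut, sketched here under the assumption that all three inputs are numeric and share the same dimensions: multiplying by the logical mask itself zeroes out the cells that fail the test, so no modified copy of `mata` is needed. The names `A`, `B`, `C`, and `q2` are illustrative only and do not appear in the question.

```r
# Rebuild the example data as plain numeric matrices (same values as in the question)
A <- matrix(c(1.5, 2.1, 3.3, 4.5, 5.1, 6.5), nrow = 3, byrow = TRUE)
B <- matrix(c(4, 5, 6, 7, 8, 9), nrow = 3, byrow = TRUE)
C <- matrix(c(8, 6, 9, 7, 4, 3), nrow = 3, byrow = TRUE)

# (A > 2) is a logical matrix; in arithmetic TRUE counts as 1 and FALSE as 0,
# so cells that fail the condition contribute 0 to the row sum
q2 <- rowSums((A > 2) * A * B * C)
print(q2)  # values 63.0, 398.7, 338.7, as in the expected output
```

For data frames, wrapping each input in `as.matrix()` first keeps the arithmetic on plain matrices.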

If you want to try multiple thresholds, you can wrap it into a function and apply it to your data:

    f <- function(mata, matb, matc, threshold) {
      matA <- mata
      check1 <- sapply(mata, function(x) x > threshold)
      matA[!check1] <- 0
      rowSums(matA * matb * matc)
    }
    sapply(0:30, function(x) f(mata, matb, matc, x))
          [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30] [,31]
    [1,] 111.0 111.0  63.0   0.0   0.0   0.0   0.0    0    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    [2,] 398.7 398.7 398.7 398.7 220.5   0.0   0.0    0    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    [3,] 338.7 338.7 338.7 338.7 338.7 338.7 175.5    0    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
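Since the question mentions a for loop, the same computation might also be written as an explicit loop. This is only a sketch: the names `thresholds`, `prod_all`, `keep`, and `Q`, and the `Q(k)` column labels, are illustrative choices rather than anything from the original post. It relies on the fact that the element-wise product does not depend on the threshold, so it is computed once outside the loop.

```r
mata <- data.frame(matrix(c(1.5, 2.1, 3.3, 4.5, 5.1, 6.5), nrow = 3, ncol = 2, byrow = TRUE))
matb <- data.frame(matrix(c(4, 5, 6, 7, 8, 9), nrow = 3, ncol = 2, byrow = TRUE))
matc <- data.frame(matrix(c(8, 6, 9, 7, 4, 3), nrow = 3, ncol = 2, byrow = TRUE))

thresholds <- 0:30
A <- as.matrix(mata)
prod_all <- as.matrix(mata * matb * matc)  # threshold-independent product

# One row per row of mata, one labelled column per threshold
Q <- matrix(0, nrow = nrow(A), ncol = length(thresholds),
            dimnames = list(NULL, paste0("Q(", thresholds, ")")))

for (j in seq_along(thresholds)) {
  keep <- A > thresholds[j]           # logical mask for this threshold
  Q[, j] <- rowSums(prod_all * keep)  # masked-out cells contribute 0
}

Q[, "Q(2)"]  # the row sums for values greater than 2
```

Labelling the columns makes it possible to pull out a single threshold by name instead of remembering that column 1 corresponds to threshold 0.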

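Another option is to bind the three data frames together and work row by row:

    # Columns 1-2 come from mata, 3-4 from matb, 5-6 from matc
    df <- data.frame(cbind(mata, matb, matc))
    df2 <- apply(df, 1, function(x) {
      # Product for the first column, or 0 if the mata value is not above 2
      a <- ifelse(x[1] > 2, (x[1] * x[3] * x[5]), 0)
      # Same for the second column
      b <- ifelse(x[2] > 2, (x[2] * x[4] * x[6]), 0)
      return(a + b)
    })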
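The hard-coded column positions above could be avoided by calling `ifelse()` once on whole matrices, since it is vectorized over every cell. A minimal sketch, assuming the three data frames have identical dimensions; `A`, `B`, `C`, and `res` are illustrative names.

```r
mata <- data.frame(matrix(c(1.5, 2.1, 3.3, 4.5, 5.1, 6.5), nrow = 3, ncol = 2, byrow = TRUE))
matb <- data.frame(matrix(c(4, 5, 6, 7, 8, 9), nrow = 3, ncol = 2, byrow = TRUE))
matc <- data.frame(matrix(c(8, 6, 9, 7, 4, 3), nrow = 3, ncol = 2, byrow = TRUE))

A <- as.matrix(mata)
B <- as.matrix(matb)
C <- as.matrix(matc)

# ifelse() tests every cell of A at once: keep the product where the
# condition holds, use 0 elsewhere; the result has the same shape as A
res <- rowSums(ifelse(A > 2, A * B * C, 0))
print(res)  # values 63.0, 398.7, 338.7
```

One behavioural detail worth knowing: where the test is `FALSE`, `ifelse()` returns 0 even if the product in that cell is `NA`, whereas multiplying an `NA` by 0 still gives `NA`.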

Edit: using something resembling the real data:

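    # 108 columns: 1-36 play the role of mata, 37-72 of matb, 73-108 of matc
    df <- data.frame(matrix(data = runif(810000, 0, 5), nrow = 7500, ncol = 108, byrow = TRUE))
    df2 <- apply(df, 1, function(x) {
      # Odd-numbered columns 1, 3, ..., 35
      a <- sapply(seq(1, 35, by = 2), function(y) {
        ifelse(x[y] > 2, (x[y] * x[y + 36] * x[y + 72]), 0)
      })
      # Even-numbered columns 2, 4, ..., 36
      b <- sapply(seq(2, 36, by = 2), function(y) {
        ifelse(x[y] > 2, (x[y] * x[y + 36] * x[y + 72]), 0)
      })
      # Adds each odd column's result to the next even column's: 18 values per row
      return(a + b)
    })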
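For a wide table like this, slicing the columns into three blocks and masking once may be simpler than nested `sapply()` calls. The sketch below assumes the layout implied by the `x[y]`, `x[y + 36]`, `x[y + 72]` indexing (first third `mata`, second third `matb`, last third `matc`). It uses a much smaller random table, and the seed, `n_row`, `n_col`, `masked`, `row_total`, and `pair_sums` are all hypothetical choices for illustration.

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
n_row <- 10   # far fewer rows than the 7500 used above
n_col <- 36   # columns per block
df <- data.frame(matrix(runif(n_row * 3 * n_col, 0, 5),
                        nrow = n_row, ncol = 3 * n_col, byrow = TRUE))

# Assumed layout: block 1 = mata, block 2 = matb, block 3 = matc
A <- as.matrix(df[, 1:n_col])
B <- as.matrix(df[, (n_col + 1):(2 * n_col)])
C <- as.matrix(df[, (2 * n_col + 1):(3 * n_col)])

# Products where the mata value exceeds 2, zero elsewhere (n_row x 36)
masked <- (A > 2) * A * B * C

# One total per row
row_total <- rowSums(masked)

# Sums over adjacent column pairs (1,2), (3,4), ...: n_row x 18
odd <- seq(1, n_col - 1, by = 2)
pair_sums <- masked[, odd] + masked[, odd + 1]
```

Here `pair_sums` holds the same numbers as the `apply()` version, transposed (rows of the data as rows), while `row_total` collapses each row to a single value.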