# Efficient way of finding max in 1d array using pthreads in C

I want to find the maximum value in 1d array using **pthreads** in C.

I have a code like this:

void* findmax(void* arg){ double temp_max; astruct *td=(astruct *)arg; P[td->idx]=d*P[td->idx]+sth; temp_max= fabs(P[td->idx]-P_old[td->idx]); pthread_mutex_lock(&lockP); if(max<temp_max){ max=temp_max; } pthread_mutex_unlock(&lockP); } main(){ ... //give to each thread an element of P TD[i].idx; .... for(i=0;i<thread_number;i++){ pthread_create(&threads[i],NULL,&findmax,(void*)&TD[i]); } ... /* when the above threads are done give them new element and start the loop again till the end of array P */ }

So the problem is that the mutexes are necessary to find the correct result, but they slow the program so much that in the end **the serial code is faster than this implementation**.

Is there any efficient way of solving this problem using pthreads that is faster than the serial simple code of finding max?

## Answers

Use the divide and conquer approach.

- Start with an array A of length N. A(i) is the i'th element of the array A. A(i,j) is the subset of A that is the i'th through (j-1)'th elements.
- Divide the number N by the number of cores available,C, on the machine. This number is W, the workload size
- Define a function maxOnSubset(i,j) that returns a value that is the same type as the elements of A. The function finds the maximum value in A(i,j). If j is greater than the length of A, the function sets j to the length of A.
- Start C threads numbered [0,C). The number associated with each thread is c. Each thread is responsible for calling the function maxOnSubset(c*W,(c+1)*W) and storing the value. You can use a semaphore to know when each thread has computed this value. This allows each thread to progress independently of any other thread.
- Wait for each thread to complete and collect the stored values in a second array B. The array B has a length of C.
- Find the maximum value in B. The maximum value of B is also the maximum value of A.