Multithreaded implementation of the Unix cat command

Hi, I'm trying to implement a faster cat than the one provided.

My current implementation looks like this:

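#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>     /* read(), write() */
#define BUF_SIZE (1024*1024*1024)

char buffer[BUF_SIZE];
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond_var = PTHREAD_COND_INITIALIZER;
pthread_cond_t cond_var2 = PTHREAD_COND_INITIALIZER;
ssize_t readed = 0;     /* bytes in buffer still waiting to be written */
int done = 0;           /* set at end of input or after an I/O error */
/*
    Read characters from standard input and save them to buffer
*/
void *consumer(void *data) {
    ssize_t r;
    while(1) {
        //---------CRITICAL CODE--------------
        //------------REGION------------------
        pthread_mutex_lock(&mutex);
        // Loop instead of if: pthread_cond_wait can wake up spuriously
        while (readed > 0 && !done)
        {
            pthread_cond_wait(&cond_var2, &mutex);
        }
        if (done) {
            // The writer failed, so there is no point in reading more
            pthread_mutex_unlock(&mutex);
            pthread_exit(NULL);
        }
        r = read(0, buffer, BUF_SIZE);
        if (r > 0) {
            readed = r;
        } else {
            done = 1;   // end of input or read error
        }

        pthread_cond_signal(&cond_var);
        pthread_mutex_unlock(&mutex);
        //------------------------------------

        if (r == -1){
            perror("Error reading");    // stderr, so the output stays clean
        }
        if (r <= 0) {
            pthread_exit(NULL);
        }
    }
}

/*
    Print chars read by consumer from standard input to standard output
*/
void *out_producer(void *data) {
    ssize_t w, off;
    while(1){
        //---------CRITICAL CODE--------------
        //-------------REGION-----------------
        pthread_mutex_lock(&mutex);
        while (readed == 0 && !done)
        {
            pthread_cond_wait(&cond_var, &mutex);
        }
        if (readed == 0) {
            // done is set and the buffer is empty: nothing left to write
            pthread_mutex_unlock(&mutex);
            pthread_exit(NULL);
        }
        // write() may write fewer bytes than requested, so loop
        for (off = 0; off < readed; off += w) {
            w = write(1, buffer + off, readed - off);
            if (w <= 0) break;
        }
        if (off < readed) {
            done = 1;   // write failed, tell the reader to stop
        }
        readed = 0;
        pthread_cond_signal(&cond_var2);
        pthread_mutex_unlock(&mutex);
        //------------------------------------

        if (w == -1){
            perror("Error writing");
            pthread_exit(NULL);
        }
    }
}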

What would you suggest to make it faster? I was also wondering about BUF_SIZE: what would be the optimal buffer size?

Main just makes the threads:

int main() {
    //  Program RETURN value
    int return_value = 0;

    //  in - INPUT thread
    //  out - OUTPUT thread
    pthread_t in, out;

    //  Creating in thread - should read from standard input (0)
    return_value = pthread_create(&in , NULL, consumer, NULL);
    if (return_value != 0) {
        fprintf(stderr, "Error creating input thread exiting with code error: %d\n", return_value);
        return return_value;
    }

    //  Creating out thread - should write to standard output (1)
    return_value = pthread_create(&out, NULL, out_producer, NULL);
    if (return_value != 0) {
        fprintf(stderr, "Error creating output thread exiting with code error: %d\n", return_value);
        return return_value;
    }

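    //  Wait for both threads; keep the first join error, if any
    return_value = pthread_join(in, NULL);
    int out_result = pthread_join(out, NULL);
    if (return_value == 0) return_value = out_result;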

    return return_value;
}

Answers


How exactly is adding threads to cat going to make it faster? You can't just throw parallelism at a program and expect it to run faster.

cat basically just copies its input (usually a file) to its output, byte for byte. Since the bytes must come out in the same order they went in, the reads and writes have to be serialized, and in your code that means mutual exclusion around the one shared buffer.

With that design the threaded version cannot run faster than a serial cat: the mutex is held across both read() and write(), so the two threads never actually work at the same time. They do the same serial work, plus the cost of synchronization.
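
For comparison, here is a minimal sketch of the serial copy loop that the threaded version would have to beat. The function name copy_fd and the 128 KiB COPY_BUF_SIZE are my own assumptions for illustration, not anything taken from the real cat; the buffer size in particular is just a starting point to measure against, not a known optimum.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical buffer size: an assumption to tune by measurement. */
#define COPY_BUF_SIZE (128 * 1024)

/*
    Copy everything from descriptor in to descriptor out.
    Returns 0 at end of input, -1 on a read or write error.
    Uses a static buffer, so it is not reentrant.
*/
int copy_fd(int in, int out) {
    static char buf[COPY_BUF_SIZE];

    for (;;) {
        ssize_t r = read(in, buf, sizeof buf);
        if (r == 0) {
            return 0;               /* end of input */
        }
        if (r < 0) {
            if (errno == EINTR) {
                continue;           /* interrupted by a signal, retry */
            }
            return -1;
        }

        /* write() may accept fewer bytes than requested, so loop */
        ssize_t off = 0;
        while (off < r) {
            ssize_t w = write(out, buf + off, (size_t)(r - off));
            if (w < 0) {
                if (errno == EINTR) {
                    continue;
                }
                return -1;
            }
            off += w;
        }
    }
}
```

Calling copy_fd(0, 1) from main gives a bare-bones cat for standard input. Timing it with a few different values of COPY_BUF_SIZE is probably the most direct way to answer the buffer size question on your machine.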

