While-loop subshell dilemma in Bash

I want to run a computation on all *.bin files inside a given directory. Initially I was working with a for-loop:

var=0
for i in `ls *.bin`
do
   # perform computations on $i ....
   ((var+=1))
done
echo $var

However, some directories contain so many files that this fails with an error: Argument list too long

Therefore, I was trying it with a piped while-loop:

var=0
ls *.bin | while read i;
do
  # perform computations on $i
  ((var+=1))
done
echo $var

The problem now is that the pipe makes the while-loop run in a subshell, so echo $var prints 0. How can I deal with this problem? The original code:

#!/bin/bash

function entropyImpl {
    if [[ -n "$1" ]]
    then
        if [[ -e "$1" ]]
        then
            # Ratio of gzip-compressed size to original size, to 4 decimal places.
            echo "scale = 4; $(gzip -c "$1" | wc -c) / $(wc -c < "$1")" | bc
        else
            echo "file ($1) not found"
        fi
    else
        datafile="$(mktemp entropy.XXXXX)"
        cat - > "$datafile"
        entropy "$datafile"
        rm "$datafile"
    fi

    return 1
}
declare acc_entropy=0
declare count=0

ls *.bin | while read i ;
do  
    echo "Computing $i"  | tee -a entropy.txt
    curr_entropy=`entropyImpl "$i"`
    curr_entropy=`echo $curr_entropy | bc`  
    echo -e "\tEntropy: $curr_entropy"  | tee -a entropy.txt
    acc_entropy=`echo $acc_entropy + $curr_entropy | bc`
    let count+=1
done

echo "Out of function: $count | $acc_entropy"
acc_entropy=`echo "scale=4; $acc_entropy / $count" | bc`

echo -e "===================================================\n" | tee -a entropy.txt
echo -e "Accumulated Entropy:\t$acc_entropy ($count files processed)\n" | tee -a entropy.txt

Answers


The problem is that the while loop is executed in a subshell. After the while loop terminates, the subshell's copy of var is discarded, and the original var of the parent (whose value is unchanged) is echoed.
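A minimal sketch of the effect, using three hard-coded input lines instead of real files (the variable name count is only illustrative). In bash with default settings, the last line reports 0 even though the loop body ran three times:

```shell
#!/bin/sh
# The loop is the last element of a pipeline, so it runs in a subshell.
count=0
printf '%s\n' one two three | while IFS= read -r line; do
    count=$((count + 1))
    echo "inside loop: count=$count"
done

# Back in the parent shell: the subshell's copy of count is gone.
echo "after loop: count=$count"
```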

One way to fix this is by using Process Substitution as shown below:

var=0
while IFS= read -r i
do
  # perform computations on "$i"
  ((var++))
done < <(find . -maxdepth 1 -type f -name "*.bin")

Take a look at BashFAQ/024 for other workarounds.
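One commonly cited workaround keeps the pipe but groups the loop and everything that needs its result into the same subshell, then hands the result back through command substitution. A rough sketch with made-up input lines (total and count are illustrative names, not part of the original script):

```shell
#!/bin/sh
# The brace group runs in the pipeline's subshell together with the loop,
# so count is still visible when it is echoed at the end of the group.
total=$(
    printf '%s\n' one two three | {
        count=0
        while IFS= read -r line; do
            count=$((count + 1))
        done
        echo "$count"
    }
)

echo "processed $total lines"
```

This only passes a single value back as text, so it suits a simple counter better than several accumulated variables.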

Notice that I have also replaced ls with find because it is not good practice to parse ls.
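Since find prints one name per line by default, a file name containing a newline would still be split. A hedged sketch for bash, assuming a find that supports -print0 (GNU and BSD find do); the temporary directory and sample files exist only for the demonstration:

```shell
#!/usr/bin/env bash
# Create a throwaway directory with two .bin files, one with a space in its name.
dir=$(mktemp -d)
touch "$dir/a.bin" "$dir/b c.bin" "$dir/notes.txt"

var=0
# read -d '' splits on the NUL bytes that -print0 emits between names.
while IFS= read -r -d '' file; do
    printf 'found: %s\n' "$file"
    var=$((var + 1))
done < <(find "$dir" -maxdepth 1 -type f -name '*.bin' -print0)

echo "$var"
rm -r "$dir"
```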


A POSIX-compliant solution is to use a named pipe (FIFO). It is portable, but it creates an entry in the filesystem that has to be cleaned up afterwards.

mkfifo mypipe
find . -maxdepth 1 -type f -name "*.bin" > mypipe &
while IFS= read -r line
do
    : # action on "$line" goes here
done < mypipe
rm mypipe

The named pipe is a special file in the filesystem (the data itself does not pass through the disk). Remove it when you are done so that it does not linger.
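To make the cleanup harder to forget, the FIFO can live in a private temporary directory that a trap removes on exit. This is only a sketch: it assumes mktemp -d is available (common, but not specified by POSIX), and the writer emits three sample names rather than running find:

```shell
#!/bin/sh
# Private directory for the FIFO, removed automatically when the script exits.
workdir=$(mktemp -d)
fifo="$workdir/mypipe"
trap 'rm -rf "$workdir"' EXIT
mkfifo "$fifo"

# Background writer; stands in for the find command.
printf '%s\n' one.bin two.bin three.bin > "$fifo" &

count=0
while IFS= read -r line; do
    count=$((count + 1))
done < "$fifo"
wait   # reap the background writer

echo "read $count lines"
```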


I was researching the generic issue: passing variables from a while loop running in a subshell back to the parent. One solution I found that is missing here is the here-string. That is a bash feature, though, and I preferred a POSIX solution; it turns out a here-string is really just a shortcut for a here-document. With that in mind I came up with the following, which avoids the subshell and therefore allows variables to be set inside the loop.

#!/bin/sh

set -eu

passwd="username,password,uid,gid
root,admin,0,0
john,appleseed,1,1
jane,doe,2,2"

main()
{
    while IFS="," read -r _user _pass _uid _gid; do
        if [ "${_user}" = "${1:-}" ]; then
            password="${_pass}"
        fi
    done <<-EOT
        ${passwd}
    EOT

    if [ -z "${password:-}" ]; then
        echo "No password found."
        exit 1
    fi

    echo "The password is '${password}'."
}

main "${@}"

exit 0

One important note to all copy-pasters: the here-document is opened with <<-, where the hyphen tells the shell to strip leading tabs from the body and from the closing delimiter. That keeps the layout somewhat tidy, but it only works with literal tabs, not spaces. Stack Overflow renders tabs in code as spaces, so the indentation of the ${passwd} and EOT lines above has to be turned back into tabs or the script will not work. Tabs versus spaces is not a matter of taste in this case.

Editor settings can easily turn those tabs into spaces and break the script, so the alternative is to leave the body and the delimiter unindented:

    done <<-EOT
${passwd}
EOT
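Reduced to the counting problem from the question, the same idea might look like the sketch below. The list of names is hard-coded here as an assumption; in practice it would first have to be captured in a variable, for example with command substitution:

```shell
#!/bin/sh
# Newline-separated sample names standing in for real files.
files="a.bin
b.bin
c.bin"

count=0
# The here-document redirects stdin for the loop without creating a pipeline,
# so the loop runs in the current shell and count keeps its value.
while IFS= read -r name; do
    count=$((count + 1))
done <<EOT
$files
EOT

echo "$count"
```

The whole list is held in memory in the variable, which is fine for file names but worth keeping in mind for very large inputs.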

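This could be done with a for loop, too, as long as no file name contains whitespace or glob characters (the unquoted command substitution is word-split):

var=0
for file in $(find . -maxdepth 1 -type f -name "*.bin"); do
    # perform computations on "$file"
    ((var++))
done
echo "$var"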
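A variant that avoids word splitting altogether is to let the shell expand the glob directly in the for loop. No external command receives the file list as arguments, so the kernel's argument-length limit that produces "Argument list too long" should not come into play. A sketch using a throwaway directory and sample files invented for the demonstration:

```shell
#!/bin/sh
# Throwaway directory with two .bin files, one with a space in its name.
dir=$(mktemp -d)
touch "$dir/a.bin" "$dir/b c.bin" "$dir/notes.txt"

var=0
for file in "$dir"/*.bin; do
    # If nothing matches, the pattern is left unexpanded; skip it.
    [ -e "$file" ] || continue
    var=$((var + 1))
done

echo "$var"
rm -r "$dir"
```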
