Race condition in the common lock-on-file pattern?
This is the standard approach to creating locks using the file system; for example, visudo uses it:
```shell
[ -f ".lock" ] && exit 1
touch .lock
# do something
rm .lock
```
1) I'm confused: there is a race condition (between the test and the touch), yet Linux tools use it
2) Is there a better way to lock on files from the shell?
3) Or do I have to use directories instead?
Found solution: man lockfile.
Yes, there is indeed a race condition in the sample script. You can use bash's noclobber option to get a failure when another script sneaks in between the test and the touch.
It's described here. I've excerpted the critical piece, with a couple of annotations (prefixed by BK:):
A possible solution to this is to use IO redirection and bash's noclobber mode, which won't redirect to an existing file. We can use something similar to:
```shell
if ( set -o noclobber; echo "$$" > "$lockfile" ) 2> /dev/null; then
    # BK: this will cause the lock file to be deleted in case of other exit
    trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT

    # critical-section BK: (the protected bit)

    rm -f "$lockfile"
    trap - INT TERM EXIT
else
    echo "Failed to acquire lockfile: $lockfile."
    echo "Held by $(cat $lockfile)"
fi
```
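On question 3: you don't have to use directories, but they are a clean alternative, because mkdir is atomic — the existence check and the creation happen in a single operation. A minimal sketch (the lock path here is just an example):

```shell
#!/bin/sh
# mkdir either creates the directory or fails if it already exists,
# so there is no window between "test" and "create".
lockdir="/tmp/myscript.lock.d"   # example lock path

if mkdir "$lockdir" 2>/dev/null; then
    trap 'rmdir "$lockdir"' EXIT   # release the lock on any exit
    # critical section goes here
else
    echo "Another instance holds $lockdir" >&2
    exit 1
fi
```

A second copy of the script started while the first is running will fail the mkdir and exit immediately.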
Try the flock command:
```shell
exec 200>"$LOCK_FILE"
flock -e -n 200 || exit 1
```
It will exit if the lock file is already locked. It is atomic, and it works over recent versions of NFS.
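flock can also wrap a whole command instead of using an explicit file descriptor, which is handy for one-liners; a sketch (the lock path is an example):

```shell
#!/bin/sh
# flock opens (and creates) the lock file itself, takes an exclusive
# non-blocking lock (-e -n), and only then runs the command given to -c.
flock -e -n /tmp/demo.lock -c 'echo "got the lock"' \
    || echo "lock busy, skipping"
```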
I did a test: I created a counter file containing 0 and executed the following in a loop, 500 times on each of two servers simultaneously:
```shell
#!/bin/bash
exec 200>/nfs/mount/testlock
flock -e 200
NO=`cat /nfs/mount/counter`
echo "$NO"
let NO=NO+1
echo "$NO" > /nfs/mount/counter
```
One node was contending with the other for the lock. When both runs finished, the file content was 1000. I tried multiple times and it always worked.
Note: the NFS client was RHEL 5.2 and the server was a NetApp.
Lock your script (against parallel run)
Seems like I've found an easier solution: man lockfile