grep something with xargs and find

Bash gurus ;) I'm trying to improve a bash one-liner that greps for a specific keyword in specific files. It looks like this:

find /<path>/hp -iname '*.ppd' -print0 | xargs -0 grep "\*ModelName\:"

which works very fast for me! It is about 20 times faster than this one:

find /<path>/hp -iname '*.ppd' -print0 | xargs -0 -I {} bash -c 'grep "\*ModelName\:" {}'

But the problem is that in the first script I'm getting the following lines:

/<path>/hp/hp-laserjet_m9040_mfp-ps.ppd:*ModelName: "HP LaserJet M9040 M9050 MFP"

but the desired result is just

*ModelName: "HP LaserJet M9040 M9050 MFP"  

(as in the second script). How can I achieve it?

P.S.: I'm using find for flexibility and future improvements of the script.

Answers


The -h option to grep suppresses file names in the output.

find /<path>/hp -iname '*.ppd' -print0 | xargs -0 grep -h "\*ModelName\:"

If your grep does not provide -h, then use cat:

find /<path>/hp -iname '*.ppd' -print0 | xargs -0 cat | grep "\*ModelName\:"
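To see what -h changes, here is a minimal sketch on a throwaway directory. The directory and the two sample files are made up for illustration, and the colon is left unescaped because it has no special meaning in a grep pattern.

```shell
# Hypothetical scratch directory with two sample PPD files.
dir=$(mktemp -d)
printf '*ModelName: "Printer A"\n' > "$dir/a.ppd"
printf '*ModelName: "Printer B"\n' > "$dir/b.ppd"

# Several files, no -h: each match is prefixed with its file name.
with_names=$(find "$dir" -iname '*.ppd' -print0 | xargs -0 grep '\*ModelName:' | sort)

# Same pipeline with -h: only the matching lines remain.
without_names=$(find "$dir" -iname '*.ppd' -print0 | xargs -0 grep -h '\*ModelName:' | sort)

# Fallback: concatenate first, so grep reads stdin and has no name to print.
via_cat=$(find "$dir" -iname '*.ppd' -print0 | xargs -0 cat | grep '\*ModelName:' | sort)

printf '%s\n' "$with_names" "$without_names"
rm -rf "$dir"
```

The -h and cat variants should give the same lines; only the first pipeline carries the path prefix.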

Also, for your information, find provides the -exec option which would render xargs unnecessary had you wanted to pursue your second option:

find /<path>/hp -iname '*.ppd' -exec grep "\*ModelName\:" '{}' \;

No need for find:

grep -rh --include "*.ppd" "\*ModelName\:" /<path>/hp
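One caveat worth checking on your system: as far as I know, the --include glob is matched case-sensitively, whereas find's -iname is not. The sketch below uses a made-up scratch directory and file names, and assumes a grep that supports -r and --include (GNU grep does).

```shell
# Hypothetical scratch directory: one lower-case and one upper-case extension.
dir=$(mktemp -d)
printf '*ModelName: "Lower"\n' > "$dir/lower.ppd"
printf '*ModelName: "Upper"\n' > "$dir/UPPER.PPD"

# A single lower-case glob only picks up lower.ppd.
only_lower=$(grep -rh --include='*.ppd' '\*ModelName:' "$dir")

# Listing both spellings covers both files.
both=$(grep -rh --include='*.ppd' --include='*.PPD' '\*ModelName:' "$dir" | sort)

printf '%s\n' "$only_lower" "---" "$both"
rm -rf "$dir"
```

If mixed-case extensions matter, find with -iname stays the simpler choice.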

You can get rid of find altogether (in bash 4 or later, which provides globstar):

shopt -s globstar
grep -h "\*ModelName\:" /<path>/hp/**/*.[pP][pP][dD]

Might be a bit slower if you have a huge directory tree (which I doubt in your case).

  • Pro: only one process launched!
  • Con: the future improvement you mentioned might be more difficult to implement.

In that case, you'd be better off with:

find /<path>/hp -iname '*.ppd' -exec grep -h "\*ModelName\:" {} +

(note the + at the end: find passes as many file names as possible to each grep, so usually only one grep is launched).
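The difference between ending -exec with \; and with + can be made visible by counting how often the command runs. This is only a sketch: the scratch directory and the three sample files are invented, and a small sh command that prints one line per run stands in for grep.

```shell
# Hypothetical scratch directory with three sample PPD files.
dir=$(mktemp -d)
for n in 1 2 3; do
  printf '*ModelName: "Printer %s"\n' "$n" > "$dir/p$n.ppd"
done

# With \; the command runs once per file found: three runs here.
per_file=$(find "$dir" -iname '*.ppd' -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# With + the file names are batched: one run for this small set.
batched=$(find "$dir" -iname '*.ppd' -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

printf 'per-file runs: %s, batched runs: %s\n' "$per_file" "$batched"
rm -rf "$dir"
```

With a very long list of files, find may still split the work into several batches, which is one more reason to keep -h on the grep command.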


Think of your output line

/<path>/hp/hp-laserjet_m9040_mfp-ps.ppd:*ModelName: "HP LaserJet M9040 M9050 MFP"

as a record of three fields separated by colons: the file path, *ModelName, and the quoted model name. Seen this way, the model name is the third field. Even if you know nothing else about awk, it is worth knowing how to print one column of output with a specific field separator, as shown below. Note that this prints only the quoted model name, without the *ModelName: prefix, and that it assumes the file path contains no colon:

find /<path>/hp -iname '*.ppd' -print0 | xargs -0 grep "\*ModelName\:" | awk -F: '{ print $3 }'

The other thing you should know about awk is how to sum (and occasionally average) the numbers in a specific column of output data, but that's another story for another day :)

The advantage of appending the awk command to your command chain is that you are building on the fast performance of your optimized command chain :)

In your case, the answer is grep with xargs and find and awk :)
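The field splitting described in this answer can be tried on a single sample line. The path below is a made-up stand-in for the real one, and the cut variant is my own addition for the case where the *ModelName: prefix should be kept; both assume the file path contains no colon.

```shell
# Sample grep output line; /some/dir is a hypothetical path.
line='/some/dir/hp-laserjet_m9040_mfp-ps.ppd:*ModelName: "HP LaserJet M9040 M9050 MFP"'

# Field 3 is only what follows the second colon (with its leading space).
third=$(printf '%s\n' "$line" | awk -F: '{ print $3 }')

# Dropping everything up to the first colon keeps the "*ModelName:" prefix.
rest=$(printf '%s\n' "$line" | cut -d: -f2-)

printf '%s\n%s\n' "$third" "$rest"
```

The second form matches the output asked for in the question, although grep -h gets there without any post-processing.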

