Problems with repetition and grouping

I'm trying to use repetition to trim down input for a sed pattern but I'm getting unexpected results.

The text that I am parsing is structured as:

   \s+\d+\s+\d+\s+\d+\s+\d+\[0-9A-Za-z] ...

I've tried using repetition to reduce the volume of input on one line and make the command simpler to read/debug:


When I use this in sed as a substitution command, the value of \2 is always equal to the last word from \1. If I change the repetition count from 4 to 5, I can get the alphanumeric pattern into \2, but then it also appears in \1. I need the values in \1 for something else, so I don't want to muddle the results or use a workaround like removing the last word from the \1 output.

Does anyone have any idea why this is happening or what I am doing wrong?

(I know that awk would be the easiest way to deal with this problem but I am determined to solve this with sed and improve my understanding of regular expressions.)


sed 's/\(\([[:blank:]]\{1,\}[0-9]\{1,\}\)\{4\}\)\([0-9A-Za-z]\)/[\1](\2){\3}/' YourFile
#  \1  +---------------------------------------+ 
#  \2    +------------------------------+
#  \3                                           +-------------+

Backreferences are numbered by the order of their opening parentheses, not by how many times a group repeats. So the alphanumeric part ends up in \3, not \2.
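To see the numbering in action, here is a small sketch run against a made-up input line (the sample text is my assumption, not the asker's real data). I also added a blank separator and `\{1,\}` before and inside the last group so the whole trailing word is captured; adjust that to your actual format.

```shell
# Hypothetical sample line: four blank-separated numbers, then a word.
line='  11  22  33  44 abc'

# \1 = outer group (all four numbers), \2 = inner repeated group,
# \3 = the alphanumeric word. Numbering follows the opening parentheses.
result=$(printf '%s\n' "$line" |
  sed 's/^\(\([[:blank:]]\{1,\}[0-9]\{1,\}\)\{4\}\)[[:blank:]]\{1,\}\([0-9A-Za-z]\{1,\}\).*/[\1](\2){\3}/')

printf '%s\n' "$result"
```

With that input, \2 holds only the final repetition (`  44`), while the word you want is in \3.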

You can't do that. When you repeat a capturing group, each iteration overwrites the previous capture, which is why your capturing group contains only the last match.
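If you need each number separately, one possible workaround is to write the group out once per field instead of repeating it, so every field gets its own backreference. This is only a sketch: the sample line and the shell variable `num` are assumptions for illustration, and sed only gives you \1 through \9.

```shell
# Hypothetical sample line: four blank-separated numbers, then a word.
line='  11  22  33  44 abc'

# One field: leading blanks, then a captured run of digits.
num='[[:blank:]]\{1,\}\([0-9]\{1,\}\)'

# The field pattern is spelled out four times, so the numbers land in
# \1..\4 and the trailing word in \5.
result=$(printf '%s\n' "$line" |
  sed "s/^${num}${num}${num}${num}[[:blank:]]\{1,\}\([0-9A-Za-z]\{1,\}\).*/\1,\2,\3,\4,\5/")

printf '%s\n' "$result"
```

Keeping the field pattern in a variable keeps the command readable even though the regex itself is no longer using `\{4\}`.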

