RegEx - Indexed/Arrayed Named Capture Groups?

I have a situation where something can appear in a format as follows:

---id-H--
Header: data
Another Header: more data
Message: sdasdasdasd
Message: asdasdasdasd
Message: asdasdasd

There may be many messages, or just a couple. I'd prefer not to step outside of RegEx, because I am already using it to parse the header information above the messages, and the headers and messages are all part of the same text I am parsing.

I would also like to use named capture groups, so something like

Message: (?<Message[index of match]>.+)

where the group matches as many times as it can, with the index filled in for each match. Does anything like this exist in RegEx? (I will eventually be using this in Perl.)

Answers


Assuming each group is separated by an empty line, this might get you closer:

use strict;
use warnings;

# use two lines as the "line" separator
local $/ = "\n\n";

while (my $line = <DATA>)
{
    my ($id) = ($line =~ /^---id-(\d+)--$/m);
    my @messages = ($line =~ /^Message: (.*)$/mg);

    print "On line $id, found these messages: ", join(', ', @messages), "\n";
}
__DATA__
---id-1--
Header: data
Another Header: more data
Message: sdasdasdasd
Message: asdasdasdasd
Message: asdasdasd

---id-2--
Header: data2
Another Header: stuff
Message: more message
Message: another message
Message: YAM

Running that gives:

On line 1, found these messages: sdasdasdasd, asdasdasdasd, asdasdasd
On line 2, found these messages: more message, another message, YAM
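
Since the question also mentions parsing the headers, the same record-at-a-time idea can be stretched to collect both. The following is only a rough sketch under a few assumptions: the text is already in a string rather than a filehandle, records are separated by blank lines, and every non-id line has the form "Name: value". The names @records, %header and @messages are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input, shortened from the data above
my $text = "---id-1--\nHeader: data\nMessage: one\nMessage: two\n\n"
         . "---id-2--\nHeader: data2\nMessage: three\n";

my @records;
for my $block (split /\n{2,}/, $text) {      # one block per record
    my ($id) = $block =~ /^---id-(\w+)--$/m;
    my (%header, @messages);

    # Walk every "Name: value" line in the block, one match per iteration
    while ($block =~ /^([^:\n]+): (.*)$/mg) {
        if ($1 eq 'Message') { push @messages, $2 }   # repeated field
        else                 { $header{$1} = $2 }     # ordinary header
    }
    push @records, { id => $id, header => \%header, messages => \@messages };
}

for my $r (@records) {
    print "Record $r->{id}: ", scalar @{ $r->{messages} }, " message(s)\n";
}
```

Keeping the messages in an array ref next to the header hash means each record carries its own indexed list, which is close to what Message[index] was asking for.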

Perl's named capture buffer syntax, (?<name>...), is an alternative to plain numbered groups such as /(pattern1(pattern2))/, where it can be ambiguous which capture buffer is which.

You can get a hashed form of the match by writing (?<name>pattern) and then reading the special hashes %+ and %-. See perlre for the named capture buffer syntax and perlvar for examples of %+, %-, and named captures.
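
To make the difference between %+ and %- concrete, here is a small sketch. The sample strings and the group names label, body, item and msg are invented for illustration. It also shows why a single quantified group does not give you an indexed list: only the last repetition is kept.

```perl
use strict;
use warnings;

my $line = "Message: first";

if ($line =~ /^(?<label>\w+): (?<body>.+)$/) {
    # %+ maps each group name to the text it captured
    print "label=$+{label} body=$+{body}\n";
}

# %- maps each name to an array ref with one slot per group
# of that name written in the pattern
my @pair;
if ("a=1" =~ /^(?<item>\w+)=(?<item>\w+)$/) {
    @pair = @{ $-{item} };    # ('a', '1')
}

# A group inside a repetition keeps only its last iteration
my $text = "Message: one\nMessage: two\nMessage: three\n";
my $last;
if ($text =~ /\A(?:Message: (?<msg>.+)\n)+\z/) {
    $last = $+{msg};          # 'three'
}
```

So %- is indexed by how many times the name appears in the pattern, not by how many times the group matched, which is why the list-based approaches are the usual route for a variable number of messages.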

There are much simpler solutions in Perl, however. A global match in list context returns a list of everything the capture group matched, so you can loop over that list directly or store it in an array.

Here are samples:

foreach my $message ($text=~/^Message: (.*)/gm) {
   # Process messages...
}

or

my @messages = ($text=~/^Message: (.*)/gm);
print "The first message is $messages[0]\n";

There are many more ways, but those two are common and Perly.
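
If you still want to keep the named group, one more possibility is to run the match in a while loop and push each hit onto an array, so the array index stands in for the index in the name. This is a sketch only; the %record hash is a hypothetical container, and the sample text is the one from the question.

```perl
use strict;
use warnings;

my $text = <<'END';
---id-H--
Header: data
Another Header: more data
Message: sdasdasdasd
Message: asdasdasdasd
Message: asdasdasd
END

my %record = (Message => []);

# In scalar context, /g resumes where the previous match ended,
# so each pass through the loop sees the next Message line
while ($text =~ /^Message: (?<Message>.+)$/mg) {
    push @{ $record{Message} }, $+{Message};
}

# $record{Message}[0], [1], ... play the role of Message[index]
for my $i (0 .. $#{ $record{Message} }) {
    print "Message[$i] = $record{Message}[$i]\n";
}
```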

Best of luck.

