Regex - orderless extraction of string

I have two strings, each representing one record:

string1 = "abc/BS-QANTAS\\/DS-12JUL15\\dfd"
string2 = "/DS-10JUN15\\/BS-AIRFRANCE\\dfdsfsdf"

BS is the booking airline and DS is the date.

I want to use a single regex to extract the booking source and the date. Please let me know if this is feasible. I have tried lookaheads and still couldn't get it to work.

The target language is Splunk, not JavaScript. Whatever language you use, please post your answer and I'll try it in Splunk.

Answers


You mentioned that you've tried lookaheads; what about a lookbehind?

(?<=BS-|DS-)(\w+)

Tested at Regex101
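
One caveat with the lookbehind is that it returns bare values in whatever order they occur, so you cannot tell the airline from the date by position. Below is a minimal Python sketch of that behaviour, plus a variation (my assumption, not the regex above) that captures the BS/DS tag next to the value. The names FIELD and extract_fields are hypothetical.

```python
import re

# Variation on the lookbehind idea: capture the tag as well as the value.
# FIELD and extract_fields are illustrative names, not part of the answer.
FIELD = re.compile(r"(BS|DS)-(\w+)")

def extract_fields(record):
    """Return a dict keyed by tag ('BS', 'DS'), whatever the field order."""
    return {tag: value for tag, value in FIELD.findall(record)}

string1 = "abc/BS-QANTAS\\/DS-12JUL15\\dfd"
string2 = "/DS-10JUN15\\/BS-AIRFRANCE\\dfdsfsdf"

# The lookbehind alone yields the values in order of appearance.
print(re.findall(r"(?<=BS-|DS-)(\w+)", string1))  # ['QANTAS', '12JUL15']
print(re.findall(r"(?<=BS-|DS-)(\w+)", string2))  # ['10JUN15', 'AIRFRANCE']

# With the tag captured, the order no longer matters.
print(extract_fields(string1))
print(extract_fields(string2))
```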


Here's a more scalable (and more readable, IMO) alternative to miroxlav's answer:

(?:\/BS-(?P<source>\w+)|\/DS-(?P<date>\w+)|[^\/\v]+)+

I'm assuming the fields you're interested in always start with a slash. That allows me to use [^/]+ to safely consume the junk between/around them.

This is effectively three regexes in one, wrapped in a group so that each one gets a chance to match in turn, and the group is applied repeatedly. If the first alternative matches, you're looking at a "booking airline" field, and the name is captured in the group named "source". If the second alternative matches, you're looking at the date, which is captured in the "date" group.

But, because the fields aren't in a predetermined order, the regex has to match the whole string to be sure of matching both fields (in fact, I should have used start and end anchors--^ and $--to enforce that; I've added them below). The third alternative, [^/]+, allows it to consume the parts that the first two can't, thus making an overall match possible. Here's the updated regex:

^(?:\/BS-(?P<source>\w+)|\/DS-(?P<date>\w+)|[^\/\v]+)+$

...and the updated demo. As noted in the comment, the \v is there only because I'm combining your two examples into one multiline string and doing two matches. You shouldn't need it in real life.
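
For anyone who wants to try the anchored regex outside Splunk, here is a minimal Python sketch. It assumes one record per string, so the \v is dropped as suggested above (Python's re also reads \v as a plain vertical tab rather than "any vertical whitespace"). PATTERN and parse_record are hypothetical names.

```python
import re

# Same three-alternative regex, anchored, without the \v.
# PATTERN and parse_record are illustrative names.
PATTERN = re.compile(r"^(?:/BS-(?P<source>\w+)|/DS-(?P<date>\w+)|[^/]+)+$")

def parse_record(record):
    """Return {'source': ..., 'date': ...}, or None if the record does not match."""
    match = PATTERN.match(record)
    return match.groupdict() if match else None

string1 = "abc/BS-QANTAS\\/DS-12JUL15\\dfd"
string2 = "/DS-10JUN15\\/BS-AIRFRANCE\\dfdsfsdf"

print(parse_record(string1))  # {'source': 'QANTAS', 'date': '12JUL15'}
print(parse_record(string2))  # {'source': 'AIRFRANCE', 'date': '10JUN15'}
```

Note that a record containing a slash that does not begin a BS- or DS- field will not match at all, because none of the three alternatives can consume that slash.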


This captures both values, either in the groups airline1 and date1 or in airline2 and date2, depending on which field comes first:

((BS-(?<airline1>\w+).*DS-(?<date1>[\w]+))|(DS-(?<date2>[\w]+).*BS-(?<airline2>\w+)))

Since there are only two fields, I simply listed both permutations.

Because .* is greedy, if the second field appears more than once, this regex captures its last occurrence. If you need the earliest one (using a lookbehind), let me know.
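
Here is a rough Python sketch of the permutation approach. Python's re module needs the (?P<name>...) spelling for named groups, and the redundant outer parentheses are dropped; the helper name extract and the step that merges the two group pairs are my assumptions about how you would use the result.

```python
import re

# Both field orders spelled out. Only one branch's groups are filled per match.
# PERMUTED and extract are illustrative names.
PERMUTED = re.compile(
    r"BS-(?P<airline1>\w+).*DS-(?P<date1>\w+)"
    r"|DS-(?P<date2>\w+).*BS-(?P<airline2>\w+)"
)

def extract(record):
    """Return (airline, date), or None if either field is missing."""
    match = PERMUTED.search(record)
    if match is None:
        return None
    groups = match.groupdict()
    # Take whichever pair of groups actually participated in the match.
    airline = groups["airline1"] or groups["airline2"]
    date = groups["date1"] or groups["date2"]
    return airline, date

string1 = "abc/BS-QANTAS\\/DS-12JUL15\\dfd"
string2 = "/DS-10JUN15\\/BS-AIRFRANCE\\dfdsfsdf"

print(extract(string1))  # ('QANTAS', '12JUL15')
print(extract(string2))  # ('AIRFRANCE', '10JUN15')
```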

