Regexp to target the last top-level li in a list
Is there a simple/elegant regexp that can help with this? I'm no guru with them, but it appears that whether I need a greedy or non-greedy quantifier for the middle text, (.*) / (.+), changes as nested lists are added and moved around in the list, and this is throwing me off.
$pattern = '/^(<ul>.*)<li>(.+<\/li><\/ul>)$/';
$replacement = '$1<li id="lastLi">$2';
Perhaps there is an easier approach, such as converting to XML, targeting the LI, and then converting back?
e.g. Single Element
<ul> <li>TARGET</li> </ul>
<ul> <li>foo</li> <li>TARGET</li> </ul>
Nested Lists before end
<ul> <li> foo <ul> <li>bar</li> </ul> </li> <li>TARGET</li> </ul>
Nested List at end
<ul> <li>foo</li> <li> TARGET <ul> <li>bar</li> </ul> </li> </ul>
You should never use regex to parse HTML, especially in this particular case (recursively nested tags).
The main reason is that HTML is not a regular language.
On top of the fact that HTML is not a regular language and can't be 100% correctly parsed with regex, the task to regex-parse HTML "well enough" is complicated enough that you're more likely than not going to have bugs in your code.
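To make the failure concrete, here is a small sketch that runs the question's pattern against the "nested list at end" example with the whitespace removed. The replacement refers to the second capture group, since the pattern only has two, and the variable names are just for illustration. Neither the greedy nor the non-greedy variant lands on the intended item:

```php
<?php
// "Nested list at end" example from the question, written without whitespace.
$html = '<ul><li>foo</li><li>TARGET<ul><li>bar</li></ul></li></ul>';
$replacement = '$1<li id="lastLi">$2';

// Greedy (.*) backtracks to the LAST <li> in the string, which is the nested one.
$greedy = preg_replace('/^(<ul>.*)<li>(.+<\/li><\/ul>)$/', $replacement, $html);
// $greedy: <ul><li>foo</li><li>TARGET<ul><li id="lastLi">bar</li></ul></li></ul>

// Non-greedy (.*?) stops at the FIRST <li>, which is the first top-level item.
$lazy = preg_replace('/^(<ul>.*?)<li>(.+<\/li><\/ul>)$/', $replacement, $html);
// $lazy: <ul><li id="lastLi">foo</li><li>TARGET<ul><li>bar</li></ul></li></ul>
```

In both runs the id ends up on an item other than TARGET, because the pattern has no notion of nesting depth.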
Instead, use a dedicated HTML parser.
Use an HTML parser, not a regex.
XML conversion and DOM parsing is the easiest way, provided you are confident about what kind of HTML will be processed.
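A minimal sketch of the DOM route, using PHP's bundled DOMDocument and DOMXPath classes. It assumes the input is a single list fragment like the ones in the question, so the first ul in document order is the outer list. The function name markLastTopLevelLi and its parameters are hypothetical, chosen only for this example:

```php
<?php
// Hypothetical helper: sets an id on the last direct <li> child of the first <ul>.
function markLastTopLevelLi(string $html, string $id = 'lastLi'): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them for sloppy markup.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // (//ul)[1] is the first <ul> in document order, assumed to be the outer list.
    $ul = $xpath->query('(//ul)[1]')->item(0);
    if ($ul === null) {
        return $html; // No list found: return the input unchanged.
    }
    // li[last()] only considers direct children, so nested lists are ignored.
    $last = $xpath->query('li[last()]', $ul)->item(0);
    if ($last instanceof DOMElement) {
        $last->setAttribute('id', $id);
    }
    // Serialize only the list, not the html/body wrapper the parser adds.
    return $doc->saveHTML($ul);
}

$result = markLastTopLevelLi('<ul><li>foo</li><li>TARGET<ul><li>bar</li></ul></li></ul>');
```

Because the XPath step selects direct children only, the same call should handle all four shapes from the question, whether the nested list sits before or inside the last item. Whitespace and attribute quoting in the serialized output may differ slightly from the input.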