Extract DOM-elements from string, in PHP

Possible Duplicates: "crawling a html page using php?", "Best methods to parse HTML"

I have a string variable in my PHP script that contains an HTML page. How can I extract DOM elements from this string?

For example, from the string '<div class="someclass">text</div>', I want to get the value 'text'. How can I do this?

Answers


You need to use the DOMDocument class and, more specifically, its loadHTML method, to load your HTML string into a DOM object.

For example:

$string = <<<HTML
<p>test</p>
<div class="someclass">text</div>
<p>another</p>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($string);

After that, you'll be able to manipulate the DOM, for instance by using the DOMXPath class to run XPath queries on it.

In your case, you could use something based on this code:

$xpath = new DOMXPath($dom);
$result = $xpath->query('//div[@class="someclass"]');
if ($result->length > 0) {
    var_dump($result->item(0)->nodeValue);
}

Here, that gives the following output (as formatted by Xdebug's var_dump; plain PHP prints string(4) "text"):

string 'text' (length=4)
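
Real pages are rarely as clean as this snippet. The following is a minimal sketch, assuming PHP 7 or later, that collects libxml's parse warnings instead of printing them and also matches elements whose class attribute lists several classes. The helper name extractTextByClass is hypothetical, and the sketch assumes the class name contains no quote characters:

```php
<?php
// Hypothetical helper: returns the text of every element carrying $class.
function extractTextByClass(string $html, string $class): array
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Match $class as a whole token inside a space-separated class attribute.
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = $node->nodeValue;
    }
    return $texts;
}

$html = '<p>test</p><div class="someclass other">text</div><p>another</p>';
var_dump(extractTextByClass($html, 'someclass'));
```

The plain @class="someclass" test only matches when the attribute is exactly that string, which is why the sketch compares against the space-padded attribute value instead.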

As an alternative to DOMDocument, you could also use simplexml_load_string and SimpleXMLElement::xpath, provided the markup is well-formed XML; for complex manipulations, though, I generally prefer DOMDocument.
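
A rough sketch of the SimpleXML route, for comparison. SimpleXML parses XML rather than HTML, so the input needs a single root element; the <root> wrapper below is an assumption added for illustration and is not part of the original string:

```php
<?php
// SimpleXML needs well-formed XML with one root element, so the fragment
// is wrapped in a hypothetical <root> element here.
$string = '<root><p>test</p><div class="someclass">text</div><p>another</p></root>';

$text = null;
$xml = simplexml_load_string($string);
if ($xml !== false) {
    // xpath() returns an array of SimpleXMLElement objects.
    $matches = $xml->xpath('//div[@class="someclass"]');
    if (!empty($matches)) {
        // Casting a SimpleXMLElement to string yields its text content.
        $text = (string) $matches[0];
        var_dump($text);
    }
}
```

This tends to fail on real-world HTML (unclosed tags, unescaped ampersands), which is one reason to stay with DOMDocument::loadHTML for scraped pages.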


Have a look at DOMDocument and DOMXPath.

$DOM = new DOMDocument();
$DOM->loadHTML($str);

$xpath = new DOMXPath($DOM);
// The * matches any element name; '//[...]' without it is not valid XPath.
$someclass_elements = $xpath->query('//*[@class = "someclass"]');
// ...
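
What follows the query might look like the sketch below, which loops over the returned DOMNodeList and reads each element's tag name and text. The sample string in $str is made up for illustration. Note the * after //, which XPath requires in order to match elements of any name:

```php
<?php
// Made-up sample input with two elements sharing the same class.
$str = '<div class="someclass">first</div><span class="someclass">second</span>';

$DOM = new DOMDocument();
$DOM->loadHTML($str);

$xpath = new DOMXPath($DOM);
$someclass_elements = $xpath->query('//*[@class = "someclass"]');

$values = [];
foreach ($someclass_elements as $element) {
    // nodeName is the tag name, nodeValue the text content.
    $values[$element->nodeName] = $element->nodeValue;
}
print_r($values);
```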
