Select the p tags after an h2 that has a child with an id

Asked: 2015-09-29 12:35:33

Tags: php symfony

How do I select the p-tags that come after a tag with a specific child? I'm using the Crawler. http://symfony.com/doc/current/components/css_selector.html

$crawler->filter('h2 span#hello + p')->each(function ($node) {
    var_dump($node->html());
});

Example:

<h2><span id="hello">Hi</span></h2>
<p>I want this p-tag, that is after the h2 above</p>
 <p>me too!</p>
<a>Not me!</a>
<h2>lol</h2>
<p>yo, not me</p>

This doesn't work.

1 answer:

Answer 0 (score: 0)

It is usually best to traverse HTML with the DOMDocument class (http://php.net/manual/en/class.domdocument.php), but you can do it with a regular expression:

// put the example HTML code into a string
$html = <<< EOF
<h2><span id="hello">Hi</span></h2>
<p>I want this p-tag, that is after the h2 above</p>
 <p>me too!</p>
<a>Not me!</a>
<h2>lol</h2>
<p>yo, not me</p>
EOF;

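// set up a regular expression: find the h2 that contains the span with
// id="hello", then capture from the next <p up to the first "<" that is
// not followed by "/" or "p"
$re = "/<h2[^>]*>.*?<span[^>]*id=\"hello\"[^>]*>.*?<\\/h2[^>]*>.*?(<p.*?)<[^\\/p]/sim";
// get the match ... the (<p.*?) group in the above regex
if (preg_match($re, $html, $matches)) {
    print $matches[1];
}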

Output:

<p>I want this p-tag, that is after the h2 above</p>
 <p>me too!</p>
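
For comparison, the DOMDocument route mentioned at the top of this answer could look roughly like the sketch below. The selector in the question, 'h2 span#hello + p', looks for a p that is the next sibling of the span (so inside the h2), and CSS has no way to step from the span up to its parent h2. XPath can express that step. The function name paragraphsAfterHeading is hypothetical and not part of the original answer, and the sketch assumes that "after the h2" means the run of p elements directly following it, ending at the first other element.

```php
<?php
// Hypothetical helper (not from the original answer): returns the outer HTML
// of each <p> that directly follows the <h2> whose child element has $id.
function paragraphsAfterHeading(string $html, string $id): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parser warnings out of the output
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Assumption: $id contains no double quotes; it is not escaped here.
    $heading = $xpath->query('//h2[*[@id="' . $id . '"]]')->item(0);
    if ($heading === null) {
        return [];
    }

    $paragraphs = [];
    for ($node = $heading->nextSibling; $node !== null; $node = $node->nextSibling) {
        if ($node->nodeType !== XML_ELEMENT_NODE) {
            continue;                   // skip non-element nodes (e.g. whitespace text)
        }
        if ($node->nodeName !== 'p') {
            break;                      // stop at the first element that is not a <p>
        }
        $paragraphs[] = $doc->saveHTML($node);
    }
    return $paragraphs;
}

// Same example HTML as in the question
$html = <<<EOF
<h2><span id="hello">Hi</span></h2>
<p>I want this p-tag, that is after the h2 above</p>
 <p>me too!</p>
<a>Not me!</a>
<h2>lol</h2>
<p>yo, not me</p>
EOF;

foreach (paragraphsAfterHeading($html, 'hello') as $p) {
    print $p . "\n";
}
```

With the Symfony Crawler from the question, a similar idea can probably be expressed through its filterXPath() method, for example with '//h2[*[@id="hello"]]/following-sibling::p'. Note that this expression would also match the last p ("yo, not me"), since following-sibling does not stop at the next h2.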