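How do I select the p-tags that come after a tag with a specific child? I am using a web crawler with Symfony's CssSelector component: http://symfony.com/doc/current/components/css_selector.html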
$crawler->filter('h2 span#hello + p')->each(function ($node) {
    var_dump($node->html());
});
Example:
<h2><span id="hello">Hi</span></h2>
<p>I want this p-tag, that is after the h2 above</p>
<p>me too!</p>
<a>Not me!</a>
<h2>lol</h2>
<p>yo, not me</p>
This doesn't work.
Answer 0 (score: 0)
It is usually better to traverse HTML with the DOMDocument class (http://php.net/manual/en/class.domdocument.php), but you can also do it with a regular expression:
// put the example HTML code into a string
$html = <<< EOF
<h2><span id="hello">Hi</span></h2>
<p>I want this p-tag, that is after the h2 above</p>
<p>me too!</p>
<a>Not me!</a>
<h2>lol</h2>
<p>yo, not me</p>
EOF;
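// set up a regular expression: find the h2 that contains span#hello, then
// capture everything from the first <p up to the next tag that is not a p-tag
$re = "/<h2[^>]*>.*?<span[^>]*id=\"hello\"[^>]*>.*?<\\/h2[^>]*>.*?(<p.*?)<[^\\/p]/sim";
// get the match ... the (<p.*?) group in the above regex
if (preg_match($re, $html, $matches)) {
    print $matches[1];
}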
Output:
<p>I want this p-tag, that is after the h2 above</p>
<p>me too!</p>
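For comparison, here is a minimal sketch of the DOMDocument route mentioned at the top of this answer. It assumes the goal is to collect only the run of p-tags that directly follows the h2 containing span#hello, stopping at the first sibling element that is not a p-tag. The variable names ($doc, $xpath, $h2, $result) are illustrative and not part of the original question. Note that the selector in the question cannot match because the p-tags are siblings of the h2, not of the span inside it.

```php
<?php
// Example HTML from the question.
$html = <<<EOF
<h2><span id="hello">Hi</span></h2>
<p>I want this p-tag, that is after the h2 above</p>
<p>me too!</p>
<a>Not me!</a>
<h2>lol</h2>
<p>yo, not me</p>
EOF;

// Parse the fragment; libxml wraps it in <html><body> automatically.
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Find the first h2 that has a direct child span with id="hello".
$xpath = new DOMXPath($doc);
$h2 = $xpath->query('//h2[span[@id="hello"]]')->item(0);

$result = [];
if ($h2 !== null) {
    // Walk the siblings after the h2 and stop at the first non-p element.
    for ($node = $h2->nextSibling; $node !== null; $node = $node->nextSibling) {
        if ($node->nodeType !== XML_ELEMENT_NODE) {
            continue; // skip whitespace text nodes between tags
        }
        if ($node->nodeName !== 'p') {
            break; // the <a> ends the run of p-tags
        }
        $result[] = $doc->saveHTML($node);
    }
}

foreach ($result as $p) {
    echo $p, PHP_EOL;
}
```

With the example HTML this should print the same two p-tags as the regular expression above, while ignoring the p-tag after the second h2. If every following p-tag sibling is wanted instead, the XPath expression //h2[span[@id="hello"]]/following-sibling::p would select them in one query, including the last one.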