I'm looking for a way to extract, in PHP, the text of an <a>
that does not have a class or id, but that is inside a <div>
that has a class.
Here is the HTML to crawl:
<div class="myclass">
<a href="/to">value to crawl</a>
</div>
Here is the line of my PHP code (which does not work):
preg_match_all('<div class=\"myclass\"><a>(.*)<\/a><\/div>', $myhtml, $match);
Thanks for your response :)
Answer 0 (score: 1)
A parser is a better solution:
$html = '<div class="myclass">
<a href="/to">value to crawl</a>
</div>';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXpath($dom);
// "//" searches the whole document; loadHTML() wraps the fragment in <html><body>
$a_s = $xpath->query('//div[contains(@class, \'myclass\')]/a');
foreach ($a_s as $a) {
    // only print links that have neither a class nor an id
    if (empty($a->getAttribute('class')) && empty($a->getAttribute('id'))) {
        echo $a->nodeValue;
    } else {
        echo 'not';
    }
}
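The class/id check can also be moved into the XPath expression itself. This is only a minimal sketch, assuming the same markup as in the question; the helper name linkTextsWithoutClassOrId is hypothetical, and $class is assumed to be a trusted value without quotes in it. Note that contains() also matches class names that merely include the given string.

```php
<?php
// Hypothetical helper: returns the text of every <a> that is a direct child
// of a <div> whose class attribute contains $class, and that has neither a
// class nor an id attribute of its own.
function linkTextsWithoutClassOrId(string $html, string $class): array
{
    $dom = new DOMDocument;
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is rarely perfectly valid.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // not(@class) and not(@id) keep only links without those attributes.
    $nodes = $xpath->query(
        "//div[contains(@class, '$class')]/a[not(@class) and not(@id)]"
    );

    $texts = [];
    foreach ($nodes as $a) {
        $texts[] = trim($a->nodeValue);
    }
    return $texts;
}

// Sample input copied from the question.
$html = '<div class="myclass">
<a href="/to">value to crawl</a>
</div>';

print_r(linkTextsWithoutClassOrId($html, 'myclass'));
```

With the sample input this should print an array whose only element is "value to crawl".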
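The answer to your actual question, though, is that <a> does not exist in your string
(the tag there is <a href="/to">), and >< does not exist in it either, because there is
whitespace between the tags. So, to correct your regex, it would be: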
/<div class="myclass">\s*<a.*?>(.*?)<\/a>\s*<\/div>/
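For completeness, here is how that pattern might be plugged into preg_match_all. This is a sketch under the assumption that the input looks exactly like the sample in the question; the variable names $myhtml and $match come from the question, while $pattern is introduced here for illustration. The pattern is wrapped in / delimiters, which PHP's preg functions require.

```php
<?php
// Sample input copied from the question.
$myhtml = '<div class="myclass">
<a href="/to">value to crawl</a>
</div>';

// Corrected pattern: "/" delimiters, \s* for the whitespace between tags,
// and <a.*?> so that attributes such as href are allowed on the link.
$pattern = '/<div class="myclass">\s*<a.*?>(.*?)<\/a>\s*<\/div>/';

preg_match_all($pattern, $myhtml, $match);

// $match[1] holds the text captured by the (.*?) group.
print_r($match[1]);
```

For the sample input, $match[1] should contain the single string "value to crawl". The regex remains fragile against changes in attribute order or nesting, which is why the parser approach is preferable.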