I am trying to extract two pieces of data using XPath. Here is my code:
<?php
$curl = curl_init('http://www.livescore.com/soccer/england/league-2/');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10');
$html = curl_exec($curl);
curl_close($curl);
if (!$html)
{
    die("something's wrong!");
}
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$result = $xpath->query("/html/body/div[2]/div[5]/div[contains(@class, 'row')]");
var_dump($result);
foreach ($result as $row)
{
    $text = $row->nodeValue;
    $href = $row->getAttribute("href");
    //getAttribute("href")
    $array[] = array
    (
        'text' => trim($text),
        'href' => $href
    );
}
print "<pre>";
var_dump($array);
?>
I just cannot extract the href link! Any help would be greatly appreciated. Many thanks.
Answer 0 (score: 2)
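First, the data rows on that page can be found through the more specific class name row-gray. The href attribute lives on an a element inside each row, not on the row's div itself, which is why calling getAttribute("href") on the div returns nothing. To get the link inside the current div, you can use the relative XPath expression .//a[@class='scorelink'] and pass the row as the context node: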
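$array = array();
$result = $xpath->query("//div[contains(@class, 'row-gray')]");
foreach ($result as $row)
{
    $text = $row->nodeValue;
    // The leading "." makes the query relative to $row, so only this row is searched
    $link = $xpath->query(".//a[@class='scorelink']", $row)->item(0);
    // item(0) returns null when the row contains no matching link
    $href = $link ? $link->getAttribute("href") : null;
    $array[] = array
    (
        'text' => trim($text),
        'href' => $href
    );
}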
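To see the difference between reading href from the row and reading it from the nested link, here is a minimal, self-contained sketch that runs without any network access. The sample markup, the team names, and the href_on_div key are made up for illustration; the real page structure may well differ, so treat this as a demonstration of the relative query only.

```php
<?php
// Hypothetical sample markup (an assumption, not the real livescore.com HTML).
// The second row deliberately has no link.
$html = <<<HTML
<html><body>
  <div class="row row-gray"><span>Team A</span> <a class="scorelink" href="/match/1">1 - 0</a> <span>Team B</span></div>
  <div class="row row-gray"><span>Team C</span> <span>? - ?</span> <span>Team D</span></div>
</body></html>
HTML;

$dom = new DOMDocument();
// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$rows = array();
foreach ($xpath->query("//div[contains(@class, 'row-gray')]") as $row) {
    // The div has no href attribute, so getAttribute() returns an empty string.
    $onDiv = $row->getAttribute('href');
    // The leading dot limits the search to descendants of $row.
    $link = $xpath->query(".//a[@class='scorelink']", $row)->item(0);
    $rows[] = array(
        'text'        => trim($row->nodeValue),
        'href_on_div' => $onDiv,
        // Guard against rows without a link, where item(0) is null.
        'href'        => $link ? $link->getAttribute('href') : null,
    );
}
print_r($rows);
```

Without the leading dot (//a[@class='scorelink']), the query would search the whole document even though a context node is passed, so every row would report the first link on the page.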