I am trying to use PHP's DOMDocument to extract an href from some data.
The following line extracts the anchor text, but I want the URL:
$events[$i]['race_1'] = trim($cols->item(1)->nodeValue);
In case it helps, here is the full code I am using.
// initialize loop
$i = 0;
// new dom object
$dom = new DOMDocument();
//load the html
$html = @$dom->loadHTMLFile($url);
//discard white space
$dom->preserveWhiteSpace = false;
//the table by its tag name
$information = $dom->getElementsByTagName('table');
$rows = $information->item(4)->getElementsByTagName('tr');
foreach ($rows as $row)
{
$cols = $row->getElementsByTagName('td');
$events[$i]['title'] = trim($cols->item(0)->nodeValue);
$events[$i]['race_1'] = trim($cols->item(1)->nodeValue);
$events[$i]['race_2'] = trim($cols->item(2)->nodeValue);
$events[$i]['race_3'] = trim($cols->item(3)->nodeValue);
$date = explode('/', trim($cols->item(4)->nodeValue));
$events[$i]['month'] = $date['0'];
$events[$i]['day'] = $date['1'];
$citystate = explode(',', trim($cols->item(5)->nodeValue));
$events[$i]['city'] = $citystate['0'];
$events[$i]['state'] = $citystate['1'];
$i++;
}
print_r($events);
Here is the content of the TD tag:
<td width="12%" align="center" height="13"><!--mstheme--><font face="Arial"><span lang="en-us"><b> <font style="font-size: 9pt;" face="Verdana"> <a linkindex="18" target="_blank" href="results2010/brmc5k10.htm">Overall</a>
Answer 0 (score: 4)
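Update: I see the problem now. You need to get the list of a elements from the td, then read the href attribute from the anchor rather than from the cell.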
$cols = $row->getElementsByTagName('td');
// $cols->item(1) is a td DOMElement, so have to find anchors in the td element
// then get the first (only) anchor's href attribute
// (chaining looks long, might want to refactor/check for nulls)
$events[$i]['race_1'] = trim($cols->item(1)->getElementsByTagName('a')->item(0)->getAttribute('href'));
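The chained call above fails with a fatal error when a cell contains no link, because item(0) then returns null. Below is a minimal sketch of the refactor hinted at in the comment. The helper name firstHref() and the inline sample markup are assumptions made for illustration (the markup is modelled on the td from the question), and the nullable type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the href of
// the first <a> inside $cell, or null when the cell is missing or has no link.
function firstHref(?DOMNode $cell): ?string
{
    if (!$cell instanceof DOMElement) {
        return null;
    }
    $anchor = $cell->getElementsByTagName('a')->item(0);
    if (!$anchor instanceof DOMElement || !$anchor->hasAttribute('href')) {
        return null;
    }
    return trim($anchor->getAttribute('href'));
}

// Sample markup modelled on the td shown in the question (assumed, not the real page).
$html = '<table><tr>'
      . '<td>Some title</td>'
      . '<td><font face="Verdana"><a target="_blank" href="results2010/brmc5k10.htm">Overall</a></font></td>'
      . '<td>no link here</td>'
      . '</tr></table>';

$dom = new DOMDocument();
@$dom->loadHTML($html);
$cols = $dom->getElementsByTagName('td');

$race1 = firstHref($cols->item(1)); // "results2010/brmc5k10.htm"
$race2 = firstHref($cols->item(2)); // null: the cell has no anchor
$race9 = firstHref($cols->item(9)); // null: the cell does not exist
```

Inside the loop from the question, this would let each race column be filled with something like firstHref($cols->item(1)), leaving null for rows that have no link.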
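Pretty sure you should be able to call getAttribute() on that item. You can verify that the item's nodeType is XML_ELEMENT_NODE, because getAttribute() is only defined on DOMElement. Note that it returns an empty string when the element has no attribute with the given name.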
<?php
// ...
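// Only returns a URL if item(1) itself carries the href attribute;
// on a td that merely wraps a link, this yields an empty string.
$events[$i]['race_1'] = trim($cols->item(1)->getAttribute('href'));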
// ...
?>
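To make the nodeType check concrete, here is a small sketch using an assumed one-cell table modelled on the markup from the question. It shows that the td is an element node whose own href is empty, while the anchor inside it carries the URL. The variable names are illustrative only.

```php
<?php
$dom = new DOMDocument();
@$dom->loadHTML('<table><tr><td><a href="results2010/brmc5k10.htm">Overall</a></td></tr></table>');

$td = $dom->getElementsByTagName('td')->item(0);

// The td is an element node, so getAttribute() can be called on it...
$isElement = ($td->nodeType === XML_ELEMENT_NODE);
// ...but it has no href of its own, so the result is an empty string.
$tdHref = $td->getAttribute('href');

// The anchor nested inside the td is where the href actually lives.
$anchor = $td->getElementsByTagName('a')->item(0);
$anchorHref = $anchor->getAttribute('href');

// The link text is a DOMText node; DOMText has no getAttribute() method.
$textNode = $anchor->firstChild;
$isText = ($textNode->nodeType === XML_TEXT_NODE);
```

Checking nodeType (or using instanceof DOMElement) before calling getAttribute() avoids a fatal error on text or comment nodes.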
See also the related question: DOMNode to DOMElement in php