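I'm just trying to extract some information from the AEC website (for example http://apps.aec.gov.au/eSearch/LocalitySearchResults.aspx?filter=3977&filterby=Postcode). The XPath query I'm running is "//x:tbody/x:tr/x:td[4]/x:a", which I have tested in XPath Checker (a Firefox extension), where it extracts the relevant locality data.
I then use PHP to load the page, run the query, and iterate over the results.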
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$html = curl_exec($ch);
curl_close($ch);
# Create a DOM parser object
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xpath = new DOMXpath($dom);
$elements = $xpath->query( '//tbody/tr/td[4]/a');
foreach ($elements as $element) {
echo $element;
}
I then get:
Warning: Invalid argument supplied for foreach() in /home/givesh5/public_html/dig/electoratesearch.php on line 41
It seems the query is returning some kind of boolean rather than a list of matches?
The relevant markup is as follows:
<table cellspacing="0" rules="all" border="1" id="ContentPlaceHolderBody_gridViewLocalities" style="border-collapse:collapse;">
<tr class="headingLink">
<th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$StateAb')">State</a></th><th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$LocalityNm')">Locality/Suburb</a></th><th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$Postcode')">Postcode</a></th><th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$DivisionNm')">Electorate</a></th><th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$DivisionNmRedistributed')">Redistributed Electorate</a></th><th scope="col">Other Locality(s)</th>
</tr><tr>
<td>VIC</td><td>BOTANIC RIDGE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CANNONS CREEK</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Holt&filterby=Electorate&divid=216">Holt</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE EAST</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE EAST</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Holt&filterby=Electorate&divid=216">Holt</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE NORTH</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Holt&filterby=Electorate&divid=216">Holt</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE SOUTH</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE WEST</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Holt&filterby=Electorate&divid=216">Holt</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>DEVON MEADOWS</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>FIVEWAYS</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td><a href="LocalitySearchResults.aspx?filter=DEVON+MEADOWS&filterby=LocalityorSuburb&state=VIC">DEVON MEADOWS</a></td>
</tr><tr>
<td>VIC</td><td>JUNCTION VILLAGE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>SANDHURST</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Isaacs&filterby=Electorate&divid=219">Isaacs</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>SKYE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Dunkley&filterby=Electorate&divid=210">Dunkley</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>SKYE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Isaacs&filterby=Electorate&divid=219">Isaacs</a></td><td></td><td> </td>
</tr>
</table>
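Answer 0 (score: 1)
"It seems the query is returning some kind of boolean rather than a list of matches?"
Yes, it can return a boolean, and in that case the value is FALSE. That means an error occurred while running the XPath query. The error can come from either of the two arguments passed to DOMXpath::query() (PHP Manual): the XPath expression or the context node.
In your case you pass only one argument, so FALSE would indicate that the XPath expression is wrong. However, the expression you use is not wrong and does not produce a boolean FALSE. Since you do get this error, I assume something else went wrong, for example the xpath object was not fully initialized. Even when I simulated a missing or partial download, though, I could not reproduce the error. Perhaps it depends on the PHP version? I don't know.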
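To make the distinction concrete, here is a minimal sketch (not the asker's actual script) of when DOMXPath::query() gives back FALSE and when it gives back an empty DOMNodeList. The inline HTML and the helper name describeQuery() are made up for illustration, and the @ operator is only there to silence the warning PHP emits for an expression it cannot evaluate. One thing the sketch suggests: if the script ever ran the x:-prefixed expression from XPath Checker without registering that prefix, that alone could produce FALSE. Whether that happened here is a guess.

```php
<?php
// Hypothetical sample markup, standing in for the downloaded page.
$html = '<table><tr><td>VIC</td><td>SKYE</td><td>3977</td>'
      . '<td><a href="#">Dunkley</a></td></tr></table>';

$dom = new DOMDocument();
$saved = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_use_internal_errors($saved);
$xpath = new DOMXPath($dom);

// Hypothetical helper: reports what query() gave back for an expression.
function describeQuery(DOMXPath $xpath, $expression)
{
    // @ hides the warning so that only the return value is reported.
    $result = @$xpath->query($expression);
    if (false === $result) {
        return 'FALSE (expression could not be evaluated)';
    }
    return sprintf('DOMNodeList with %d node(s)', $result->length);
}

$expressions = array(
    '//table/tr/td[4]/a',          // valid, matches the sample row
    '//tbody/tr/td[4]/a',          // valid, but nothing to match
    '//x:tbody/x:tr/x:td[4]/x:a',  // prefix "x" was never registered
    '//td[',                       // syntactically broken
);
foreach ($expressions as $expression) {
    printf("%-30s => %s\n", $expression, describeQuery($xpath, $expression));
}
```

A valid expression that matches nothing still yields a DOMNodeList, which foreach accepts without complaint. Only the last two expressions should come back as FALSE, which is the case that triggers the "Invalid argument supplied for foreach()" warning.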
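As for the actual XPath expression, what adeneo and Gordon have already written applies: the <tbody> element is inserted into the DOM by Firefox, and the DOMDocument implementation in PHP behaves differently here. You can either mimic Firefox (more work), or simply search for the actual table element, which works fine. Here is a working example: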
$url = 'http://apps.aec.gov.au/eSearch/LocalitySearchResults.aspx?filter=3977&filterby=Postcode';
# Create a DOMDocument to parse HTML
$doc = new DOMDocument();
$saved = libxml_use_internal_errors(true);
$result = $doc->loadHTMLFile($url);
libxml_use_internal_errors($saved);
if (false === $result) {
throw new UnexpectedValueException(sprintf('Failed to create DOMDocument from url %s', var_export($url, true)));
}
# Create a DOMXPath to get data from HTML document
$xpath = new DOMXpath($doc);
$expression = '//table/tr/td[4]/a';
$elements = $xpath->query($expression);
if (false === $elements) {
throw new UnexpectedValueException(sprintf('The xpath expression %s failed', var_export($expression, true)));
}
foreach ($elements as $index => $element) {
printf("#%02d: %s\n", $index + 1, trim($element->textContent));
}
Example output:
#01: Flinders
#02: Flinders
#03: Holt
#04: Flinders
#05: Holt
#06: Holt
#07: Flinders
#08: Holt
#09: Flinders
#10: Flinders
#11: Flinders
#12: Isaacs
#13: Dunkley
#14: Isaacs
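For completeness, the "mimic Firefox" route could look roughly like the sketch below: it wraps the tr children of each table in a new tbody so that the original //tbody/... expression matches. The function name addImplicitTbody() and the sample markup are assumptions for illustration. Real browsers apply more rules than this (thead, tfoot, nested tables), so treat it as a rough approximation rather than a faithful reimplementation.

```php
<?php
// Hypothetical helper: wrap the direct <tr> children of every <table> in a <tbody>.
function addImplicitTbody(DOMDocument $doc)
{
    $xpath = new DOMXPath($doc);
    // Copy the matches into arrays first, because the tree is modified below.
    $tables = iterator_to_array($xpath->query('//table[tr]'));
    foreach ($tables as $table) {
        $rows = iterator_to_array($xpath->query('tr', $table));
        $tbody = $doc->createElement('tbody');
        // Put the new tbody where the first row currently sits.
        $table->insertBefore($tbody, $rows[0]);
        foreach ($rows as $row) {
            $tbody->appendChild($row); // appendChild() moves the existing node
        }
    }
}

// Sample markup (an assumption, shaped like the AEC result table).
$html = '<table><tr><th>State</th><th>Locality</th><th>Postcode</th><th>Electorate</th></tr>'
      . '<tr><td>VIC</td><td>SKYE</td><td>3977</td><td><a href="#">Dunkley</a></td></tr>'
      . '<tr><td>VIC</td><td>SKYE</td><td>3977</td><td><a href="#">Isaacs</a></td></tr></table>';

$doc = new DOMDocument();
$saved = libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_use_internal_errors($saved);

addImplicitTbody($doc);

// The browser-style expression from the question now has something to match.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//tbody/tr/td[4]/a') as $element) {
    echo trim($element->textContent), "\n";
}
```

With the sample markup this should list the two electorate links. In practice, changing the expression to //table/tr/td[4]/a as shown above is the simpler fix.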
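Answer 1 (score: 0)
There is no tbody in that HTML. Browsers insert tbody elements when they are needed, but we are not using a browser. We are using DOMDocument, which does not insert tbody elements.
Instead, the tr elements are direct children of the table: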
$elements = $xpath->query( '//table/tr/td[4]/a');
foreach ($elements as $element) {
echo $dom->saveHTML($element);
}
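Note that echo $element from the question would not work even with a matching expression, because a DOMElement cannot be converted to a string. Below is a small offline sketch on made-up sample markup (the row and the variable names are assumptions for illustration). It first compares the tbody and table expressions, then shows the usual ways to print a match: textContent for the link text, getAttribute() for the href, and saveHTML($element) for the element's markup.

```php
<?php
// Sample row (an assumption, modelled on the markup in the question).
$html = '<table><tr><td>VIC</td><td>CRANBOURNE</td><td>3977</td>'
      . '<td><a href="LocalitySearchResults.aspx?filter=Holt&amp;filterby=Electorate&amp;divid=216">Holt</a></td></tr></table>';

$dom = new DOMDocument();
$saved = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_use_internal_errors($saved);

$xpath = new DOMXPath($dom);
// DOMDocument adds no tbody while parsing, so this list should be empty.
$viaTbody = $xpath->query('//tbody/tr/td[4]/a');
// tr is a direct child of table in the parsed tree.
$viaTable = $xpath->query('//table/tr/td[4]/a');
printf("via tbody: %d, via table: %d\n", $viaTbody->length, $viaTable->length);

foreach ($viaTable as $element) {
    echo 'text: ', trim($element->textContent), "\n";      // link text only
    echo 'href: ', $element->getAttribute('href'), "\n";   // decoded attribute value
    echo 'html: ', $dom->saveHTML($element), "\n";         // markup of the element
}
```

Passing a node to saveHTML() requires PHP 5.3.6 or later. On older versions textContent is the safer choice.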