I have the following HTML fragment from a Wikipedia page, which I am trying to parse with the PHP code shown after it:
<table class="wikitable">
<tbody>
<tr>
<td>mod_access</td>
<td>Versions older than 2.1</td>
<td>Included by Default</td>
</tr>
<tr>
<td>mod_actions</td>
<td>Versions 1.1 and later</td>
<td>Included by Default</td>
</tr>
<tr>
<td>mod_alias</td>
<td>Versions 1.1 and later</td>
<td>Included by Default</td>
</tr>
</tbody>
</table>
What I want is a numerically indexed array with one entry per table row. This is my PHP code so far:
ini_set('display_errors','On');
$url="https://en.wikipedia.org/wiki/List_of_Apache_modules";
$dom=new DomDocument();
$dom->preserveWhiteSpace=false;
$dom->loadHtmlFile($url);
$xpath=new DomXpath($dom);
$elements=$xpath->query('//*[@id="mw-content-text"]/div/table/tbody/tr/td');
foreach($elements as $i=>$row){
    $tds=$xpath->query('td',$row);
    foreach($tds as $td){
        echo "Td($i):", $td->nodeValue,"\n";
    }
}
I'm not sure what to do next.
Answer 0 (score: 1)
If you remove tbody and td from the first XPath query, it will find all the tr elements:
$elements = $xpath->query('//*[@id="mw-content-text"]/div/table/tr');
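Whether a tbody element sits between table and tr depends on the HTML that DOMDocument actually loads, so a query that tolerates both cases may be more robust. A minimal sketch, assuming an inline fragment in place of the live page (the `$html` string and the `wikitable` class filter are illustrative choices, not part of the answer above):

```php
<?php
// Inline stand-in for the page: one table with tbody, one without.
$html = '<div id="mw-content-text"><div>'
      . '<table class="wikitable"><tbody><tr><td>mod_access</td></tr></tbody></table>'
      . '<table class="wikitable"><tr><td>mod_alias</td></tr></table>'
      . '</div></div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// The descendant axis (//tr) matches rows with or without an intervening tbody.
$rows = $xpath->query('//table[contains(@class, "wikitable")]//tr');

// The strict child path only matches rows that are direct children of table.
$direct = $xpath->query('//table/tr');

echo $rows->length, "\n";
echo $direct->length, "\n";
```

With the libxml-based DOMDocument this should print 2 and then 1, since the parser does not add a tbody that is missing from the source. One caveat of the descendant axis is that it would also pick up rows of nested tables.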
You can then loop over each row, use your existing code to find its td elements, and add them to an array:
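$data = array();
// $y is the row index, $x is the column index within that row
foreach ($elements as $y => $row) {
    // query relative to the current tr, so only its own cells are returned
    $tds = $xpath->query('td', $row);
    foreach ($tds as $x => $td) {
        $data[$y][$x] = $td->nodeValue;
    }
}
var_dump($data);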
Tested with PHP 5.6, which gave this output:
array(157) {
[1]=>
array(6) {
[0]=>
string(10) "mod_access"
[1]=>
string(23) "Versions older than 2.1"
[2]=>
string(19) "Included by Default"
[3]=>
string(26) "Apache Software Foundation"
[4]=>
string(27) "Apache License, Version 2.0"
[5]=>
string(71) "Provides access control based on the client and the client's request[2]"
}
[2]=>
array(6) {
[0]=>
string(11) "mod_actions"
[1]=>
string(22) "Versions 1.1 and later"
[2]=>
string(19) "Included by Default"
[3]=>
string(26) "Apache Software Foundation"
[4]=>
string(27) "Apache License, Version 2.0"
[5]=>
string(62) "Provides CGI ability based on request method and media type[3]"
}
// etc ...
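The dump starts at key 1 rather than 0. The likely reason, inferred from the output rather than stated above, is that the first tr is a header row made of th cells, so the inner loop never assigns anything to that row's index. The sketch below reproduces this offline on a small inline table and renumbers the result from 0; `rowsToArray` and the sample `$html` are hypothetical names and data introduced only for illustration.

```php
<?php
// Hypothetical helper: collect the td text of every table row into a 2-D array.
function rowsToArray($html)
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $data = array();
    foreach ($xpath->query('//table//tr') as $y => $row) {
        // A header row has th cells only, so this inner query is empty for it.
        foreach ($xpath->query('td', $row) as $x => $td) {
            $data[$y][$x] = trim($td->nodeValue);
        }
    }
    return $data;
}

// Sample input: one header row followed by two data rows.
$html = '<table class="wikitable">'
      . '<tr><th>Name</th><th>Compatibility</th></tr>'
      . '<tr><td>mod_access</td><td>Versions older than 2.1</td></tr>'
      . '<tr><td>mod_actions</td><td>Versions 1.1 and later</td></tr>'
      . '</table>';

$data = rowsToArray($html);   // keys 1 and 2; key 0 (the header row) is absent
$rows = array_values($data);  // renumber from 0

print_r($rows);
```

If the original row positions matter, for example to map an entry back to its tr, keep `$data` as it is and skip the `array_values()` step.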