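I have some code that tries to pull the values from two separate tables on an external page and build an array entry for each row, keyed by column. Here is the HTML for the two tables.
Table 1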
<table class="report" cellspacing="0" >
<thead>
<tr>
<th>Team</th>
<th>Win %</th>
<th>Games</th>
<th>Wins</th>
</tr>
</thead>
<tbody>
<tr>
<td style="font-weight:bold;"> Division: A </td>
</tr>
<tr>
<td> Team 1 </td>
<td class='rightaligned'> 98.0 </td>
<td class='rightaligned'> 51 </td>
<td class='rightaligned'> 50 </td>
</tr>
<tr>
<td> Team 6 </td>
<td class='rightaligned'> 76.5 </td>
<td class='rightaligned'> 51 </td>
<td class='rightaligned'> 39 </td>
</tr>
<tr>
<td> Team 8 </td>
<td class='rightaligned'> 56.9 </td>
<td class='rightaligned'> 51 </td>
<td class='rightaligned'> 29 </td>
</tr>
<tr>
<td> Team 4 </td>
<td class='rightaligned'> 73.5 </td>
<td class='rightaligned'> 34 </td>
<td class='rightaligned'> 25 </td>
</tr>
<tr>
<td> Team 9 </td>
<td class='rightaligned'> 43.1 </td>
<td class='rightaligned'> 51 </td>
<td class='rightaligned'> 22 </td>
</tr>
<tr>
<td> Team 5 </td>
<td class='rightaligned'> 47.1 </td>
<td class='rightaligned'> 34 </td>
<td class='rightaligned'> 16 </td>
</tr>
<tr>
<td> Team 10 </td>
<td class='rightaligned'> 29.4 </td>
<td class='rightaligned'> 51 </td>
<td class='rightaligned'> 15 </td>
</tr>
<tr>
<td> Team 7 </td>
<td class='rightaligned'> 25.5 </td>
<td class='rightaligned'> 51 </td>
<td class='rightaligned'> 13 </td>
</tr>
<tr>
<td> Team 2 </td>
<td class='rightaligned'> 20.6 </td>
<td class='rightaligned'> 34 </td>
<td class='rightaligned'> 7 </td>
</tr>
<tr>
<td> Team 3 </td>
<td class='rightaligned'> 14.7 </td>
<td class='rightaligned'> 34 </td>
<td class='rightaligned'> 5 </td>
</tr>
</tbody>
</table>
Table 2
<table class="report" cellspacing="0" >
<thead>
<tr>
<th>Team</th>
<th>Against</th>
<th>Date</th>
<th>Week</th>
<th>Games</th>
<th>Wins</th>
<th>Losses</th>
<th>Forfeits</th>
</tr>
</thead>
<tbody>
<tr>
<td> Team 1 </td>
<td> Team 7 </td>
<td class='rightaligned'> 09/19/2017 </td>
<td class='rightaligned'> 2 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 0 </td>
<td class='rightaligned'> 0 </td>
</tr>
<tr>
<td> Team 8 </td>
<td> Team 9 </td>
<td class='rightaligned'> 09/19/2017 </td>
<td class='rightaligned'> 2 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 14 </td>
<td class='rightaligned'> 3 </td>
<td class='rightaligned'> 0 </td>
</tr>
<tr>
<td> Team 6 </td>
<td> Team 10 </td>
<td class='rightaligned'> 09/19/2017 </td>
<td class='rightaligned'> 2 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 14 </td>
<td class='rightaligned'> 3 </td>
<td class='rightaligned'> 0 </td>
</tr>
<tr>
<td> Team 5 </td>
<td> Team 4 </td>
<td class='rightaligned'> 09/12/2017 </td>
<td class='rightaligned'> 1 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 9 </td>
<td class='rightaligned'> 8 </td>
<td class='rightaligned'> 0 </td>
</tr>
<tr>
<td> Team 4 </td>
<td> Team 5 </td>
<td class='rightaligned'> 09/12/2017 </td>
<td class='rightaligned'> 1 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 8 </td>
<td class='rightaligned'> 9 </td>
<td class='rightaligned'> 0 </td>
</tr>
<tr>
<td> Team 2 </td>
<td> Team 7 </td>
<td class='rightaligned'> 09/12/2017 </td>
<td class='rightaligned'> 1 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 4 </td>
<td class='rightaligned'> 13 </td>
<td class='rightaligned'> 0 </td>
</tr>
<tr>
<td> Team 9 </td>
<td> Team 8 </td>
<td class='rightaligned'> 09/19/2017 </td>
<td class='rightaligned'> 2 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 3 </td>
<td class='rightaligned'> 14 </td>
<td class='rightaligned'> 0 </td>
</tr>
<tr>
<td> Team 10 </td>
<td> Team 6 </td>
<td class='rightaligned'> 09/19/2017 </td>
<td class='rightaligned'> 2 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 3 </td>
<td class='rightaligned'> 14 </td>
<td class='rightaligned'> 0 </td>
</tr>
<tr>
<td> Team 3 </td>
<td> Team 6 </td>
<td class='rightaligned'> 09/12/2017 </td>
<td class='rightaligned'> 1 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 2 </td>
<td class='rightaligned'> 15 </td>
<td class='rightaligned'> 0 </td>
</tr>
<tr>
<td> Team 7 </td>
<td> Team 1 </td>
<td class='rightaligned'> 09/19/2017 </td>
<td class='rightaligned'> 2 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 0 </td>
<td class='rightaligned'> 17 </td>
<td class='rightaligned'> 0 </td>
</tr>
</tbody>
</table>
Using the code below, I can pull the first table's values into an array.
<?php
$url = '***';
$options = array(
    CURLOPT_RETURNTRANSFER => true, // return web page
    CURLOPT_HEADER => false, // don't return headers
    CURLOPT_FOLLOWLOCATION => true, // follow redirects
    CURLOPT_ENCODING => "", // handle all encodings
    CURLOPT_USERAGENT => "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0", // something like Firefox
    CURLOPT_AUTOREFERER => true, // set referer on redirect
    CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
    CURLOPT_TIMEOUT => 120, // timeout on response
    CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
);
$curl = curl_init($url);
curl_setopt_array( $curl, $options );
$content = curl_exec($curl);
curl_close($curl);
$dom = new DOMDocument();
@$dom->loadHTML($content);
$xpath = new DOMXPath($dom);
$tables = $dom->getElementsByTagName('tbody');
$rows = $tables->item(0)->getElementsByTagName('tr');
foreach ($rows as $row)
{
    $cols = $row->getElementsByTagName('td');
    $date = $cols->item(0)->nodeValue;
    $element[$i]['team'] = trim($date);
    $percentage = $cols->item(1)->nodeValue;
    $element[$i]['percentage'] = trim($percentage);
    $wins = $cols->item(2)->nodeValue;
    $element[$i]['wins'] = trim($wins);
    $games = $cols->item(3)->nodeValue;
    $element[$i]['games'] = trim($games);
    $i++;
}
echo '<pre>';
print_r($element);
echo '</pre>';
?>
The output ends up looking like this:
Array
(
[] => Array
(
[team] => Division: A
[percentage] =>
[wins] =>
[games] =>
)
[1] => Array
(
[team] => Team 1
[percentage] => 98.0
[wins] => 51
[games] => 50
)
[2] => Array
(
[team] => Team 6
[percentage] => 76.5
[wins] => 51
[games] => 39
)
[3] => Array
(
[team] => Team 8
[percentage] => 56.9
[wins] => 51
[games] => 29
)
[4] => Array
(
[team] => Team 4
[percentage] => 73.5
[wins] => 34
[games] => 25
)
[5] => Array
(
[team] => Team 9
[percentage] => 43.1
[wins] => 51
[games] => 22
)
[6] => Array
(
[team] => Team 5
[percentage] => 47.1
[wins] => 34
[games] => 16
)
[7] => Array
(
[team] => Team 10
[percentage] => 29.4
[wins] => 51
[games] => 15
)
[8] => Array
(
[team] => Team 7
[percentage] => 25.5
[wins] => 51
[games] => 13
)
[9] => Array
(
[team] => Team 2
[percentage] => 20.6
[wins] => 34
[games] => 7
)
[10] => Array
(
[team] => Team 3
[percentage] => 14.7
[wins] => 34
[games] => 5
)
)
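The output is fine as far as it goes, but the second table is missing entirely. How can I get the second table's data as well?
Thanks for any input.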
Answer 0 (score: 0)
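Start by checking what $tables actually holds. It is a DOMNodeList with one entry per <tbody> element, so print its length:
print_r($tables->length);
Once you know the structure, loop over every table instead of reading only item(0): for each table, loop over its rows, and for each row, loop over its columns.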
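The nested loop described above could look roughly like the sketch below. It is a minimal example under a few assumptions: both tables carry class="report" as in the question's HTML, and the page source is available as a string (here a shortened inline sample stands in for the $content returned by cURL). The helper name extractTables is hypothetical and not part of the original post. Keys are taken from the <th> texts, so the Games and Wins columns keep the labels the page gives them (the question's code stores column 2, labelled Games, under 'wins' and column 3 under 'games', and never initializes $i). Rows such as "Division: A", whose cell count does not match the header, are skipped.

```php
<?php
// Hypothetical helper (not from the original post): returns one array per
// table, each holding one associative array per data row.
function extractTables(string $html): array
{
    // Collect libxml warnings internally instead of silencing them with @.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $result = [];

    // Visit every matching table, not only the first one.
    foreach ($xpath->query('//table[@class="report"]') as $table) {
        // Use the header cell texts as array keys.
        $headers = [];
        foreach ($xpath->query('./thead/tr/th', $table) as $th) {
            $headers[] = trim($th->nodeValue);
        }

        $rows = [];
        foreach ($xpath->query('./tbody/tr', $table) as $tr) {
            $cells = [];
            foreach ($xpath->query('./td', $tr) as $td) {
                $cells[] = trim($td->nodeValue);
            }
            // Skip rows whose cell count differs from the header
            // (for example the single-cell "Division: A" row).
            if (count($cells) !== count($headers)) {
                continue;
            }
            $rows[] = array_combine($headers, $cells);
        }
        $result[] = $rows;
    }

    return $result;
}

// Shortened sample of the two tables from the question (assumed input).
$html = <<<'HTML'
<table class="report" cellspacing="0">
<thead><tr><th>Team</th><th>Win %</th><th>Games</th><th>Wins</th></tr></thead>
<tbody>
<tr><td style="font-weight:bold;"> Division: A </td></tr>
<tr><td> Team 1 </td><td class='rightaligned'> 98.0 </td><td class='rightaligned'> 51 </td><td class='rightaligned'> 50 </td></tr>
</tbody>
</table>
<table class="report" cellspacing="0">
<thead><tr><th>Team</th><th>Against</th><th>Date</th><th>Week</th><th>Games</th><th>Wins</th><th>Losses</th><th>Forfeits</th></tr></thead>
<tbody>
<tr><td> Team 1 </td><td> Team 7 </td><td class='rightaligned'> 09/19/2017 </td><td class='rightaligned'> 2 </td><td class='rightaligned'> 17 </td><td class='rightaligned'> 17 </td><td class='rightaligned'> 0 </td><td class='rightaligned'> 0 </td></tr>
</tbody>
</table>
HTML;

$tables = extractTables($html);
print_r($tables);
```

With the real page, passing $content in place of the sample string should give $tables[0] for the standings table and $tables[1] for the results table, provided the markup matches what the question shows.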