How do I use DOM to extract the inner value of each td, for every tr in a table?
I have a table like this:
<table>
<tbody>
<tr class="rowData">
<td class="cellData">
<a href="#"><span> DATA 1 </span></a>
</td>
<td class="cellData">
<div class="div1"><div class="div2"> DATA 1 a </div></div>
</td>
<td class="cellData">
<div class="div1"><div class="div2"> DATA 1 b </div></div>
</td>
<td class="cellData">
<div class="div1"><div class="div2"> DATA 1 c </div></div>
</td>
</tr>
<tr class="rowData">
<td class="cellData">
<a href="#"><span> DATA 2 </span></a>
</td>
<td class="cellData">
<div class="div1"><div class="div2"> DATA 2 a </div></div>
</td>
<td class="cellData">
<div class="div1"><div class="div2"> DATA 2 b </div></div>
</td>
<td class="cellData">
<div class="div1"><div class="div2"> DATA 2 c </div></div>
</td>
</tr>
<tr class="rowData">
<td class="cellData">
<a href="#"><span> DATA 3 </span></a>
</td>
<td class="cellData">
<div class="div1"><div class="div2"> DATA 3 a </div></div>
</td>
<td class="cellData">
<div class="div1"><div class="div2"> DATA 3 b </div></div>
</td>
<td class="cellData">
<div class="div1"><div class="div2"> DATA 3 c </div></div>
</td>
</tr>
</tbody>
</table>
What I want to get, for each row, is:
<label> DATA n </label>
<input value="DATA n a">
<input value="DATA n b">
<input value="DATA n c">
I'm stuck with this code:
$html = file_get_contents($link);
$html2 = (preg_replace('/\s+/', ' ', $html));
$doc = new DOMDocument();
$doc->loadHTML($html2);
$xpath = new DOMXPath($doc);
$tables = $doc->getElementsByTagName('table');
foreach ($xpath->query('.//tbody/tr[@class="rowData"]') as $node) {
}
foreach ($xpath->query('.//tbody/tr/td/div/div[@class="div2"]') as $node) {
}
foreach ($xpath->query('.//tbody/tr/td/a/span') as $node) {
    echo $node->nodeValue;
}
Can anyone help me?
Answer 0: (score: 0)
Here is one possible solution. Actually there are two, but the commented-out one is too ugly. :)
$html = file_get_contents($link);
$html2 = preg_replace('/\s+/', ' ', $html); // collapse whitespace runs into single spaces
$doc = new DOMDocument();
$doc->loadHTML($html2);
$elements = $doc->getElementsByTagName('tr');
foreach ($elements as $node) {
    // Each data cell nests two divs, so indexes 1, 3, 5 are the inner div2
    // elements; 0, 2, 4 are the outer div1 wrappers and carry the same text.
    $divs = $node->getElementsByTagName('div');
    $inputs1 = $divs->item(1);
    $inputs2 = $divs->item(3);
    $inputs3 = $divs->item(5);
    echo '<label>' . $node->firstChild->nodeValue . '</label>';
    echo '<input value="' . $inputs1->nodeValue . '">';
    echo '<input value="' . $inputs2->nodeValue . '">';
    echo '<input value="' . $inputs3->nodeValue . '">';
    // Ugly as hell, but it works :)
    /*echo '<input value="' . $node->firstChild->nextSibling->nextSibling->nodeValue . '">';
    echo '<input value="' . $node->firstChild->nextSibling->nextSibling->nextSibling->nextSibling->nodeValue . '">';
    echo '<input value="' . $node->firstChild->nextSibling->nextSibling->nextSibling->nextSibling->nextSibling->nextSibling->nodeValue . '">';*/
    echo '<br>';
}
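If hard-coding the div indexes feels fragile, a variant that walks the td cells of each row might look like the sketch below. It is only an illustration: the inline $html stands in for the fetched page, renderRow is a hypothetical helper name, and the trim/htmlspecialchars calls are my addition rather than part of the code above.

```php
<?php
// Sample markup standing in for the fetched page
// (assumption: same shape as the table in the question).
$html = '<table><tbody>'
      . '<tr class="rowData">'
      . '<td class="cellData"><a href="#"><span> DATA 1 </span></a></td>'
      . '<td class="cellData"><div class="div1"><div class="div2"> DATA 1 a </div></div></td>'
      . '<td class="cellData"><div class="div1"><div class="div2"> DATA 1 b </div></div></td>'
      . '</tr>'
      . '</tbody></table>';

// Hypothetical helper: the first cell becomes the label,
// every following cell becomes one input.
function renderRow(DOMElement $row): string
{
    $out = '';
    $index = 0;
    foreach ($row->getElementsByTagName('td') as $cell) {
        // textContent concatenates all nested text, so the a/span and
        // div/div wrappers do not need to be addressed individually.
        $text = htmlspecialchars(trim($cell->textContent));
        if ($index === 0) {
            $out .= '<label>' . $text . '</label>';
        } else {
            $out .= '<input value="' . $text . '">';
        }
        $index++;
    }
    return $out;
}

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$lines = [];
foreach ($doc->getElementsByTagName('tr') as $row) {
    $lines[] = renderRow($row);
}
echo implode("<br>\n", $lines), "\n";
```

Because it counts cells instead of divs, this keeps working if a row has more or fewer data columns, at the cost of assuming there are no nested tables inside a cell.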
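Answer 1: (score: 0)
I guess this code is self-explanatory enough. XPath is used three times: to find all the table rows, to get the label, and to get all the input values.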
// $xpath is the DOMXPath instance created in the question's code.
foreach ($xpath->query('.//tbody/tr[@class="rowData"]') as $row) {
    echo '<label>' . $xpath->query('td[1]/a/span', $row)->item(0)->textContent . "</label>\n";
    foreach ($xpath->query('td[position() > 1]/div/div', $row) as $col) {
        echo '<input value="' . trim($col->textContent) . '" />' . "\n";
    }
}
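For anyone who wants to try the XPath approach without the rest of the page, here is a self-contained sketch. The inline sample markup, the extractRows function name and the returned array shape are assumptions made for illustration. It also swaps in normalize-space() via DOMXPath::evaluate to trim the cell text, which the snippet above does with trim().

```php
<?php
// Inline sample (assumption: mirrors the markup in the question).
$html = <<<HTML
<table><tbody>
<tr class="rowData">
  <td class="cellData"><a href="#"><span> DATA 1 </span></a></td>
  <td class="cellData"><div class="div1"><div class="div2"> DATA 1 a </div></div></td>
  <td class="cellData"><div class="div1"><div class="div2"> DATA 1 b </div></div></td>
</tr>
<tr class="rowData">
  <td class="cellData"><a href="#"><span> DATA 2 </span></a></td>
  <td class="cellData"><div class="div1"><div class="div2"> DATA 2 a </div></div></td>
  <td class="cellData"><div class="div1"><div class="div2"> DATA 2 b </div></div></td>
</tr>
</tbody></table>
HTML;

// Hypothetical helper: returns one ['label' => ..., 'values' => [...]] entry per row.
function extractRows(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//tr[@class="rowData"]') as $row) {
        $values = [];
        // Relative query: every cell after the first one in this row.
        foreach ($xpath->query('td[position() > 1]', $row) as $cell) {
            $values[] = $xpath->evaluate('normalize-space(.)', $cell);
        }
        $rows[] = [
            // String value of the first cell, with surrounding whitespace removed.
            'label'  => $xpath->evaluate('normalize-space(td[1])', $row),
            'values' => $values,
        ];
    }
    return $rows;
}

foreach (extractRows($html) as $row) {
    echo '<label>', htmlspecialchars($row['label']), "</label>\n";
    foreach ($row['values'] as $value) {
        echo '<input value="', htmlspecialchars($value), '">', "\n";
    }
}
```

Separating extraction from output like this is just one possible design; it makes the rows easy to inspect or reuse before any HTML is printed.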