Removing specific td cells with DOMXPath

Date: 2015-11-22 17:34:19

Tags: php dom xpath

I have this code, and I am using DOMXPath to remove specific td cells:

$html = file_get_contents('WebProxy.html');
$xml = new DOMDocument();
$xml->validateOnParse = true;
@$xml->loadHTML($html);

$xpath = new DOMXPath($xml);
$table = $xpath->query("//*[@id='proxylisttable']")->item(0);

// for printing the whole html table just type: print $xml->saveXML($table); 

$rows = $table->getElementsByTagName("tr");

foreach ($rows as $row) {
  $cells = $row->getElementsByTagName('td');
  foreach ($cells as $cell) {
        echo $cell->nodeValue . '<br>';
  }
}

I would like to remove td3, td6, td7, and td8 from rows like these:

WebProxy.html

td1: <td class=" ">116.226.187.242</td>
td2: <td class=" ">1080</td>
td3: <td class=" ">CN</td>
td4: <td class=" ">China</td>
td5: <td class=" ">Socks4</td>
td6: <td class=" ">Anonymous</td>
td7: <td class=" ">Yes</td>
td8: <td class=" ">1 minute ago</td>
</tr>
<tr class="even">
td1: <td class=" ">23.254.153.205</td>
td2: <td class=" ">60088</td>
td3: <td class=" ">US</td>
td4: <td class=" ">United States</td>
td5: <td class=" ">Socks5</td>
td6: <td class=" ">Anonymous</td>
td7: <td class=" ">Yes</td>
td8: <td class=" ">1 minute ago</td>
</tr>
<tr class="odd">
td1: <td class=" ">46.101.208.9</td>
td2: <td class=" ">1080</td>
td3: <td class=" ">DE</td>
td4: <td class=" ">Germany</td>
td5: <td class=" ">Socks4</td>
td6: <td class=" ">Anonymous</td>
td7: <td class=" ">Yes</td>
td8: <td class=" ">1 minute ago</td>
</tr>

so that the output looks like this:

116.226.187.242
1080
China
Socks4

...

How can I do this? Thanks for your help.

1 answer:

Answer 0 (score: 0)

Perhaps you could adjust your XPath query so that it excludes the cells you do not want.

For example:

<?php
$html = file_get_contents('WebProxy.html');
$xml = new DOMDocument();
$xml->validateOnParse = true;
@$xml->loadHTML($html);

$xpath = new DOMXPath($xml);
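// "//tr" also matches rows wrapped in a <tbody>; position() counts the td cells within each row.
$cells = $xpath->query("//*[@id='proxylisttable']//tr/td[position() != 3 and position() != 6 and position() != 7 and position() != 8]");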

foreach ($cells as $cell) {
    echo $cell->nodeValue . "<br>";
}
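
If the cells really need to be removed from the document, rather than just skipped when printing, the same kind of predicate can select the unwanted cells so they can be detached with removeChild. The following is only a minimal sketch under some assumptions: the inline table stands in for WebProxy.html (the real file may be laid out differently), and the names $unwanted and $rowsOut are purely illustrative.

```php
<?php
// Assumption: a small inline table standing in for WebProxy.html.
$html = <<<'HTML'
<table id="proxylisttable">
  <tr class="odd">
    <td>116.226.187.242</td><td>1080</td><td>CN</td><td>China</td>
    <td>Socks4</td><td>Anonymous</td><td>Yes</td><td>1 minute ago</td>
  </tr>
  <tr class="even">
    <td>23.254.153.205</td><td>60088</td><td>US</td><td>United States</td>
    <td>Socks5</td><td>Anonymous</td><td>Yes</td><td>1 minute ago</td>
  </tr>
</table>
HTML;

// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select the 3rd cell and every cell from the 6th onward in each row.
// "//tr" matches rows whether or not they sit inside a <tbody>.
$unwanted = $xpath->query(
    "//*[@id='proxylisttable']//tr/td[position() = 3 or position() >= 6]"
);

// Detach each selected cell from its row.
foreach ($unwanted as $cell) {
    $cell->parentNode->removeChild($cell);
}

// Gather the remaining cell values, one array per row.
$rowsOut = [];
foreach ($xpath->query("//*[@id='proxylisttable']//tr") as $row) {
    $values = [];
    foreach ($xpath->query('td', $row) as $cell) {
        $values[] = trim($cell->nodeValue);
    }
    $rowsOut[] = $values;
}

// Print one value per line, as in the question.
foreach ($rowsOut as $values) {
    echo implode("<br>\n", $values), "<br>\n";
}
```

After the loop, $doc->saveHTML() would return the table without the removed columns, and each entry in $rowsOut holds only the IP, port, country name, and protocol for that row.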