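I'm having some trouble scraping some data from an HTML table on a website. The tags I want to retrieve have no ID or class, so I'd be grateful if you could help me:
This is what the table looks like (the code is cut short so that this post doesn't take up too much space):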
<table class="table table-striped table-large1">
<thead>
<tr class="small">
<th>No</th>
<th>Date/Time</th>
<th colspan="7">Indexed pages /<br>
Processed / Skipped / Fetched /<br>
Change (Added / Removed)</th>
<th>Proc.time</th>
<th>Bandwidth</th>
<th>Broken links</th>
<th>Images</th>
<th>Videos</th>
<th>RSS</th>
<th>News</th>
</tr>
</thead>
<tbody><tr class="block1">
<td>1</td>
<td><a href="site/3845806/chlog/?log=8950501" title="View details">2018-06-20 01:13</a></td>
<td>944</td>
<td>969</td>
<td><i><strike>25</strike></i></td>
<td>920</td>
<td><i style="color:#900">↓-2</i></td>
<td><i>-</i></td>
<td><i>-2</i></td>
<td>0:12:44s</td>
<td>28.82M</td>
<td>3</td>
<td>580</td>
<td>4</td>
<td>8</td>
<td>0</td>
</tr>
<tr class="block1">
<td>2</td>
<td><a href="site/3845806/chlog/?log=8934464" title="View details">2018-06-17 01:14</a></td>
<td>946</td>
<td>968</td>
<td><i><strike>22</strike></i></td>
<td>919</td>
<td></td>
<td><i>+2</i></td>
<td><i>-2</i></td>
<td>0:14:05s</td>
<td>28.89M</td>
<td>0</td>
<td>580</td>
<td>4</td>
<td>8</td>
<td>0</td>
</tr>
(........)
What I want to grab are these two lines:
<td><a href="site/3845806/chlog/?log=8950501" title="View details">2018-06-20 01:13</a></td>
<td>944</td>
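These appear at the same position in every row (the cells right after the row number). How can I get all of these values?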
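Answer 0 (score: 1)
Loop over all the tr tags and use jQuery's find() method to locate the specific td elements, then set innerHTML = ""; on them (this empties those cells).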
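// Visit every row of the table; header rows hold only <th> cells and are skipped.
$(".table-large1 tr").each(function() {
  var cells = $(this).find("td");
  // Require at least three cells so that indices 1 and 2 exist.
  if (cells.length > 2) {
    cells[1].innerHTML = ""; // Date/Time cell
    cells[2].innerHTML = ""; // first "Indexed pages" cell
  }
});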
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<table class="table table-striped table-large1">
<thead>
<tr class="small">
<th>No</th>
<th>Date/Time</th>
<th colspan="7">Indexed pages /<br> Processed / Skipped / Fetched /<br> Change (Added / Removed)</th>
<th>Proc.time</th>
<th>Bandwidth</th>
<th>Broken links</th>
<th>Images</th>
<th>Videos</th>
<th>RSS</th>
<th>News</th>
</tr>
</thead>
<tbody>
<tr class="block1">
<td>1</td>
<td><a href="site/3845806/chlog/?log=8950501" title="View details">2018-06-20 01:13</a></td>
<td>944</td>
<td>969</td>
<td><i><strike>25</strike></i></td>
<td>920</td>
<td><i style="color:#900">↓-2</i></td>
<td><i>-</i></td>
<td><i>-2</i></td>
<td>0:12:44s</td>
<td>28.82M</td>
<td>3</td>
<td>580</td>
<td>4</td>
<td>8</td>
<td>0</td>
</tr>
<tr class="block1">
<td>2</td>
<td><a href="site/3845806/chlog/?log=8934464" title="View details">2018-06-17 01:14</a></td>
<td>946</td>
<td>968</td>
<td><i><strike>22</strike></i></td>
<td>919</td>
<td></td>
<td><i>+2</i></td>
<td><i>-2</i></td>
<td>0:14:05s</td>
<td>28.89M</td>
<td>0</td>
<td>580</td>
<td>4</td>
<td>8</td>
<td>0</td>
</tr>
</tbody>
</table>
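The snippet in this answer empties the two cells. If the goal is to read their values, as the question asks, the same row-by-row idea can be sketched with plain DOM calls. This is only a minimal sketch: the function name extractDateAndIndexed and the shape of the returned objects are hypothetical, and it assumes every data row keeps the date link in the second cell and the first number in the third cell, as in the sample markup.

```javascript
// Hypothetical helper (not part of the original answer): collects the
// Date/Time cell (index 1) and the first "Indexed pages" cell (index 2)
// from every body row. `root` is any object that exposes
// querySelectorAll, for example `document` in a browser.
function extractDateAndIndexed(root) {
  const results = [];
  // Restricting the selector to tbody skips the header row, which only has <th> cells.
  const rows = root.querySelectorAll("table.table-large1 tbody tr");
  for (const row of rows) {
    const cells = row.querySelectorAll("td");
    if (cells.length < 3) continue; // not enough cells, skip this row
    const link = cells[1].querySelector("a");
    results.push({
      dateTime: cells[1].textContent.trim(),
      href: link ? link.getAttribute("href") : null,
      indexedPages: Number(cells[2].textContent.trim()),
    });
  }
  return results;
}

// In a browser, run it against the live page.
if (typeof document !== "undefined") {
  console.log(extractDateAndIndexed(document));
}
```

For the two sample rows this should yield one object per row, with dateTime "2018-06-20 01:13" and indexedPages 944 for the first row. Reading textContent instead of assigning innerHTML leaves the table itself untouched.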