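I'm a beginner at programming and I'm learning to scrape. Is it possible to get data that sits inside an HTML comment, like in this markup?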
<tbody id="the-list">
<tr>
<td valign="top" align="right">1.</td>
<td valign="top">BEKASI</td>
<td valign="top">Tambun</td>
<td valign="top">Selatan</td>
<td valign="top">01.4.13.16.06.000013</td>
<td valign="top">Jalan</td>
<td valign="top">PERUM BEKASI GRIYA ASRI</td>
<td valign="top">1.500 m<sup>2</sup></td>
<td valign="top" align="center">Kantor</td>
<td valign="top">400 m<sup>2</sup></td>
<td valign="top" align="center">1998</td>
<td valign="top" align="center">> 200</td>
<!--
<td valign="top" align="center">-6.2245</td>
<td valign="top" align="center">107.0827</td>
-->
<td valign="top" align="right">3</td>
<td valign="top" align="right">7</td>
<td valign="top" align="right">2</td>
<td valign="top" align="right">150</td>
<td valign="top">08888123</td>
<td valign="top">-</td>
</tr>
I would like the result to look like this:
1.;BEKASI;Tambun;Selatan;01.4.13.16.06.000013;Jalan;PERUM BEKASI GRIYA ASRI;1.500 m;Kantor;400 m;1998;200;-6.2245;107.0827;3;7;2;150;08888123;-
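Answer 0 (score: 0)
goquery is an excellent library for parsing HTML content. The snippet below prints the text of every td under #the-list. Note that the two cells inside the HTML comment are not elements, so the selector does not match them.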
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/PuerkitoBio/goquery"
)

func main() {
	html := `
<table><tbody id="the-list">
<tr>
<td valign="top" align="right">1.</td>
<td valign="top">BEKASI</td>
<td valign="top">Tambun</td>
<td valign="top">Selatan</td>
<td valign="top">01.4.13.16.06.000013</td>
<td valign="top">Jalan</td>
<td valign="top">PERUM BEKASI GRIYA ASRI</td>
<td valign="top">1.500 m<sup>2</sup></td>
<td valign="top" align="center">Kantor</td>
<td valign="top">400 m<sup>2</sup></td>
<td valign="top" align="center">1998</td>
<td valign="top" align="center">> 200</td>
<!--
<td valign="top" align="center">-6.2245</td>
<td valign="top" align="center">107.0827</td>
-->
<td valign="top" align="right">3</td>
<td valign="top" align="right">7</td>
<td valign="top" align="right">2</td>
<td valign="top" align="right">150</td>
<td valign="top">08888123</td>
<td valign="top">-</td>
</tr>
</tbody></table>
`
	// Parse the HTML string into a goquery document.
	doc, err := goquery.NewDocumentFromReader(strings.NewReader(html))
	if err != nil {
		log.Fatal(err)
	}

	// Print the text of every <td> under #the-list, one per line.
	doc.Find("#the-list td").Each(func(_ int, cell *goquery.Selection) {
		fmt.Println(cell.Text())
	})
}
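The loop above only visits real td elements, so the two commented-out coordinate cells are skipped. One blunt workaround, sketched here rather than offered as the definitive approach, is to delete the comment delimiters from the markup before parsing, so the hidden cells become ordinary cells again. The helper names `uncomment` and `joinCells` are hypothetical (they are not part of goquery), the sample row is shortened, and the sketch uses only the standard library so it runs on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// uncomment is a hypothetical helper: it removes the HTML comment
// delimiters so that markup hidden inside <!-- ... --> is parsed
// as ordinary markup. It affects every comment in the input.
func uncomment(markup string) string {
	return strings.NewReplacer("<!--", "", "-->", "").Replace(markup)
}

// joinCells is a hypothetical helper: it trims surrounding whitespace
// from each cell and joins the cells with ";".
func joinCells(cells []string) string {
	trimmed := make([]string, 0, len(cells))
	for _, c := range cells {
		trimmed = append(trimmed, strings.TrimSpace(c))
	}
	return strings.Join(trimmed, ";")
}

func main() {
	// Shortened sample row, assumed to mirror the structure in the question.
	row := `<tr><td>1998</td><!--
<td>-6.2245</td>
<td>107.0827</td>
--><td>3</td></tr>`

	// The commented-out cells are now plain <td> elements.
	fmt.Println(uncomment(row))

	// prints "1.;BEKASI;Tambun"
	fmt.Println(joinCells([]string{"1.", " BEKASI ", "Tambun"}))
}
```

To combine this with goquery, pass `uncomment(html)` to `strings.NewReader` instead of the raw string, append each cell's `Text()` to a slice inside the loop, and print `joinCells` of that slice once at the end. Two caveats: this un-comments every comment on the page, which is acceptable for this table but may expose unrelated markup elsewhere, and `Text()` still includes the content of nested tags such as sup, so values like "1.500 m2" or "> 200" need extra cleanup to match the requested output exactly.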