I want to scrape this table and get all of its details. The HTML looks like this:
<table id="bnConnectionTemplate:r1:0:tl1" class="detailTable" cellpadding="0" cellspacing="0" border="0" summary="">
<tbody>
<tr>
<th>Name: </th>
<td>EVERBRITE CORPORATION LIMITED</td>
</tr>
<tr>
<th><abbr title="Australian Company Number">ACN: </abbr></th>
<td>104 436 704</td>
</tr>
<tr>
<th><abbr title="Australian Business Number">ABN: </abbr></th>
<td><a id="bnConnectionTemplate:r1:0:j_id__ctru57pc2" class="contentLink af_goLink" href="http://abr.business.gov.au/Search.aspx?SearchText=96%20104%20436%20704" target="_blank"><span title="">96 104 436 704</span><span class="hiddenHint"> (External Link)</span></a></td>
</tr>
<tr>
<th>Registration date: </th>
<td>15/04/2003</td>
</tr>
<tr>
<th>Next review date: </th>
<td>15/04/2013</td>
</tr>
<tr>
<th>Former name(s): </th>
<td>VISIONGLOW GLOBAL LIMITED</td>
</tr>
<tr>
<th></th>
<td></td>
</tr>
<tr>
<th>Status: </th>
<td>Deregistered</td>
</tr>
<tr>
<th>Date deregistered: </th>
<td>7/09/2012</td>
</tr>
<tr>
<th>Type: </th>
<td>Australian Public Company, Limited By Shares</td>
</tr>
<tr>
<th>Locality of registered office: </th>
<td></td>
</tr>
<tr>
<th>Regulator: </th>
<td>Australian Securities & Investments Commission</td>
</tr>
</tbody>
</table>
My problem is that I cannot get the table, even when I try to select it by its class or its ID.
import requests
from bs4 import BeautifulSoup

# Fetch the search results page and parse the returned markup.
url = "https://connectonline.asic.gov.au/RegistrySearch/faces/landing/panelSearch.jspx?searchText=104+436+704&searchType=OrgAndBusNm&_adf.ctrl-state=139sjjyk9g_15"
source = requests.get(url).text
soup = BeautifulSoup(source, 'lxml')
I tried:
table = soup.find('table', class_='detailTable')  # returns None
table = soup.find('table', id="bnConnectionTemplate:r1:0:tl1")  # returns None
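At this point I am confused about why this is happening. I have used these same calls for scraping in the past and they worked fine. Any help would be appreciated.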
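One way to narrow this down, as a rough sketch: run the same selector against the HTML pasted above instead of against `source`. If it matches there, the selector itself is fine and the markup returned by `requests` must differ from what the browser shows (for example if the page builds the table with JavaScript, which is only an assumption here and not something confirmed). The names `SAMPLE_HTML` and `extract_details` are made up for illustration, the sample is an abbreviated copy of the table, and the sketch uses the built-in `html.parser` instead of `lxml` only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Abbreviated copy of the table from the question (as seen in the browser).
SAMPLE_HTML = """
<table id="bnConnectionTemplate:r1:0:tl1" class="detailTable">
<tbody>
<tr><th>Name: </th><td>EVERBRITE CORPORATION LIMITED</td></tr>
<tr><th><abbr title="Australian Company Number">ACN: </abbr></th><td>104 436 704</td></tr>
<tr><th></th><td></td></tr>
<tr><th>Status: </th><td>Deregistered</td></tr>
</tbody>
</table>
"""


def extract_details(html):
    """Return {label: value} for the detail table, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="detailTable")
    if table is None:
        # Same symptom as in the question: the table is not in this markup.
        return None

    details = {}
    for row in table.find_all("tr"):
        th, td = row.find("th"), row.find("td")
        if th is None or td is None:
            continue
        # Drop surrounding whitespace and the trailing colon from the label.
        label = th.get_text(strip=True).rstrip(":")
        if label:  # skip the empty spacer row
            details[label] = td.get_text(" ", strip=True)
    return details


details = extract_details(SAMPLE_HTML)
print(details)

# Markup without the table gives None, like the calls in the question.
print(extract_details("<html><body>no table here</body></html>"))
```

Calling `extract_details(source)` on the fetched page and comparing it with the result above shows whether the table is present in the response at all; checking `"detailTable" in source` answers the same question more directly.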