I have a link that is the result of an HTML form submission:
https://www.taxpayerservicecenter.com/RP_Detail.jsp?ssl=4204%20%20%20%200084
Using Inspect in the browser, I think the table data lives in an element like this:
<form action="./RP_Results.jsp" id="SearchForm" method="post" name="SearchForm" onsubmit="return validateForm(document.SearchForm)">
When I use Beautiful Soup, I can't seem to access this td class. This is what I see:
from bs4 import BeautifulSoup
import requests
page = requests.get("https://www.taxpayerservicecenter.com/RP_Detail.jsp?ssl=4204%20%20%20%200084")
page
soup = BeautifulSoup(page.content,'lxml')
soup
Does anyone know how to get this table data? The code above is what I tried.
Answer 0 (score: 1)
You need to set the JSESSIONID cookie header in the GET request in order to "see" the table.
Modify your GET request as follows:
page = requests.get(url, headers={
'Cookie': 'JSESSIONID=11qfsCuAhlev3j943gEn8bf-CBfH8Ta_z858JNR9w__7PJOfxkWr!-965451614'
})
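Once the request carries a valid session cookie, the table cells can probably be pulled out of the response with Beautiful Soup. The following is only a rough sketch: it assumes the data sits in ordinary table/tr/td markup, which the question does not confirm. The helper name extract_rows and the sample HTML are made up for illustration, and the built-in "html.parser" is used here instead of 'lxml' just to avoid an extra dependency.

```python
from bs4 import BeautifulSoup


def extract_rows(html):
    """Return every non-empty table row as a list of stripped cell strings.

    Hypothetical helper: assumes plain <tr>/<td>/<th> markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # Collect header and data cells alike, trimming whitespace.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
        if cells:
            rows.append(cells)
    return rows


# Made-up sample markup; the real page layout may differ.
sample = (
    "<table>"
    "<tr><th>Owner</th><th>Lot</th></tr>"
    "<tr><td> A. Smith </td><td>0084</td></tr>"
    "</table>"
)
print(extract_rows(sample))
```

With the real page you would pass page.content (or page.text) from the cookie-carrying request instead of sample.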
Note: you can get the JSESSIONID from the Network tab of the Chrome / Firefox developer tools by clicking the first request.
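A session ID copied by hand will expire sooner or later, so it may be more convenient to let requests.Session keep the cookie for you. This is a sketch under assumptions, not a confirmed recipe: the answer does not say which page issues JSESSIONID, so prime_url is a placeholder you would have to find yourself (for example the search form page), and make_session and fetch_detail are hypothetical helper names.

```python
import requests

DETAIL_URL = (
    "https://www.taxpayerservicecenter.com/RP_Detail.jsp"
    "?ssl=4204%20%20%20%200084"
)


def make_session(jsessionid=None):
    """Create a Session, optionally pre-loaded with a known JSESSIONID."""
    session = requests.Session()
    if jsessionid:
        session.cookies.set("JSESSIONID", jsessionid)
    return session


def fetch_detail(session, prime_url, detail_url=DETAIL_URL):
    """Visit prime_url first so the server can set its cookie, then fetch the page.

    Assumption: prime_url is whichever page issues JSESSIONID; the original
    answer does not name it.
    """
    session.get(prime_url, timeout=30)  # any Set-Cookie lands in session.cookies
    response = session.get(detail_url, timeout=30)
    response.raise_for_status()
    return response.text
```

If the priming request does not produce a working cookie, make_session("...") with a value copied from the developer tools behaves like the header-based request in the answer.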