import requests
from bs4 import BeautifulSoup
headers ={
"Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Encoding":"gzip, deflate",
"Accept-Language":"en-US,en;q=0.5",
"Connection":"keep-alive",
"Host":"mcfbd.com",
"Referer":"https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx",
"User-Agent":"Mozilla/5.0(Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"}
a = requests.session()
soup = BeautifulSoup(a.get("https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx").content)
payload = {"ctl00$ContentPlaceHolder1$txtSearchHouse":"",
"ctl00$ContentPlaceHolder1$txtSearchSector":"",
"ctl00$ContentPlaceHolder1$txtPropertyID":"",
"ctl00$ContentPlaceHolder1$txtownername":"",
"ctl00$ContentPlaceHolder1$ddlZone":"1",
"ctl00$ContentPlaceHolder1$ddlSector":"2",
"ctl00$ContentPlaceHolder1$ddlBlock":"2",
"ctl00$ContentPlaceHolder1$btnFind":"Search",
"__VIEWSTATE":soup.find('input',{'id':'__VIEWSTATE'})["value"],
"__VIEWSTATEGENERATOR":"14039419",
"__EVENTVALIDATION":soup.find("input",{"name":"__EVENTVALIDATION"})["value"],
"__SCROLLPOSITIONX":"0",
"__SCROLLPOSITIONY":"0"}
b = a.post("https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx",headers = headers,data = payload).text
print(b)
The above is my code for this website:
https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx
I inspected the request in Firebug, and these are the form data values. But doing this:
b = requests.post("https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx",headers = headers,data = payload).text
print(b)
throws this error:
[ArgumentException]: Invalid postback or callback argument
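Is my understanding of how to submit a form with requests correct?
1. Open Firebug
2. Submit the form
3. Go to the Net tab
4. In the Net tab, select the Post tab
5. Copy the form data, as in the code above
I have always wanted to know how to do this. I could use Selenium, but I thought I would try something new and use requests.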
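Answer 0: (score: 2)
The error you are getting is expected, because fields such as __VIEWSTATE (and others) are not static and cannot be hard-coded. The correct approach is as follows: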
Create a requests session object. It is also advisable to update it with headers that include a User-Agent string -
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36"}
s = requests.session()
s.headers.update(headers)
Navigate to the target URL -
r = s.get(url)
Parse the returned HTML with BeautifulSoup4 -
from bs4 import BeautifulSoup
soup = BeautifulSoup(r.content, 'html5lib')
Populate formdata with the hard-coded values and the dynamic values -
formdata = {
'__VIEWSTATE': soup.find('input', attrs={'name': '__VIEWSTATE'})['value'],
'field1': 'value1'
}
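If the page carries several dynamic fields, it may be easier to collect every hidden input in one pass instead of looking each one up by name. The following is only a sketch: it uses the standard library's HTMLParser rather than BeautifulSoup so that it runs without extra packages, and the names HiddenFieldParser, hidden_fields and SAMPLE are made up for illustration. SAMPLE is a hypothetical fragment, not the real page.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from every <input type="hidden"> tag."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Keep only named hidden inputs; a missing value becomes "".
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Hypothetical fragment shaped like an ASP.NET WebForms response.
SAMPLE = """
<form method="post">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="ctl00$ContentPlaceHolder1$txtPropertyID" value="" />
</form>
"""

print(hidden_fields(SAMPLE))
```

The resulting dict can be used as the starting point for formdata, with your own values added on top, so nothing like __VIEWSTATEGENERATOR has to be copied by hand from the browser.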
Then send the POST request using the session object itself -
s.post(url, data=formdata)
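The steps above could be wrapped in one function along these lines. This is a rough sketch, not a tested scraper for this site: post_with_fresh_state and its extract_hidden parameter are hypothetical names, and extract_hidden stands for any function you supply that turns the page HTML into a dict of hidden fields (for example one built on BeautifulSoup as shown earlier). The session is assumed to behave like a requests.Session.

```python
def post_with_fresh_state(session, url, user_fields, extract_hidden):
    """GET the form, then POST it back on the same session.

    session        -- object with .get(url) and .post(url, data=...),
                      e.g. a requests.Session (assumption)
    user_fields    -- dict of the values you choose (dropdowns, button)
    extract_hidden -- hypothetical callable: page HTML -> dict of hidden
                      fields such as __VIEWSTATE and __EVENTVALIDATION
    """
    page = session.get(url)
    # Dynamic values come from the response that was just fetched.
    formdata = dict(extract_hidden(page.text))
    # Your own values are applied last, so they win on a name clash.
    formdata.update(user_fields)
    return session.post(url, data=formdata)


# Possible usage (needs the requests package and network access):
#   import requests
#   s = requests.Session()
#   r = post_with_fresh_state(s, url, {"field1": "value1"}, my_extractor)
#   print(r.text)
```

The point of the design is that the GET and the POST share one session and the hidden values are read from that same GET response, so they are never stale.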