Working with a .NET website (Python)

Time: 2015-09-17 19:44:56

Tags: python python-requests

import requests
from bs4 import BeautifulSoup

headers ={
"Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Encoding":"gzip, deflate",
"Accept-Language":"en-US,en;q=0.5",
"Connection":"keep-alive",
"Host":"mcfbd.com",
"Referer":"https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx",
"User-Agent":"Mozilla/5.0(Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"}

a = requests.session()
soup = BeautifulSoup(a.get("https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx").content)

payload = {"ctl00$ContentPlaceHolder1$txtSearchHouse":"",
"ctl00$ContentPlaceHolder1$txtSearchSector":"",
"ctl00$ContentPlaceHolder1$txtPropertyID":"",
"ctl00$ContentPlaceHolder1$txtownername":"",
"ctl00$ContentPlaceHolder1$ddlZone":"1",
"ctl00$ContentPlaceHolder1$ddlSector":"2",
"ctl00$ContentPlaceHolder1$ddlBlock":"2",
"ctl00$ContentPlaceHolder1$btnFind":"Search",
"__VIEWSTATE":soup.find('input',{'id':'__VIEWSTATE'})["value"],
"__VIEWSTATEGENERATOR":"14039419",
"__EVENTVALIDATION":soup.find("input",{"name":"__EVENTVALIDATION"})["value"],
"__SCROLLPOSITIONX":"0",
"__SCROLLPOSITIONY":"0"}

b = a.post("https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx",headers = headers,data = payload).text
print(b)

The above is my code for this website:

https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx

I inspected the request in Firebug, and these are the form data values it showed. But doing this:

b = requests.post("https://mcfbd.com/mcf/FrmView_PropertyTaxStatus.aspx",headers = headers,data = payload).text
print(b)

throws this error:

[ArgumentException]: Invalid postback or callback argument

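Is my understanding of how to submit a form with requests correct?

1. Open Firebug

2. Submit the form

3. Go to the Net tab

4. In the Net tab, select the Post tab

5. Copy the form data, as in the code above

I have always wanted to know how to do this. I could use Selenium, but I thought I would try something new and use requests.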

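1 Answer:

Answer 0: (score: 2)

The error you are getting is expected, because fields such as __VIEWSTATE (and others) are not static and must not be hardcoded. The server generates them anew for each page load, so they have to be read from the page you just fetched. The correct approach is as follows: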

Create a requests session object. It is also recommended to update it with headers that include a User-Agent string -

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36"}
s = requests.session()
s.headers.update(headers)

Navigate to the specified URL -

r = s.get(url)

Parse the returned HTML with BeautifulSoup4 -

from bs4 import BeautifulSoup
soup = BeautifulSoup(r.content, 'html5lib')

Populate formdata with both the hardcoded values and the dynamic values read from the page -

formdata = {
   '__VIEWSTATE': soup.find('input', attrs={'name': '__VIEWSTATE'})['value'],
   'field1': 'value1'
}
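
The formdata above reads only __VIEWSTATE, but the page in the question also carries __EVENTVALIDATION and __VIEWSTATEGENERATOR, and the latter was hardcoded in the question's payload. A minimal sketch, assuming every hidden input should simply be echoed back unchanged; the helper name hidden_fields and the sample HTML are hypothetical and only illustrate the idea:

```python
from bs4 import BeautifulSoup


def hidden_fields(html):
    """Return {name: value} for every <input type="hidden"> in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    fields = {}
    for tag in soup.find_all("input", attrs={"type": "hidden"}):
        name = tag.get("name")
        if name:  # inputs without a name are never submitted by a browser
            fields[name] = tag.get("value", "")
    return fields


# Hypothetical page fragment, not taken from the real site.
sample = """
<form>
  <input type="hidden" name="__VIEWSTATE" value="abc" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz" />
  <input type="text" name="ctl00$ContentPlaceHolder1$txtPropertyID" />
</form>
"""

# Start from the dynamic hidden fields, then add the hardcoded ones.
formdata = hidden_fields(sample)
formdata["ctl00$ContentPlaceHolder1$btnFind"] = "Search"
```

Collecting the fields in a loop avoids having to know in advance which hidden inputs a given page uses.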

Then send the POST request using the session object itself, so that any cookies set by the GET are sent along -

s.post(url, data=formdata)
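
Put together, the steps above might look like the following sketch. The function name post_webform is hypothetical, and session is assumed to be a requests.Session (or anything with compatible get and post methods):

```python
from bs4 import BeautifulSoup


def post_webform(session, url, fields):
    """GET the page, copy its hidden inputs, add `fields`, and POST back."""
    page = session.get(url)
    soup = BeautifulSoup(page.content, "html.parser")
    # Dynamic values such as __VIEWSTATE come from the page just fetched.
    formdata = {
        tag["name"]: tag.get("value", "")
        for tag in soup.find_all("input", attrs={"type": "hidden"})
        if tag.has_attr("name")
    }
    # Values supplied by the caller (search boxes, dropdowns, the button).
    formdata.update(fields)
    return session.post(url, data=formdata)
```

Calling it with a requests.Session(), the URL from the question and the ctl00$ContentPlaceHolder1$... fields would replace the hand-copied payload. Whether the server accepts the result may still depend on the submitted dropdown values being options the fetched page actually offered, since ASP.NET event validation can reject values it did not render.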