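I am using Python 3.3 and the Requests library to make basic POST requests.
I want to simulate what happens when you manually enter information into this web page in a browser: https://capp.arlingtonva.us/tap/AC_xwTapPay.aspx. For example, choose "2. Parking Ticket", click Next, enter 1234 as the plate number and Virginia as the state, click Next, then tick the checkbox and click Next.
Although the URL stays the same, the process goes through several rounds of entering information and clicking Next.
Currently, I do a GET on the URL to obtain the randomly generated strings in the page source, such as the values of "__EVENTVALIDATION" and "__VIEWSTATE". I then POST with those values plus some other fields.
Am I using the correct POST payload in my code?
My code is: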
import requests
url = r'https://capp.arlingtonva.us/tap/AC_xwTapPay.aspx'
#GET request
s = requests.Session()
r = s.get(url)
text1 = r.text
#getting "__EVENTVALIDATION" value:
eventvalstartstring = r'id="__EVENTVALIDATION" value="'
eventvalstart = text1.find(eventvalstartstring)+len(eventvalstartstring)
end_ind = text1.find('"',eventvalstart)
eventvalidation_string = text1[eventvalstart:end_ind]
#getting "__VIEWSTATE" value:
viewstate_start_string= 'id="__VIEWSTATE" value="'
viewstate_start = text1.find(viewstate_start_string)+len(viewstate_start_string)
end_ind2 = text1.find('"',viewstate_start)
viewstate_string = text1[viewstate_start:end_ind2]
#POST request
payload = {"AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:BillType":"PKT",
"__EVENTTARGET":"",
"__EVENTARGUMENT":"",
"__LASTFOCUS":"",
"__VIEWSTATE":viewstate_string,
"__VIEWSTATEGENERATOR":"C0C9F6BC",
"__VIEWSTATEENCRYPTED":"",
"__EVENTVALIDATION":eventvalidation_string,
"AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:TagState":'VA',
"AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:TagNumber":'1234',
"AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:Next1":"Next >",
"AC_xwTapCtl:scrollTop":'0',
"AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:Next2":"Next >",
"AC_xwTapCtl:xwTap_txtFocus":"AC_xwTapCtl_AC_xwTapCtlCtl.xuWrqCtl_Next1",
"AC_xwTapCtl_scrollTop":'0',
"AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:Next3":"Next >",
"AC_xwTapCtl:xwTap_txtFocus":"AC_xwTapCtl_AC_xwTapCtlCtl.xuWrqCtl_Next2",
"AC_xwTapCtl_scrollTop":"0"}
post = s.post(url, data=payload)
text = post.text
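Thanks, -K.
Answer 0 (score: 2)
At this stage I would probably switch to BeautifulSoup (pip install beautifulsoup4) to parse the HTML, which makes it much easier to get at all the data. Because it is .NET (I think), the whole page is a single form, so we can just grab all of the inputs.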
import requests
from bs4 import BeautifulSoup
url = 'https://capp.arlingtonva.us/tap/AC_xwTapPay.aspx'
s = requests.Session()
r = s.get(url)
# name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(r.text, 'html.parser')
# grab out all the fields (skip inputs without a name, which would raise KeyError)
payload = {i['name']: i.get('value') for i in soup.find_all('input', attrs={'name': True})}
# populate the select field
payload['AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:BillType'] = 'PKT'
# and submit the next step
r = s.post(url, data=payload)
# then parse / build next request etc
soup = BeautifulSoup(r.text, 'html.parser')
payload = {i['name']: i.get('value') for i in soup.find_all('input', attrs={'name': True})}
payload['AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:TagState'] = 'VA'
payload['AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:TagNumber'] = 'blah'
r = s.post(url, data=payload)
# rinse and repeat as many times as required...
soup = BeautifulSoup(r.text, 'html.parser')
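The "rinse and repeat" part can be wrapped in a small loop. Below is a rough sketch of one way to do that, not a tested solution for this particular site. It uses the standard library's `html.parser` instead of BeautifulSoup purely to stay dependency-free, and the names `FormFieldParser`, `build_payload` and `run_steps` are made up for illustration. It also makes one change on purpose: a browser only sends the submit button that was actually clicked, so the sketch posts a single button per step rather than every `<input>` on the page. Whether this server cares about that is an assumption I have not verified.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect named form fields from an HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fields = {}     # ordinary fields: name -> value
        self.buttons = {}    # submit-style buttons: name -> value
        self._select = None  # name of the <select> currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "input" and name:
            kind = (attrs.get("type") or "text").lower()
            value = attrs.get("value") or ""
            if kind in ("submit", "image", "button", "reset"):
                # Buttons are kept apart; only the clicked one gets posted.
                self.buttons[name] = value
            elif kind in ("checkbox", "radio"):
                # Unchecked boxes are not submitted by a browser.
                if "checked" in attrs:
                    self.fields[name] = attrs.get("value") or "on"
            else:
                self.fields[name] = value
        elif tag == "select" and name:
            self._select = name
            self.fields.setdefault(name, "")
        elif tag == "option" and self._select and "selected" in attrs:
            # Simplification: options without a value attribute are ignored.
            self.fields[self._select] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def build_payload(html, overrides, submit_name):
    """Return the POST data for one step of the form."""
    parser = FormFieldParser()
    parser.feed(html)
    if submit_name not in parser.buttons:
        raise KeyError("no submit button named %r on this page" % submit_name)
    payload = dict(parser.fields)  # hidden fields such as __VIEWSTATE
    payload.update(overrides)      # values the user would have typed/selected
    payload[submit_name] = parser.buttons[submit_name]
    return payload


def run_steps(session, url, steps):
    """GET the page, then POST once per (overrides, submit_name) step.

    `session` is anything with requests.Session-like get() and post().
    """
    response = session.get(url)
    for overrides, submit_name in steps:
        payload = build_payload(response.text, overrides, submit_name)
        response = session.post(url, data=payload)
    return response
```

With Requests this would be called as `run_steps(requests.Session(), url, steps)`, where each step is a pair such as `({'AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:BillType': 'PKT'}, 'AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:Next1')`. Those field and button names come from the payload in the question and should be checked against the HTML each step actually returns. Because every step re-reads the hidden fields from the previous response, `__VIEWSTATE` and `__EVENTVALIDATION` stay fresh without being copied by hand.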