Python: GET and POST requests to a website that uses validation tokens

Date: 2014-12-16 21:34:03

Tags: python post get token python-requests

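I am using Python 3.3 and the Requests library to perform a basic POST request.

I want to simulate what happens when you enter information by hand into the web page at https://capp.arlingtonva.us/tap/AC_xwTapPay.aspx in a browser. For example, select "2. Parking Ticket" and click Next, enter 1234 as the plate number and Virginia as the state and click Next, then tick the checkbox and click Next.

The URL stays the same throughout, but there are several rounds of entering information and clicking Next.

At the moment I do a GET on the URL to pick up the randomly generated strings in the page source, such as the values of "__EVENTVALIDATION" and "__VIEWSTATE". I then do a POST with those values plus some other information.

Am I using the correct POST payload in my code?

My code is: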

import requests
url = r'https://capp.arlingtonva.us/tap/AC_xwTapPay.aspx'

#GET request
s = requests.Session()
r = s.get(url)
text1 = r.text

#getting "__EVENTVALIDATION" value:
eventvalstartstring = r'id="__EVENTVALIDATION" value="'
eventvalstart = text1.find(eventvalstartstring)+len(eventvalstartstring)
end_ind = text1.find('"',eventvalstart)
eventvalidation_string = text1[eventvalstart:end_ind]

#getting "__VIEWSTATE" value:
viewstate_start_string= 'id="__VIEWSTATE" value="'
viewstate_start = text1.find(viewstate_start_string)+len(viewstate_start_string)
end_ind2 = text1.find('"',viewstate_start)
viewstate_string = text1[viewstate_start:end_ind2]

#POST request
payload = {"AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:BillType":"PKT",
           "__EVENTTARGET":"",
           "__EVENTARGUMENT":"",
           "__LASTFOCUS":"",
           "__VIEWSTATE":viewstate_string,
           "__VIEWSTATEGENERATOR":"C0C9F6BC",
           "__VIEWSTATEENCRYPTED":"",
           "__EVENTVALIDATION":eventvalidation_string,
           "AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:TagState":'VA',
           "AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:TagNumber":'1234',
           "AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:Next1":"Next >",
           "AC_xwTapCtl:scrollTop":'0',
           "AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:Next2":"Next >",
           "AC_xwTapCtl:xwTap_txtFocus":"AC_xwTapCtl_AC_xwTapCtlCtl.xuWrqCtl_Next1",
           "AC_xwTapCtl_scrollTop":'0',
           "AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:Next3":"Next >",
           "AC_xwTapCtl:xwTap_txtFocus":"AC_xwTapCtl_AC_xwTapCtlCtl.xuWrqCtl_Next2",
           "AC_xwTapCtl_scrollTop":"0"}

post = s.post(url, data=payload)
text = post.text

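Thanks, -K.

1 answer:

Answer 0 (score: 2)

At this stage I would probably switch to BeautifulSoup (pip install beautifulsoup4) to parse the HTML, which makes it much easier to get at all the data. Because it is .NET (I think), the whole page is a single form, so we can grab all of the inputs.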

import requests
from bs4 import BeautifulSoup

url = 'https://capp.arlingtonva.us/tap/AC_xwTapPay.aspx'
s = requests.Session()

r = s.get(url)
# name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(r.text, 'html.parser')

# grab out all the named input fields (inputs without a name are skipped)
payload = {i['name']: i.get('value', '') for i in soup.find_all('input') if i.get('name')}
# populate the select field
payload['AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:BillType'] = 'PKT'

# and submit the next step
r = s.post(url, data=payload)

# then parse / build next request etc
soup = BeautifulSoup(r.text, 'html.parser')
payload = {i['name']: i.get('value', '') for i in soup.find_all('input') if i.get('name')}
payload['AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:TagState'] = 'VA'
payload['AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:TagNumber'] = 'blah'
r = s.post(url, data=payload)

# rinse and repeat as many times as required...
soup = BeautifulSoup(r.text, 'html.parser')
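
One thing the dict comprehension above does not reproduce is how a browser builds the form data. A browser sends only the submit button that was actually clicked and leaves out unchecked checkboxes, whereas grabbing every input sends them all. The following is a minimal sketch of a helper that applies those two rules. It uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra packages. The names FormFieldParser, form_fields, clicked and overrides, as well as the sample HTML, are made up for illustration and have not been tested against the real site. Like the code above, it does not read select elements, so a value such as BillType still has to be supplied by hand.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect name/value pairs from <input> tags, roughly as a browser would."""

    BUTTON_TYPES = ("submit", "image", "button", "reset")

    def __init__(self, clicked=None):
        super().__init__()
        self.clicked = clicked  # name of the one button treated as clicked
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if not name:
            return  # inputs without a name are never submitted
        kind = (attr.get("type") or "text").lower()
        if kind in self.BUTTON_TYPES and name != self.clicked:
            return  # only the clicked button is sent
        if kind in ("checkbox", "radio") and "checked" not in attr:
            return  # unchecked boxes are not sent
        value = attr.get("value")
        if value is None:
            # browsers send "on" for a checked box that has no value attribute
            value = "on" if kind in ("checkbox", "radio") else ""
        self.fields[name] = value


def form_fields(html, clicked=None, overrides=None):
    """Return the form data for one step; overrides holds the values you type in."""
    parser = FormFieldParser(clicked)
    parser.feed(html)
    parser.close()
    fields = dict(parser.fields)
    fields.update(overrides or {})
    return fields


# Hypothetical page fragment, only for demonstration
sample_html = """
<form>
  <input type="hidden" name="__VIEWSTATE" value="abc" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz" />
  <input type="text" name="TagNumber" value="">
  <input type="checkbox" name="Agree">
  <input type="submit" name="Next1" value="Next >">
  <input type="submit" name="Next2" value="Next >">
</form>
"""

fields = form_fields(sample_html, clicked="Next1", overrides={"TagNumber": "1234"})
```

In the flow above, something like payload = form_fields(r.text, clicked='AC_xwTapCtl:AC_xwTapCtlCtl.xuWrqCtl:Next1', overrides={...}) could stand in for the dict comprehension at each step, with the button name taken from the payload in the question. Whether the server actually cares which button is posted is an assumption worth checking against what the browser sends.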