Python sequential Requests to an .aspx page

Time: 2015-09-09 10:52:21

Tags: python asp.net post python-requests

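I am trying, and failing, to render a page after a Requests POST to an .aspx page. I have already built an alternative solution with Selenium webdriver, but I would like to understand why the POST made with Requests fails. I issue a GET to collect the page parameters (VIEWSTATE and so on), then select the "Export type" checkbox. The returned HTML shows that the page reloads with the new "Data to Export" checkboxes, but when I select one of those options and POST it, the default base URL page is rendered instead. Any help in explaining why the second request fails would be greatly appreciated. The purpose of the request sequence is to download the "Planned unavailability of generation" XML between predefined dates.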

import requests, csv, time, json, codecs
from datetime import datetime
from datetime import timedelta, date
from io import BytesIO,TextIOWrapper
import pandas as pd
import xml.etree.ElementTree as etree
import matplotlib.pyplot as plt
from bs4 import BeautifulSoup
import subprocess

def print_full(x):
    pd.set_option('display.max_rows', len(x))
    print(x)
    pd.reset_option('display.max_rows')


url ='http://energieinfo.tennet.org/dataexport/exporteerdatacountry.aspx'

headers={

'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Encoding':'gzip, deflate',
'Accept-Language':'en-GB,en;q=0.5',
'Content-Type':'application/x-www-form-urlencoded',
'Host':'energieinfo.tennet.org',
'Origin':'http://energieinfo.tennet.org',
'Proxy-Connection':'keep-alive',
'Referer':'http://energieinfo.tennet.org/dataexport/exporteerdatacountry.aspx',
'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0'}

payload = {}
s = requests.Session()
r = s.get(url=url)

headers['set-cookie'] = r.headers['set-cookie']

print (headers)
#headers['Content-Length'] = r.headers['Content-Length']

soup = BeautifulSoup(r.text)
viewstate_tag = soup.find('input', attrs={"type" : "hidden", "name":"__VIEWSTATE"})
viewstategen_tag = soup.find('input', attrs={"type" : "hidden", "name":"__VIEWSTATEGENERATOR"})
eventvalidation_tag = soup.find('input', attrs={"type" : "hidden", "name":"__EVENTVALIDATION"})

payload[viewstate_tag['name']] = viewstate_tag['value']
payload[viewstategen_tag['name']] = viewstategen_tag['value']
payload[eventvalidation_tag['name']] = eventvalidation_tag['value']
payload['__EVENTTARGET'] =  'ctl00$MainContentPlaceHolder$ExportData$rblSelection$3'
payload['__EVENTARGUMENT'] = ''
payload['__LASTFOCUS'] = ''
payload['ctl00$MainContentPlaceHolder$ExportData$rblSelection']= '3'
payload['ctl00$MainContentPlaceHolder$ExportData$tbDateFrom']=''
payload['ctl00$MainContentPlaceHolder$ExportData$tbDateUntil']=''

data = json.dumps(payload).encode()
#First POST request to load 'Data to Export' checkboxes - this bit works
r = s.post(url=url,data=payload,headers=headers)

#headers['Content-Length'] = r.headers['Content-Length']

with open("requests_results.html", "w") as f:
        f.write(r.text)

payload = {}
soup = BeautifulSoup(r.text)
viewstate_tag = soup.find('input', attrs={"type" : "hidden", "name":"__VIEWSTATE"})
viewstategen_tag = soup.find('input', attrs={"type" : "hidden", "name":"__VIEWSTATEGENERATOR"})
eventvalidation_tag = soup.find('input', attrs={"type" : "hidden", "name":"__EVENTVALIDATION"})

payload[viewstate_tag['name']] = viewstate_tag['value']
payload[viewstategen_tag['name']] = viewstategen_tag['value']
payload[eventvalidation_tag['name']] = eventvalidation_tag['value']
payload['__EVENTTARGET'] ='ctl00$MainContentPlaceHolder$ExportData$cb_VNBProd'
payload['__EVENTARGUMENT'] = ''
payload['__LASTFOCUS'] = ''
payload['ctl00$MainContentPlaceHolder$ExportData$rblSelection']= '3'
payload['ctl00$MainContentPlaceHolder$ExportData$cb_VNBProd']='on'
# payload['ctl00$MainContentPlaceHolder$ExportData$tbDateFrom']='2012/01/01'
# payload['ctl00$MainContentPlaceHolder$ExportData$tbDateUntil']='2018/01/01'
# payload['ctl00$MainContentPlaceHolder$ExportData$btnSubmitDate']='Commit'

# for item , value in payload.items():
#     print(item,value)

data = json.dumps(payload).encode()
#Second POST request to load 'Planned unavailability of generation' checkboxes - this bit only returns the base url page

r = s.post(url=url,data=data,headers=headers)

with open("requests_results2.html", "w") as f:
        f.write(r.text)

1 answer:

Answer 0 (score: 0)

Your use of cookies looks suspicious.

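When you use requests.session, you do not need to copy cookies from the initial response into the following requests; they are sent automatically. That is one of the features of a session.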
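The automatic cookie handling can be checked without touching the real site. The sketch below is only an illustration: it starts a throwaway local server (`CookieEchoHandler`, a hypothetical stand-in, not the TenneT page) that sets a made-up `ASP.NET_SessionId` value on the first visit and echoes back whatever `Cookie` header it receives, then makes two GETs through one `requests.Session`.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class CookieEchoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: sets a cookie once, then echoes the Cookie header."""

    def do_GET(self):
        received = self.headers.get("Cookie", "")
        body = received.encode()
        self.send_response(200)
        if not received:
            # First visit: hand out a (made-up) session cookie.
            self.send_header("Set-Cookie", "ASP.NET_SessionId=demo123; path=/")
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo quiet.
        pass


def session_cookie_roundtrip():
    """Return the echoed Cookie header seen by the server on two consecutive GETs."""
    server = HTTPServer(("127.0.0.1", 0), CookieEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port
    try:
        s = requests.Session()
        s.trust_env = False  # ignore any proxy settings from the environment
        first = s.get(url)   # server answers with Set-Cookie
        second = s.get(url)  # the session sends the cookie back on its own
        return first.text, second.text
    finally:
        server.shutdown()
        server.server_close()
```

No header is copied by hand anywhere in this sketch; the second request carries the cookie only because both calls go through the same session object.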

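In any case, the cookie is not being copied correctly. Subsequent requests should present back to the server the cookie that it supplied in the initial response, but your code tries to set a cookie by sending a set-cookie header. It should be a cookie header instead, e.g.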

Cookie: ASP.NET_SessionId=cbxwegjj2jq1mwgr2ybdo0jj
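To make the distinction concrete, here is a small offline sketch that compares the headers Requests would put on the wire in the two cases. The host `example.com` and the value `demo123` are placeholders chosen for illustration, and the cookie is placed in the jar by hand only to mimic what a session does after receiving a `Set-Cookie` response.

```python
import requests


def compare_request_headers():
    """Return (headers with a hand-copied set-cookie, headers built from the cookie jar)."""
    s = requests.Session()

    # What the question's code does: forward the response's Set-Cookie value
    # as a request header. It goes out under the name "set-cookie", which
    # is a response header and not the request header described above.
    wrong = s.prepare_request(
        requests.Request(
            "GET",
            "http://example.com/",
            headers={"set-cookie": "ASP.NET_SessionId=demo123; path=/"},
        )
    )

    # What a session does: the cookie lives in the jar, and Requests derives
    # the Cookie request header from it when preparing the request.
    s.cookies.set("ASP.NET_SessionId", "demo123", domain="example.com", path="/")
    right = s.prepare_request(requests.Request("GET", "http://example.com/"))

    return dict(wrong.headers), dict(right.headers)
```

Nothing is sent over the network here; `prepare_request` only builds the request, so the resulting header dictionaries can be inspected directly.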

However, as mentioned above, requests.session will handle this for you.
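Putting that advice together, a postback step without any manual cookie header might look like the sketch below. The helper names `build_postback_payload` and `post_back` are hypothetical, introduced here for illustration only. The sketch also passes the payload dict to `data=` so that Requests form-encodes it, as the question's first (working) POST does, rather than the JSON-encoded bytes used in its second POST. Whether either change is enough to make this particular server return the export page is not confirmed here.

```python
import requests
from bs4 import BeautifulSoup

# Hidden fields an ASP.NET WebForms page expects to be echoed back on postback.
STATE_FIELDS = ("__VIEWSTATE", "__VIEWSTATEGENERATOR", "__EVENTVALIDATION")


def build_postback_payload(html, event_target, extra_fields=None):
    """Collect the hidden state fields found in `html` and add the postback target."""
    soup = BeautifulSoup(html, "html.parser")
    payload = {}
    for name in STATE_FIELDS:
        tag = soup.find("input", attrs={"type": "hidden", "name": name})
        if tag is not None:
            payload[name] = tag.get("value", "")
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = ""
    payload["__LASTFOCUS"] = ""
    payload.update(extra_fields or {})
    return payload


def post_back(session, url, html, event_target, extra_fields=None):
    """Hypothetical helper: send one form-encoded postback through the session."""
    payload = build_postback_payload(html, event_target, extra_fields)
    # Passing the dict lets Requests form-encode the body and set the matching
    # Content-Type. The session's cookie jar supplies the Cookie header, so
    # none is added by hand.
    return session.post(url, data=payload)
```

With these helpers, the sequence from the question would be `s = requests.Session()`, `r = s.get(url)`, then one `post_back(s, url, r.text, ...)` call per step, each time feeding in the HTML of the previous response so that the freshest `__VIEWSTATE` and `__EVENTVALIDATION` values are sent back.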