所以我一直在尝试过去6个小时才能完成这项工作,但是我无法进行无休止的搜索并没有帮助,所以我想我要么做一些非常根本的错误,要么只是一件小事碰巧与我的逻辑相符的错误所以我需要额外的眼睛来帮助我修复它
网站网址为this
我写了一段凌乱的python代码来登录并阅读下一页,但是我得到的是一个令人讨厌的500错误,说服务器上的某些内容处理我的请求时出错了。
这是一个浏览器提出的请求,工作得很好,没问题
此请求的HTTP响应代码为302(重定向)
POST /appstatus/index.aspx HTTP/1.1
Host: www.wes.org
Connection: close
Content-Length: 303
Cache-Control: max-age=0
Origin: https://www.wes.org
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Referer: https://www.wes.org/appstatus/index.aspx
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.8,fa;q=0.6
Cookie: ASP.NET_SessionId=bu2gemmlh3hvp4f5lqqngrbp; _ga=GA1.2.1842963052.1473348318; _gat=1
__VIEWSTATE=%2FwEPDwUKLTg3MTMwMDc1NA9kFgICAQ9kFgICAQ8PFgIeBFRleHRkZGRk9rP20Uj9SdsjOKNUBlbw55Q01zI%3D&__VIEWSTATEGENERATOR=189D346C&__EVENTVALIDATION=%2FwEWBQK6lf6LBAKf%2B9bUAgK9%2B7qcDgK8w4S2BALowqJjoU1f0Cg%2FEAGU6r2IjpIPG8BO%2BiE%3D&txtUID=Email%40Removed.com&txtPWD=PASSWORDREMOVED&Submit=Log+In&Hidden1=
这是我的脚本发出的请求。
POST /appstatus/index.aspx HTTP/1.1
Host: www.wes.org
Connection: close
Accept-Encoding: gzip, deflate, br
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Upgrade-Insecure-Requests: 1
Content-Type: application/x-www-form-urlencoded
Origin: https://www.wes.org
Accept-Language: en-US,en;q=0.8,fa;q=0.6
Cache-Control: max-age=0
Referer: https://www.wes.org/appstatus/indexca.aspx
Cookie: ASP.NET_SessionId=nxotmb55jjwf5x4511rwiy45
Content-Length: 303
txtPWD=PASSWORDREMOVED&Submit=Log+In&__EVENTVALIDATION=%2FwEWBQK6lf6LBAKf%2B9bUAgK9%2B7qcDgK8w4S2BALowqJjoU1f0Cg%2FEAGU6r2IjpIPG8BO%2BiE%3D&txtUID=Email%40Removed.com&__VIEWSTATE=%2FwEPDwUKLTg3MTMwMDc1NA9kFgICAQ9kFgICAQ8PFgIeBFRleHRkZGRk9rP20Uj9SdsjOKNUBlbw55Q01zI%3D&Hidden1=&__VIEWSTATEGENERATOR=189D346C
这是提出请求的脚本,如果它太乱了,我很抱歉,只需要快速的东西。
import requests
import bs4
import urllib.parse
def main():
session = requests.Session()
headers = {"Origin": "https://www.wes.org",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"Cache-Control": "max-age=0", "Upgrade-Insecure-Requests": "1", "Connection": "close",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36",
"Referer": "https://www.wes.org/appstatus/indexca.aspx", "Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.8,fa;q=0.6", "Content-Type": "application/x-www-form-urlencoded"}
r = session.get('https://www.wes.org/appstatus/index.aspx',headers=headers)
cookies = r.cookies
soup = bs4.BeautifulSoup(r.content, "html5lib")
viewState=urllib.parse.quote(str(soup.select('#__VIEWSTATE')[0]).split('value="')[1].split('"/>')[0])
viewStateGenerator=urllib.parse.quote(str(soup.select('#__VIEWSTATEGENERATOR')[0]).split('value="')[1].split('"/>')[0])
eventValidation=urllib.parse.quote(str(soup.select('#__EVENTVALIDATION')[0]).split('value="')[1].split('"/>')[0])
paramsPost = {}
paramsPost.update({'__VIEWSTATE':viewState})
paramsPost.update({'__VIEWSTATEGENERATOR':viewStateGenerator})
paramsPost.update({'__EVENTVALIDATION':eventValidation})
paramsPost.update({"txtUID": "My@Email.Removed"})
paramsPost.update({"txtPWD": "My_So_Called_Password"})
paramsPost.update({"Submit": "Log In"})
paramsPost.update({"Hidden1": ""})
response = session.post("https://www.wes.org/appstatus/index.aspx", data=paramsPost, headers=headers,
cookies=cookies)
print("Status code:", response.status_code) #Outputs 500.
#print("Response body:", response.content)
if __name__ == '__main__':
main()
任何帮助都会非常感激。
答案 0 :(得分:0)
你正在做太多的工作并且这样做没有传递有效数据,你直接提取值属性,即.select_one('#__VIEWSTATEGENERATOR')["value"]
,并且对于所有其余的相同,在初始化之后将在Session对象中设置cookie。得到这样的逻辑归结为:
with requests.Session() as session:
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"}
r = session.get('https://www.wes.org/appstatus/index.aspx', headers=headers)
soup = bs4.BeautifulSoup(r.content, "html5lib")
viewState = soup.select_one('#__VIEWSTATE')["value"]
viewStateGenerator = soup.select_one('#__VIEWSTATEGENERATOR')["value"]
eventValidation = soup.select_one('#__EVENTVALIDATION')["value"]
paramsPost = {'__VIEWSTATE': viewState,'__VIEWSTATEGENERATOR': viewStateGenerator,
'__EVENTVALIDATION': eventValidation,"txtUID": "My@Email.Removed",
"txtPWD": "My_So_Called_Password",
"Submit": "Log In","Hidden1": ""}
response = session.post("https://www.wes.org/appstatus/index.aspx", data=paramsPost, headers=headers)
print("Status code:", response.status_code)
按照惯例,Python使用CamelCase作为类名,使用下划线使用下划线来分隔多个单词,您可能需要考虑将其应用于您的代码。