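Logging in to a website using requests

Asked: 2017-01-26 21:11:54

Tags: python python-3.x cookies python-requests

I am currently trying to get data from http://www.spotrac.com/ that requires being logged in. My current attempt uses this code (which I pieced together from a bunch of other Stack Overflow questions on similar topics):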

from bs4 import BeautifulSoup as bs
from requests import session

payload = {
    'id': 'contactForm',
    'cmd': 'http://www.spotrac.com/signin/submit/',
    'email': '*****',
    'password': '*****'
}

with session() as c:
    r_login = c.post('http://www.spotrac.com/signin/', data=payload)

    print(r_login.headers)
    response = c.get('http://www.spotrac.com/nba/cleveland-cavaliers/lebron-james')
    print(response.cookies)
    soup=bs(response.text, 'html.parser')
    with open('ex.html','w') as f:
        f.write(soup.prettify())

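My code does everything correctly, except that I am not logged in when the requests are made.

Thanks

1 Answer:

Answer 0 (score: 1)

You are sending the POST request to the wrong URL, and the payload is incorrect as well. Here is the request the browser sends when the sign-in form is submitted, followed by the server's response: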

POST http://www.spotrac.com/signin/submit/ HTTP/1.1
Host: www.spotrac.com
Connection: keep-alive
Content-Length: 86
Cache-Control: max-age=0
Origin: http://www.spotrac.com
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Referer: http://www.spotrac.com/signin/
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8
Cookie: cisession=a%3A5%3A%7Bs%3A10%3A%22session_id%22%3Bs%3A32%3A%2206021e191bdbbaf955f111f67b961056%22%3Bs%3A10%3A%22ip_address%22%3Bs%3A11%3A%22119.9.105.6%22%3Bs%3A10%3A%22user_agent%22%3Bs%3A108%3A%22Mozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F55.0.2883.87+Safari%2F537.36%22%3Bs%3A13%3A%22last_activity%22%3Bi%3A1485487245%3Bs%3A9%3A%22user_data%22%3Bs%3A0%3A%22%22%3B%7Dd6089620b21ecce6837161605055ae04; _ga=GA1.2.910256341.1481865346; _gali=contactForm

redirect=http%3A%2F%2Fwww.spotrac.com%2F&email=sdfs%40gmail.com&password=lkasjdflksjad

HTTP/1.1 302 Found
Server: nginx
Date: Fri, 27 Jan 2017 04:21:16 GMT
Content-Type: text/html
Content-Length: 0
Connection: keep-alive
Set-Cookie: cisession=a%3A5%3A%7Bs%3A10%3A%22session_id%22%3Bs%3A32%3A%22badb1275aee1cdad6736a6b4bb1ce809%22%3Bs%3A10%3A%22ip_address%22%3Bs%3A11%3A%22119.9.105.6%22%3Bs%3A10%3A%22user_agent%22%3Bs%3A108%3A%22Mozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F55.0.2883.87+Safari%2F537.36%22%3Bs%3A13%3A%22last_activity%22%3Bi%3A1485490876%3Bs%3A9%3A%22user_data%22%3Bs%3A0%3A%22%22%3B%7Dad486866c32cac526487707cea85b8a9; expires=Fri, 10-Feb-2017 04:21:16 GMT; path=/
Location: http://www.spotrac.com/register/
X-Powered-By: PleskLin
MS-Author-Via: DAV

As the captured session above shows, the correct URL is http://www.spotrac.com/signin/submit/ and the payload string is redirect=http%3A%2F%2Fwww.spotrac.com%2F&email=sdfs%40gmail.com&password=lkasjdflksjad, which corresponds to:

# mail_address and password hold your own account credentials
payload = {'redirect': 'http://www.spotrac.com/',
           'email': mail_address,
           'password': password}
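To check that this dict really matches the captured body, here is a small sketch that form-encodes it with the standard library's `urlencode`, which is used here as a stand-in for the encoding requests applies to a dict passed via `data=`. The email and password are the dummy values from the capture above, not real credentials.

```python
from urllib.parse import urlencode

# Same fields, in the same order, as the captured request body.
payload = {
    'redirect': 'http://www.spotrac.com/',
    'email': 'sdfs@gmail.com',        # dummy value from the capture
    'password': 'lkasjdflksjad',      # dummy value from the capture
}

# Form-encode the dict the way an HTML form submission would.
encoded = urlencode(payload)

print(encoded)
print(len(encoded))  # compare with the Content-Length header in the capture
```

The encoded string should be identical to the body shown in the capture, and its length should agree with the `Content-Length: 86` request header.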

Also, make sure to send headers that mimic the ones in the captured request, and you should be fine.
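Putting the pieces together, a minimal sketch might look like the following. The function names `build_login_request` and `fetch_logged_in` are made up for illustration. Which headers the site actually checks is an assumption (the sketch copies `User-Agent`, `Referer` and `Origin` from the capture), and so is the initial GET of the sign-in page to pick up the `cisession` cookie. The site may also have changed since the capture was taken.

```python
LOGIN_PAGE = 'http://www.spotrac.com/signin/'
LOGIN_URL = 'http://www.spotrac.com/signin/submit/'


def build_login_request(email, password):
    """Return (url, form fields, headers) mirroring the captured request."""
    payload = {
        'redirect': 'http://www.spotrac.com/',
        'email': email,
        'password': password,
    }
    # Assumption: these are the headers worth copying from the capture.
    # requests sets Content-Type and Content-Length itself for data=dict.
    headers = {
        'User-Agent': ('Mozilla/5.0 (Windows NT 6.1; WOW64) '
                       'AppleWebKit/537.36 (KHTML, like Gecko) '
                       'Chrome/55.0.2883.87 Safari/537.36'),
        'Referer': LOGIN_PAGE,
        'Origin': 'http://www.spotrac.com',
    }
    return LOGIN_URL, payload, headers


def fetch_logged_in(email, password, page_url):
    """Log in, then fetch page_url within the same session."""
    # Third-party package; imported here so the helper above has no dependencies.
    import requests

    url, payload, headers = build_login_request(email, password)
    with requests.Session() as s:
        # Assumption: visiting the form first obtains the cisession cookie.
        s.get(LOGIN_PAGE, headers=headers)
        # The session stores any cookies set by the login response.
        s.post(url, data=payload, headers=headers)
        return s.get(page_url, headers=headers)
```

Calling `fetch_logged_in(mail_address, password, 'http://www.spotrac.com/nba/cleveland-cavaliers/lebron-james')` returns a response whose `.text` can be passed to BeautifulSoup exactly as in the question.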