Question

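I'm having trouble logging in to a website with Python. I just want to log in to the site and then get the raw HTML of a page that you can only see while logged in, so that I can parse it with BeautifulSoup. I tried the answer from "How to use Python to login to a webpage and retrieve cookies for later usage?", but it doesn't seem to work.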

I looked at the required POST data using LiveHeaders and I think I'm setting it correctly, but my code only returns the login page.

Does anyone know what I'm doing wrong?

    import http.cookiejar
    import urllib.request
    import urllib.parse

    username = 'username'
    password = 'password'
    _type = 'g'
    vcode = ''

    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    login_data = urllib.parse.urlencode({'username' : username, 'password' : password, 'type' : _type, 'vcode': vcode})
    login_data = login_data.encode('ascii')
    opener.open('http://passthepopcorn.me/login.php', login_data)
    resp = opener.open('http://passthepopcorn.me/requests.php')
    print(resp.read())

Answer 1

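This may not answer your question, but anyway: I'd suggest using the "requests" module (you'll have to install it with pip install requests) instead of urllib. You would almost certainly end up with code like this: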

    import requests

    username = 'username'
    password = 'password'
    _type = 'g'
    vcode = ''

    # POST the login form fields; requests form-encodes the dict.
    login_response = requests.post('http://passthepopcorn.me/login.php',
                                   data={'username': username,
                                         'password': password,
                                         'type': _type, 'vcode': vcode})

    # Pass the session cookie from the login response to the next request.
    gold = requests.get('http://passthepopcorn.me/requests.php',
                        cookies={'PHPSESSID': login_response.cookies['PHPSESSID']})

    print(gold.text)

This may not work either, but it is almost certainly very close to working, and it is easy to understand.
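If you would rather stay with urllib as in the question, it can help to check where the second request actually ends up, since "my code only returns the login page" often means the site quietly redirected you back to it. Below is a minimal sketch of that idea, not a tested fix for this particular site. It assumes the site uses cookie-based sessions and sends unauthenticated visitors to the login URL. The helper names `make_opener`, `login_and_fetch` and `bounced_to_login` are made up for illustration and are not part of urllib.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener():
    """Return (opener, jar): an opener that keeps cookies, plus its cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login_and_fetch(opener, login_url, page_url, form_fields):
    """POST the login form, then GET page_url with the same opener.

    Returns (final_url, html). final_url is where the GET ended up after
    any redirects, which makes a silent bounce to the login page visible.
    """
    body = urllib.parse.urlencode(form_fields).encode("ascii")
    with opener.open(login_url, body) as login_resp:
        login_resp.read()  # drain the body; cookies are stored in the jar

    with opener.open(page_url) as page_resp:
        raw = page_resp.read()
        # Fall back to UTF-8 if the server does not declare a charset (assumption).
        charset = page_resp.headers.get_content_charset() or "utf-8"
        return page_resp.geturl(), raw.decode(charset, errors="replace")


def bounced_to_login(final_url, login_url):
    """True if the final URL has the same path as the login URL."""
    final_path = urllib.parse.urlsplit(final_url).path
    login_path = urllib.parse.urlsplit(login_url).path
    return final_path == login_path
```

You would call `make_opener()` once, pass the opener to `login_and_fetch()` with the same URLs and form fields as in the question, and then look at `bounced_to_login(final_url, login_url)` and at the cookies in the jar (for example `[c.name for c in jar]`). If the jar is empty after the POST, the login itself was probably rejected, and comparing the form fields against what LiveHeaders shows is the next thing to check.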

Trouble logging in to a website with Python 3.3 and urllib
