Logging in to a website with urllib instead of http.client

Date: 2017-07-11 01:15:53

Tags: python python-3.x http post urllib

I am trying to log in to a website with urllib in Python, using the following code:

import urllib.parse
import urllib.request
headers = {"Content-type": "application/x-www-form-urlencoded"}
payload = urllib.parse.urlencode({"username": "USERNAME-HERE",
                                  "password": "PASSWORD-HERE",
                                  "redirect": "index.php",
                                  "sid": "",
                                  "login": "Login"}).encode("utf-8")
request = urllib.request.Request("https://osu.ppy.sh/forum/ucp.php?mode=login", payload, headers)
response = urllib.request.urlopen(request)
data = response.read()

# print the HTML returned by the request (assumes the page is UTF-8)
print(data.decode("utf-8", errors="replace"))

I know a common suggestion is to use the Requests library. I have tried it, but I specifically want to know how to do this without Requests.

The behavior I am looking for can be reproduced with the following code, which successfully logs in to the site using http.client:

import urllib.parse
import http.client
headers = {"Content-type": "application/x-www-form-urlencoded"}
payload = urllib.parse.urlencode({"username": "USERNAME-HERE",
                                  "password": "PASSWORD-HERE",
                                  "redirect": "index.php",
                                  "sid": "",
                                  "login": "Login"})
conn = http.client.HTTPSConnection("osu.ppy.sh")
conn.request("POST", "/forum/ucp.php?mode=login", payload, headers)
response = conn.getresponse()
data = response.read()

# print the HTML returned by the request (assumes the page is UTF-8)
print(data.decode("utf-8", errors="replace"))

It seems to me that the urllib code is not "delivering" the payload, while the http.client code is.

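I do seem to be able to "deliver" the payload, because a wrong username and password draw an error response from the server, but supplying the correct username and password appears to have no effect.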

Any insights? Am I overlooking something?

1 Answer:

Answer 0: (score: 4)

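Add a cookie jar, and take out the headers, since they are not needed with urllib: when a request carries data and no Content-Type is set, urllib sends application/x-www-form-urlencoded by default. The likely cause of the difference is that urllib follows the redirect issued after a successful login, and without a cookie jar the session cookie set by the login response is not sent with the follow-up request, so the page that comes back looks logged out. http.client does not follow redirects, so it returns the login response itself.
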
import http.cookiejar
import urllib.parse
import urllib.request

jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

payload = urllib.parse.urlencode({"username": "USERNAME-HERE",
                                  "password": "PASSWORD-HERE",
                                  "redirect": "index.php",
                                  "sid": "",
                                  "login": "Login"}).encode("utf-8")
response = opener.open("https://osu.ppy.sh/forum/ucp.php?mode=login", payload)
data = response.read()

# print the HTML returned by the request (assumes the page is UTF-8)
print(data.decode("utf-8", errors="replace"))
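
To see why the cookie jar matters, here is a minimal sketch that runs entirely against a local test server rather than the real site. The FakeLoginHandler class, the /login and /index paths, and the sid=abc123 cookie are all hypothetical stand-ins, and the real forum may well behave differently. The server answers a POST with a session cookie plus a redirect, which is the pattern assumed above.

```python
import http.cookiejar
import http.server
import threading
import urllib.parse
import urllib.request


class FakeLoginHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical stand-in for a login page: sets a cookie, then redirects."""

    def do_POST(self):
        # Consume the form body, then reply with a session cookie and a redirect.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(302)
        self.send_header("Set-Cookie", "sid=abc123; Path=/")
        self.send_header("Location", "/index")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # The landing page only looks "logged in" if the cookie is sent back.
        cookie = self.headers.get("Cookie", "")
        body = b"logged in" if "sid=abc123" in cookie else b"logged out"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def login(opener, base_url):
    """POST dummy credentials and return the body of the final page."""
    payload = urllib.parse.urlencode({"username": "u", "password": "p"}).encode("utf-8")
    with opener.open(base_url + "/login", payload) as response:
        return response.read().decode("utf-8")


# Start the fake server on a free local port.
server = http.server.HTTPServer(("127.0.0.1", 0), FakeLoginHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base_url = "http://127.0.0.1:%d" % server.server_port

# ProxyHandler({}) bypasses any system proxy so the local server is reached directly.
no_proxy = urllib.request.ProxyHandler({})
plain = urllib.request.build_opener(no_proxy)
jar = http.cookiejar.CookieJar()
with_cookies = urllib.request.build_opener(
    no_proxy, urllib.request.HTTPCookieProcessor(jar))

without_jar = login(plain, base_url)      # redirect followed, cookie dropped
with_jar = login(with_cookies, base_url)  # redirect followed, cookie replayed
print(without_jar)
print(with_jar)

server.shutdown()
server.server_close()
```

Both openers follow the redirect, but only the one built with HTTPCookieProcessor replays the cookie, so the first call should print "logged out" and the second "logged in". If the real site still refuses the login with a cookie jar in place, something else (for example a required session id or token in the form) is probably involved.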