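So basically, my idea is to use Python to log in to a website and copy the contents of an HTML page that can only be viewed after logging in (over HTTPS).
Any suggestions on how to achieve this? requests? http.client.HTTPSConnection?
This is what I have so far: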
import http.client
from base64 import b64encode

# Question: what exactly should this URL be?
#   https://accounts.google.com/ServiceLoginhl=en&continue=https://www.google.ca/
#   or https://google.ca
h1 = http.client.HTTPSConnection(URL)
# Encode the credentials for HTTP Basic authentication
userAndPass = b64encode(b"usrname:pwd").decode("ascii")
headers = {'Authorization': 'Basic %s' % userAndPass}
# Then connect and request the page
h1.request('GET', '$THEPAGETHATIWANTTOACCESS', headers=headers)
Thanks a lot!
Answer 0 (score: 2)
You can use requests:
>>> import requests
>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
'{"type":"User"...'
>>> r.json()
{'private_gists': 419, 'total_private_repos': 77, ...}
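If you would rather stay with the standard library, as in the question, here is a minimal sketch of the same Basic-auth request using http.client. Note that HTTPSConnection expects only the host name, and the path goes into request(). The helper names (basic_auth_header, split_url, fetch_with_basic_auth) are made up for illustration, and this assumes the site actually accepts HTTP Basic authentication. A site with a login form will generally need a POST of the form fields plus cookie handling instead.

```python
import http.client
from base64 import b64encode
from urllib.parse import urlsplit


def basic_auth_header(username, password):
    """Build the Authorization header for HTTP Basic authentication."""
    token = b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def split_url(url):
    """Split a full URL into the host (for HTTPSConnection) and the path (for request)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.netloc, path


def fetch_with_basic_auth(url, username, password):
    """GET a page over HTTPS with Basic auth; returns (status code, body text)."""
    host, path = split_url(url)
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("GET", path, headers=basic_auth_header(username, password))
        response = conn.getresponse()
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body
    finally:
        conn.close()
```

Usage would look like status, html = fetch_with_basic_auth("https://example.com/private", "user", "pass"), where the URL and credentials are placeholders. A 401 status would suggest the site does not accept these credentials through Basic auth.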