When logging in to a website, we can normally use urllib2.Request:
import urllib2, base64

req = urllib2.Request("http://www.facebook.com/")
# HTTP Basic Auth: base64-encode "user:password" and send it in the Authorization header
base64string = base64.encodestring("%s:%s" % ("username", "password")).replace("\n", "")
req.add_header("Authorization", "Basic %s" % base64string)
requested = urllib2.urlopen(req)
But how do we know whether we are actually logged in? All we have done is open a URL with some credentials attached.
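For reference, the header construction above can be checked in isolation. In Python 3, where urllib2 became urllib.request and base64.encodestring was removed, the same value can be built with base64.b64encode; a minimal sketch (the helper name is made up for illustration):

```python
import base64


def basic_auth_header(username, password):
    # RFC 7617: the header value is "Basic " + base64("user:password")
    token = base64.b64encode(("%s:%s" % (username, password)).encode("utf-8"))
    return "Basic " + token.decode("ascii")


print(basic_auth_header("username", "password"))  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```

The result is exactly what the urllib2 snippet puts in the Authorization header, without the trailing newline that encodestring used to add.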
Answer 0 (score: 1)
Maybe you should look at requested.read() to see the page you just fetched. :) Also check requested.info() for the headers the server sent back.
You should do this inside a try: ... except: block to catch errors. See docs.python.org/2/howto/urllib2.html.
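The advice above can be sketched concretely. In Python 3 the relevant exceptions live in urllib.error; this hypothetical helper distinguishes a server that answered with an error status (e.g. 401 when the credentials are wrong) from a server that could not be reached at all:

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_status(url):
    # Return the HTTP status code, or a short message if the request failed.
    try:
        with urlopen(url, timeout=5) as handle:
            return handle.status
    except HTTPError as e:
        # The server answered, but with an error status such as 401 Unauthorized
        return e.code
    except URLError as e:
        # The server could not be reached at all
        return "unreachable: %s" % e.reason


# A port nothing is listening on takes the URLError branch:
print(fetch_status("http://127.0.0.1:1/"))
```

A 200 here still does not prove you are logged in; many sites return 200 with a "wrong password" page, which is why inspecting requested.read() matters.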
FWIW, the modern approach is to use the Requests module.
EDIT
Here is an excerpt from some code I wrote a few years ago.
import sys
import urllib
import urllib2

def post(url, params):
    txdata = urllib.urlencode(params)
    try:
        # create a request object
        req = urllib2.Request(url, txdata)
        # and open it to return a handle on the url
        handle = urllib2.urlopen(req)
    except IOError, e:
        print >>sys.stderr, 'We failed to open "%s".' % url
        if hasattr(e, 'code'):
            print >>sys.stderr, 'We failed with error code - %s.' % e.code
        elif hasattr(e, 'reason'):
            print >>sys.stderr, "The error object has the following 'reason' attribute :"
            print >>sys.stderr, e.reason
            print >>sys.stderr, "This usually means the server doesn't exist,"
            print >>sys.stderr, "is down, or we don't have an internet connection."
        #raise SystemExit, 1
        raise
    else:
        print >>sys.stderr, 'Here are the headers of the page :\n%s\n' % handle.info()
        true_url = handle.geturl()
        print >>sys.stderr, "\nTrue URL = '%s'\n" % true_url
        return true_url
I hope that gives you some ideas.
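One detail of post() worth noting: it is the urlencode step that turns a dict of form fields into the POST body. In Python 3 that function moved to urllib.parse, and the body handed to urlopen must additionally be bytes; a minimal sketch with made-up field values:

```python
from urllib.parse import urlencode

# Hypothetical form fields; dicts keep insertion order in Python 3.7+
params = {"username": "alice", "password": "secret"}

# urllib.urlencode (Python 2) became urllib.parse.urlencode,
# and the POST body passed to urlopen() must be bytes, not str
txdata = urlencode(params).encode("ascii")
print(txdata)  # b'username=alice&password=secret'
```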
编辑2
To handle cookies, just do this before creating the request object:
# build opener with HTTPCookieProcessor
cookie_handler = urllib2.HTTPCookieProcessor()
opener = urllib2.build_opener(cookie_handler)
urllib2.install_opener(opener)
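In Python 3 the same setup uses urllib.request and http.cookiejar; a minimal sketch, passing an explicit CookieJar so the stored cookies can be inspected later:

```python
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, build_opener, install_opener

# Keep a handle on the jar so we can examine whatever cookies the server sets
jar = CookieJar()
opener = build_opener(HTTPCookieProcessor(jar))
install_opener(opener)  # urllib.request.urlopen() now sends/stores cookies

print(len(jar))  # 0 until a server actually sets a cookie
```

After a successful login request, the session cookie lands in the jar and is sent automatically on subsequent urlopen() calls, which is what keeps you "logged in" across requests.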