When logging in to a website, we can normally use urllib2.Request:
import urllib2, base64

req = urllib2.Request("http://www.facebook.com/")
# HTTP Basic Auth: base64-encode "user:password" and send it in the Authorization header
base64string = base64.encodestring("%s:%s" % ("username", "password")).replace("\n", "")
req.add_header("Authorization", "Basic %s" % base64string)
requested = urllib2.urlopen(req)
But how do we know whether we are actually logged in? All we have done is open a URL with some credentials attached.
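For reference, the header construction above can be checked in isolation. In Python 3, where urllib2 became urllib.request and base64.encodestring was removed, the same value can be built with base64.b64encode; a minimal sketch (the helper name is made up for illustration):

```python
import base64


def basic_auth_header(username, password):
    # RFC 7617: the header value is "Basic " + base64("user:password")
    token = base64.b64encode(("%s:%s" % (username, password)).encode("utf-8"))
    return "Basic " + token.decode("ascii")


print(basic_auth_header("username", "password"))  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```

The result is exactly what the urllib2 snippet puts in the Authorization header, without the trailing newline that encodestring used to add.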
Answer 0 (score: 1)
Maybe you should look at requested.read() to see the page you just fetched. :) Also check requested.info() for the headers the server sent back.
You should do this inside a try: ... except: block to catch errors. See docs.python.org/2/howto/urllib2.html.
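The advice above can be sketched concretely. In Python 3 the relevant exceptions live in urllib.error; this hypothetical helper distinguishes a server that answered with an error status (e.g. 401 when the credentials are wrong) from a server that could not be reached at all:

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_status(url):
    # Return the HTTP status code, or a short message if the request failed.
    try:
        with urlopen(url, timeout=5) as handle:
            return handle.status
    except HTTPError as e:
        # The server answered, but with an error status such as 401 Unauthorized
        return e.code
    except URLError as e:
        # The server could not be reached at all
        return "unreachable: %s" % e.reason


# A port nothing is listening on takes the URLError branch:
print(fetch_status("http://127.0.0.1:1/"))
```

A 200 here still does not prove you are logged in; many sites return 200 with a "wrong password" page, which is why inspecting requested.read() matters.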
FWIW, the modern approach is to use the Requests module.
EDIT
Here is an excerpt from some code I wrote a few years ago.
import sys
import urllib
import urllib2

def post(url, params):
    txdata = urllib.urlencode(params)
    try:
        # create a request object
        req = urllib2.Request(url, txdata)
        # and open it to return a handle on the url
        handle = urllib2.urlopen(req)
    except IOError, e:
        print >>sys.stderr, 'We failed to open "%s".' % url
        if hasattr(e, 'code'):
            print >>sys.stderr, 'We failed with error code - %s.' % e.code
        elif hasattr(e, 'reason'):
            print >>sys.stderr, "The error object has the following 'reason' attribute :"
            print >>sys.stderr, e.reason
            print >>sys.stderr, "This usually means the server doesn't exist,"
            print >>sys.stderr, "is down, or we don't have an internet connection."
        #raise SystemExit, 1
        raise
    else:
        print >>sys.stderr, 'Here are the headers of the page :\n%s\n' % handle.info()
        true_url = handle.geturl()
        print >>sys.stderr, "\nTrue URL = '%s'\n" % true_url
        return true_url
I hope that gives you some ideas.
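One detail of post() worth noting: it is the urlencode step that turns a dict of form fields into the POST body. In Python 3 that function moved to urllib.parse, and the body handed to urlopen must additionally be bytes; a minimal sketch with made-up field values:

```python
from urllib.parse import urlencode

# Hypothetical form fields; dicts keep insertion order in Python 3.7+
params = {"username": "alice", "password": "secret"}

# urllib.urlencode (Python 2) became urllib.parse.urlencode,
# and the POST body passed to urlopen() must be bytes, not str
txdata = urlencode(params).encode("ascii")
print(txdata)  # b'username=alice&password=secret'
```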
编辑2
To handle cookies, just do this before creating the request object:
# build opener with HTTPCookieProcessor
cookie_handler = urllib2.HTTPCookieProcessor()
opener = urllib2.build_opener(cookie_handler)
urllib2.install_opener(opener)
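In Python 3 the same setup uses urllib.request and http.cookiejar; a minimal sketch, passing an explicit CookieJar so the stored cookies can be inspected later:

```python
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, build_opener, install_opener

# Keep a handle on the jar so we can examine whatever cookies the server sets
jar = CookieJar()
opener = build_opener(HTTPCookieProcessor(jar))
install_opener(opener)  # urllib.request.urlopen() now sends/stores cookies

print(len(jar))  # 0 until a server actually sets a cookie
```

After a successful login request, the session cookie lands in the jar and is sent automatically on subsequent urlopen() calls, which is what keeps you "logged in" across requests.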