Handling authentication when retrieving images from an authenticated website with urllib2

Posted: 2014-07-18 20:14:23

Tags: image authentication save urllib2 urlopen

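I am trying to create a small API that logs in to an internal web-based monitoring tool with the credentials I supply and retrieves the images on a page I specify. The first request authenticates correctly, but the last part of the script passes no authentication: it builds each image URL into the variable y, then tries to open y and save it, and that is where the authentication problem occurs. Scroll down for an example.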

import urllib
import urllib2
import lxml, lxml.html



ID = raw_input("Enter ID:")

PorS = raw_input("Enter 1 for Primary 2 for Secondary:")

base_link = 'http://statseeker/cgi/nim-report?rid=42237&command=Graph&mode=ping&list=jc-'

dash = '-'

end_link =  '&tfc_fav=range+%3D+start_of_today+-+1d+to+now%3B&year=&month=&day=&hour=&minute=&duration=&wday_from=&wday_to=&time_from=&time_to=&tz=America%2FChicago&tfc=range+%3D+start_of_today+-+1d+to+now%3B&rtype=Delay&graph_type=Filled&db_type=Average&x_step=1&interval=60&y_height=100&y_gridlines=5&y_max=&y_max_power=1&x_gridlines=on&legend=on'


urlauth = base_link + ID + dash + PorS + end_link

print urlauth

realm = 'statseeker'
username = 'admin'
password = '*****'
auth_handler = urllib2.HTTPBasicAuthHandler()
auth_handler.add_password(realm, urlauth, username, password)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
data = opener.open(urlauth).read()

html = lxml.html.fromstring(data)
imgs = html.cssselect('img.graph')
for x in imgs:
    y = 'http://statseeker%s' % (x.attrib['src'])
    g = urllib2.urlopen(y).read()  # raises HTTP Error 401, see the traceback below
    open('test.jpg', 'wb').write(g)
    print y
    with open('statseeker.html', 'a') as f:
        f.write(y)

Result:

C:\Users\user\Documents\Scripting>python test.py
    Enter ID:4050
    Enter 1 for Primary 2 for Secondary:1
    http://statseeker/cgi/nim-report?rid=42237&command=Graph&mode=ping&list=jc-4050-
    1&tfc_fav=range+%3D+start_of_today+-+1d+to+now%3B&year=&month=&day=&hour=&minute
    =&duration=&wday_from=&wday_to=&time_from=&time_to=&tz=America%2FChicago&tfc=ran
    ge+%3D+start_of_today+-+1d+to+now%3B&rtype=Delay&graph_type=Filled&db_type=Avera
    ge&x_step=1&interval=60&y_height=100&y_gridlines=5&y_max=&y_max_power=1&x_gridli
    nes=on&legend=on
    Traceback (most recent call last):
      File "JacksonShowAndSave.py", line 35, in <module>
        g = urllib2.urlopen(y).read()
      File "C:\Python27\lib\urllib2.py", line 127, in urlopen
        return _opener.open(url, data, timeout)
      File "C:\Python27\lib\urllib2.py", line 410, in open
        response = meth(req, response)
      File "C:\Python27\lib\urllib2.py", line 523, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python27\lib\urllib2.py", line 448, in error
        return self._call_chain(*args)
      File "C:\Python27\lib\urllib2.py", line 382, in _call_chain
        result = func(*args)
      File "C:\Python27\lib\urllib2.py", line 531, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    urllib2.HTTPError: HTTP Error 401: Authorization Required

    C:\Users\user\Documents\Scripting>

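How can I fix this so that the authentication is passed to any page I open on this site? I assumed any urlopen request would use the same authentication as the one set up above.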

The part that fails:

y = 'http://statseeker%s' % (x.attrib['src'])
g = urllib2.urlopen(y).read()
open('test.jpg', 'wb').write(g)
print y
with open('statseeker.html', 'a') as f:
    f.write(y)
1 answer:

Answer 0 (score: 0)

I built another auth handler for the image request:

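# y is the image URL built in the question's loop;
# realm, username and password are the values defined earlier in the script.
auth_handler2 = urllib2.HTTPBasicAuthHandler()
auth_handler2.add_password(realm, y, username, password)
opener2 = urllib2.build_opener(auth_handler2)
urllib2.install_opener(opener2)
link = urllib2.Request(y)
response = urllib2.urlopen(link)
# Write the image bytes in binary mode; the file is closed when the block exits.
with open('out2.jpg', 'wb') as output:
    output.write(response.read())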
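
A possible alternative to building a new handler for every image, sketched under assumptions rather than taken from the answer: register the credentials once for the site root with `HTTPPasswordMgrWithDefaultRealm`, so the same opener can authenticate both the report page and any image beneath `http://statseeker/`. The helper names `build_site_opener` and `save_url` and the password `secret` are hypothetical, and the sketch assumes the server uses HTTP Basic authentication, as the handler in the answer suggests. The import shim only lets the code run on Python 3 as well.

```python
try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 home of the same classes


def build_site_opener(site_root, username, password):
    """Build one opener that offers the credentials for every URL under site_root."""
    # Passing None as the realm makes the entry apply to whatever realm
    # the server announces in its 401 response.
    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, site_root, username, password)
    auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)
    return urllib2.build_opener(auth_handler)


def save_url(opener, url, filename):
    """Download url through the authenticated opener and write the bytes to filename."""
    data = opener.open(url).read()
    with open(filename, 'wb') as output:
        output.write(data)


# 'secret' is a placeholder; no request is sent until save_url is called.
opener = build_site_opener('http://statseeker/', 'admin', 'secret')
```

Inside the question's loop, a call such as `save_url(opener, y, 'out2.jpg')` would then stand in for the second handler, and `opener.open(urlauth)` could fetch the report page with the same object.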