Question

I am trying to download an image from a URL that changes, but I get an error.

import urllib.request

# title_2 is defined earlier in the script.
url_image = "http://www.joblo.com/timthumb.php?src=/posters/images/full/" + str(title_2) + "-poster1.jpg&h=333&w=225"

user_agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)'
headers = {'User-Agent': user_agent}
req = urllib.request.Request(url_image, None, headers)


print(url_image)
#image, h = urllib.request.urlretrieve(url_image)
with urllib.request.urlopen(req) as response:
    the_page = response.read()

#print (the_page)


with open('poster.jpg', 'wb') as f:
    f.write(the_page)

Traceback (most recent call last):
  File "C:\Users\luke\Desktop\scraper\imager finder.py", line 97, in <module>
    with urllib.request.urlopen(req) as response:
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 162, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 465, in open
    response = self._open(req, data)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 483, in _open
    '_open', req)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 443, in _call_chain
    result = func(*args)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 1268, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 1243, in do_open
    r = h.getresponse()
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 1174, in getresponse
    response.begin()
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 282, in begin
    version, status, reason = self._read_status()
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 264, in _read_status
    raise BadStatusLine(line)
http.client.BadStatusLine:

Answer 1

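My suggestion is to use urllib2. I have also written what I think is a nice function that requests gzip encoding (which reduces bandwidth) when the server supports it. I use it to download social media files, but it should work for anything.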

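I would try to debug your code, but since it is only a snippet (and the error is badly formatted), it is hard to know exactly where your error occurs (it is certainly not line 97 of the snippet you posted).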

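This is not as short as it could be, but it is clear and reusable. It is written for Python 2.7, and it looks like you are using Python 3. In that case you will need to search for other questions on how to use urllib2 in Python 3, where its functionality lives in urllib.request and urllib.error.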

import urllib2
import gzip
from StringIO import StringIO

def download(url):
    """
    Download and return the file specified in the URL; attempt to use
    gzip encoding if possible.
    """
    request = urllib2.Request(url)
    request.add_header('Accept-Encoding', 'gzip')
    try:
        response = urllib2.urlopen(request)
    except Exception as e:
        # Re-raise any failure as an IOError that names the URL.
        raise IOError("Download failed (%s) %s" % (url, e))
    payload = response.read()
    if response.info().get('Content-Encoding') == 'gzip':
        buf = StringIO(payload)
        f = gzip.GzipFile(fileobj=buf)
        payload = f.read()
    return payload

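def save_media(filename, media):
    # Binary mode so the image bytes are written unchanged;
    # the with block closes the file even if the write fails.
    with open(filename, "wb") as file_handle:
        file_handle.write(media)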

title_2 = "10-cloverfield-lane"
media = download("http://www.joblo.com/timthumb.php?src=/posters/images/full/{}-poster1.jpg&h=333&w=225".format(title_2))
save_media("poster.jpg", media)
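
Since the question appears to be running Python 3, here is a rough sketch of how the same idea might look there, using urllib.request in place of urllib2. The helper names decode_body and poster_url are made up for this illustration and are not part of the original answer. Quoting the title before putting it in the URL is an extra precaution I am assuming could help if title_2 ever contains spaces or other unsafe characters; whether that is what triggers the BadStatusLine here is only a guess.

```python
import gzip
import urllib.request
from urllib.parse import quote


def poster_url(title):
    """Build the poster URL for a title (hypothetical helper).

    quote() percent-encodes characters such as spaces that are not
    valid in a URL path.
    """
    return ("http://www.joblo.com/timthumb.php?src=/posters/images/full/"
            "{}-poster1.jpg&h=333&w=225".format(quote(str(title))))


def decode_body(payload, content_encoding):
    """Return the body bytes, gunzipping them if the server used gzip."""
    if content_encoding == "gzip":
        return gzip.decompress(payload)
    return payload


def download(url, user_agent="Mozilla/5.0 (Windows NT 6.1; Win64; x64)"):
    """Download url and return the body as bytes, accepting gzip if offered."""
    request = urllib.request.Request(url, headers={
        "User-Agent": user_agent,      # same User-Agent as in the question
        "Accept-Encoding": "gzip",     # ask for a compressed response
    })
    try:
        with urllib.request.urlopen(request) as response:
            payload = response.read()
            encoding = response.headers.get("Content-Encoding")
    except Exception as e:
        # Re-raise any failure as an IOError that names the URL.
        raise IOError("Download failed (%s) %s" % (url, e)) from e
    return decode_body(payload, encoding)


# Usage (needs network access, so it is left commented out):
# media = download(poster_url("10-cloverfield-lane"))
# with open("poster.jpg", "wb") as f:
#     f.write(media)
```

Keeping the gzip handling in its own small function makes it possible to check the decoding step without touching the network.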

Python: download an image using an alternating variable
