I've written a script that downloads a single image from the xkcd comics site. The script runs without errors, but it doesn't download any image. What's wrong? Any help would be appreciated. Here's the code:
#! python3
import requests, os, bs4

url = 'http://xkcd.com'  # starting url
os.makedirs('xkcd', exist_ok=True)  # store comics in ./xkcd

# Download the page
print('Downloading the page %s...' % url)
res = requests.get(url)
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text)

# Find the URL of the comic image
comicElem = soup.select('#comic img')
if comicElem == []:
    print('Could not find comic image.')
else:
    try:
        comicURL = 'http:' + comicElem[0].get('src')
        # Download the image
        print('Downloading image %s...' % (comicURL))
        res = requests.get(comicURL)
        res.raise_for_status()
    except requests.exceptions.MissingSchema:
        # skip this comic
        prevLink = soup.select('a[rel="prev"]')[0]
        url = 'http://xkcd.com' + prevLink.get('href')
        # continue

        # Save the image to ./xkcd
        imageFile = open(os.path.join('xkcd', os.path.basename(comicURL)), 'wb')
        for chunk in res.iter_content(100000):
            imageFile.write(chunk)
        imageFile.close()

print('Done.')
Answer (score: 0):
Your problem is that you are saving the image inside the exception block, so the save only happens when a MissingSchema exception is raised; un-indent it so it runs on the normal path. A simple way to save a downloaded file object is to use shutil:
import requests, os, bs4, shutil

url = 'http://xkcd.com'  # starting url
if not os.path.exists('xkcd'):
    os.makedirs('xkcd')  # store comics in ./xkcd

# Download the page
print('Downloading the page %s...' % url)
res = requests.get(url)
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text)

# Find the URL of the comic image
comicElem = soup.select('#comic img')
if comicElem == []:
    print('Could not find comic image.')
else:
    comicURL = "http:" + comicElem[0].get('src')
    response = requests.get(comicURL, stream=True)
    with open('xkcd/img.png', 'wb') as out_file:
        shutil.copyfileobj(response.raw, out_file)
    del response

print('Done.')
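For completeness, here is a minimal sketch of the same fix applied directly to the original script: the image-saving block is moved out of the exception handler so it runs whenever the download succeeds, and the try/except is dropped since a single-page script has no previous comic to skip to. It keeps the asker's res.iter_content() approach and assumes the xkcd front page still serves the comic under '#comic img' with a protocol-relative src.

#! python3
import requests, os, bs4

url = 'http://xkcd.com'  # starting url
os.makedirs('xkcd', exist_ok=True)  # store comics in ./xkcd

# Download the page
print('Downloading the page %s...' % url)
res = requests.get(url)
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text, 'html.parser')

# Find the URL of the comic image
comicElem = soup.select('#comic img')
if comicElem == []:
    print('Could not find comic image.')
else:
    comicURL = 'http:' + comicElem[0].get('src')
    print('Downloading image %s...' % (comicURL))
    res = requests.get(comicURL)
    res.raise_for_status()
    # Save the image to ./xkcd; this block now runs on the success path,
    # not inside an except clause that never fires.
    with open(os.path.join('xkcd', os.path.basename(comicURL)), 'wb') as imageFile:
        for chunk in res.iter_content(100000):
            imageFile.write(chunk)

print('Done.')

One practical difference between the two approaches: os.path.basename(comicURL) keeps the comic's actual filename, whereas the shutil version above always writes to xkcd/img.png.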