FileNotFoundError with 'wb' file mode in Python?

Time: 2018-10-05 02:34:20

Tags: python web-scraping beautifulsoup python-requests

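I'm trying to write a program that downloads all the xkcd comic images and saves them in a directory, with each image named title.png, where title is the comic's title. Here is the code: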


#Downloads all the xkcd comics

import requests, bs4, os

site = requests.get('https://www.xkcd.com')

def downloadImage(site):
    soup = bs4.BeautifulSoup(site.text)
    img_tag = soup.select('div[id="comic"] img')
    img_title = img_tag[0].get('alt')
    img_file = open(img_title+'.png', 'wb')
    print("Downloading %s..." %img_title)
    img_res = requests.get("https:" +  img_tag[0].get('src'))
    for chunk in img_res.iter_content(100000):
        img_file.write(chunk)
    print("Saved %s in " %img_title, os.getcwd())


def downloadPrevious(site):
    soup = bs4.BeautifulSoup(site.text)
    prev_tag_list = soup.select("ul[class='comicNav'] li > a")
    prev_tag = None
    for each in prev_tag_list:
        if(each.get('rel')==['prev']):
            prev_tag = each
            break
    if(prev_tag.get('href') == '#'):
        return True
    prev_site = requests.get('https://xkcd.com' + prev_tag.get('href'))
    downloadImage(prev_site)
    return False, prev_site

def download_XKCD_Comics(site):
    try:
        os.makedirs('E:\\XKCD Comics')
    except:
        os.chdir('E:\XKCD Comics')

    done = False
    downloadImage(site)
    while(not done):
        done, site = downloadPrevious(site)
    return

download_XKCD_Comics(site)

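I can't figure out what the problem is. None of the other files exist beforehand either, yet the error is raised only for this file name. Could someone please help me solve this?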

1 answer:

Answer 0: (score: 1)

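/ is not a valid character in a Windows file name. It is a path separator, so when a title contains one, open() treats everything before the slash as a directory name. That directory does not exist, and 'wb' mode creates files but never directories, hence the FileNotFoundError.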
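As a minimal sketch of that behaviour, the snippet below tries to create two files inside a temporary directory. The helper name try_open_for_write and both titles are made up for illustration; they are not taken from the question.

```python
import os
import tempfile

def try_open_for_write(directory, file_name):
    """Try to create file_name inside directory in 'wb' mode.

    Returns None on success, or the exception class name on failure.
    (Hypothetical helper, for illustration only.)
    """
    path = os.path.join(directory, file_name)
    try:
        with open(path, 'wb') as f:
            f.write(b'')
    except OSError as exc:
        return type(exc).__name__
    return None

with tempfile.TemporaryDirectory() as tmp:
    # No slash in the name: the file is simply created.
    print(try_open_for_write(tmp, 'Plain Title.png'))
    # The slash makes open() look for a sub-directory named 'A', which does not exist.
    print(try_open_for_write(tmp, 'A/B Test.png'))
```

The second call fails even though the file is opened for writing, because only the final path component is created on demand.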

There are many ways to turn a title into a valid file name. One example is the function Django uses:

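import re

def get_valid_filename(s):
    # Trim surrounding whitespace and turn inner spaces into underscores.
    s = str(s).strip().replace(' ', '_')
    # Drop everything except word characters, '-', and '.'.
    return re.sub(r'(?u)[^-\w.]', '', s)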

It replaces spaces with underscores, then removes every character that is not a letter, a digit, _, - or . (a period).
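Here is a rough sketch of how the function could be applied before opening the file. The helper safe_image_path and the 'untitled' fallback are assumptions added for illustration; they are not part of Django or of the question's code.

```python
import os
import re

def get_valid_filename(s):
    # Same logic as the function shown above.
    s = str(s).strip().replace(' ', '_')
    return re.sub(r'(?u)[^-\w.]', '', s)

def safe_image_path(title, directory='.'):
    """Build a .png path from a comic title (hypothetical helper).

    Falls back to 'untitled' if sanitizing leaves nothing, which is an
    assumption made here, not something the original answer specifies.
    """
    name = get_valid_filename(title) or 'untitled'
    return os.path.join(directory, name + '.png')

print(get_valid_filename('A/B Test'))      # AB_Test
print(get_valid_filename("Who's there?"))  # Whos_there
```

In downloadImage, the open() call could then take safe_image_path(img_title) instead of img_title+'.png'. Note that two different titles can collapse to the same sanitized name, in which case the later download would overwrite the earlier one.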