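I'm trying to write a program that downloads all the xkcd comic images and saves them in a directory, with each image named title.png, where title is the comic's title. Here is the code: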
    #Downloads all the xkcd comics
    import requests, bs4, os
    site = requests.get('https://www.xkcd.com')

    def downloadImage(site):
        soup = bs4.BeautifulSoup(site.text)
        img_tag = soup.select('div[id="comic"] img')
        img_title = img_tag[0].get('alt')
        img_file = open(img_title+'.png', 'wb')
        print("Downloading %s..." %img_title)
        img_res = requests.get("https:" + img_tag[0].get('src'))
        for chunk in img_res.iter_content(100000):
            img_file.write(chunk)
        print("Saved %s in " %img_title, os.getcwd())

    def downloadPrevious(site):
        soup = bs4.BeautifulSoup(site.text)
        prev_tag_list = soup.select("ul[class='comicNav'] li > a")
        prev_tag = None
        for each in prev_tag_list:
            if(each.get('rel')==['prev']):
                prev_tag = each
                break
        if(prev_tag.get('href') == '#'):
            return True
        prev_site = requests.get('https://xkcd.com' + prev_tag.get('href'))
        downloadImage(prev_site)
        return False, prev_site

    def download_XKCD_Comics(site):
        try:
            os.makedirs('E:\\XKCD Comics')
        except:
            os.chdir('E:\XKCD Comics')
        done = False
        downloadImage(site)
        while(not done):
            done, site = downloadPrevious(site)
        return

    download_XKCD_Comics(site)
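I don't understand what the problem is. None of the other files exist either, yet only this file name raises an error. Could someone please help me solve this?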
Answer 0 (score: 1)
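/ is an invalid character in Windows file names, so a comic title that contains it cannot be used as a file name as-is.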
There are many ways to get a valid file name. One example is the approach Django uses:
    import re

    def get_valid_filename(s):
        s = str(s).strip().replace(' ', '_')
        return re.sub(r'(?u)[^-\w.]', '', s)
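It replaces spaces with underscores, then removes every character that is not a letter, a digit, an underscore, a hyphen, or a period.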
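As a rough sketch of how this could be wired into the question's code, the title can be sanitized before it is passed to open(). The helper name safe_image_name, the 'untitled' fallback, and the sample title below are hypothetical, made up for illustration; they come from neither the question nor Django.

```python
import re


def get_valid_filename(s):
    # Same logic as the snippet above: spaces become underscores,
    # then anything that is not a word character, '-' or '.' is dropped.
    s = str(s).strip().replace(' ', '_')
    return re.sub(r'(?u)[^-\w.]', '', s)


def safe_image_name(title, extension='.png'):
    # Hypothetical helper: build a file name that Windows will accept.
    # Falls back to a placeholder if sanitizing leaves nothing behind.
    name = get_valid_filename(title)
    return (name or 'untitled') + extension


# A made-up title containing characters Windows rejects in file names.
print(safe_image_name('Up/Down: A Test?'))  # prints UpDown_A_Test.png
```

In downloadImage, open(img_title+'.png', 'wb') would then become open(safe_image_name(img_title), 'wb'). Note that two different titles can collapse to the same sanitized name, so appending something unique (for example the comic number, if you have it) may be worth considering.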