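I'm new to Python and I'm trying to download images from Google automatically. I want to enter a keyword and have my program download the matching images from Google and save them to a folder on my computer. Here is my code: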
import json
import os
import time
import requests
from PIL import Image
from StringIO import StringIO
from requests.exceptions import ConnectionError
def go(query, path):
    BASE_URL = 'https://ajax.googleapis.com/ajax/services/search/images?'\
               'v=1.0&q=' + query + '&start=%d'
    BASE_PATH = os.path.join(path, query)
    if not os.path.exists(BASE_PATH):
        os.makedirs(BASE_PATH)
    start = 0 # Google's start query string parameter for pagination.
    while start < 60: # Google will only return a max of 56 results.
        r = requests.get(BASE_URL % start)
        for image_info in json.loads(r.text)['responseData']['results']:
            url = image_info['unescapedUrl']
            try:
                image_r = requests.get(url)
            except ConnectionError, e:
                print 'could not download %s' % url
                continue
            # Remove file-system path characters from name.
            title = image_info['titleNoFormatting'].replace('/', '').replace('\\', '')
            file = open(os.path.join(BASE_PATH, '%s.jpg') % title, 'w')
            try:
                Image.open(StringIO(image_r.content)).save(file, 'JPEG')
            except IOError, e:
                # Throw away some gifs
                print 'could not save %s' % url
                continue
            finally:
                file.close()
        print start
        start += 4 # 4 images per page.
        time.sleep(1.5)
go('angry human face', 'myDirectory')
But I keep getting an error that says:
file = open(os.path.join(BASE_PATH, '%s.jpg') % title, 'w')
IOError: [Errno 22] invalid mode ('w') or filename: u'myDirectory\\landscape\\Nature - Landscapes - Views - Desktop Wallpapers | MIRIADNA..jpg'
What can I do to fix this? Please help! I would really appreciate it.
Answer 0 (score: 1)
filename: u'... - Desktop Wallpapers | MIRIADNA..jpg'
^ This is a problem
Windows does not allow the pipe character (|) in file names.
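From http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx:
The following characters are reserved:
- < (less than)
- > (greater than)
- : (colon)
- " (double quote)
- / (forward slash)
- \ (backslash)
- | (vertical bar or pipe)
- ? (question mark)
- * (asterisk)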
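In your case, a reserved character appears in the title of the image you are downloading, and that title is then used as your file name. You can strip these characters quite easily, for example: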
title = ''.join(ch for ch in title if ch not in '<>:"/\\|?*')  # drop Windows-reserved characters
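If the same cleanup is needed in more than one place, it can be wrapped in a small helper. The following is only a minimal sketch: the names `RESERVED_CHARS` and `sanitize_title`, the optional `replacement` argument, and the `'untitled'` fallback are all hypothetical and do not come from the question's code.

```python
# Characters Windows reserves in file names, taken from the list above.
RESERVED_CHARS = '<>:"/\\|?*'


def sanitize_title(title, replacement=''):
    """Hypothetical helper: replace every reserved character in title."""
    cleaned = ''.join(replacement if ch in RESERVED_CHARS else ch
                      for ch in title)
    # Assumed fallback so the result is never an empty file name.
    return cleaned.strip() or 'untitled'


print(sanitize_title('Desktop Wallpapers | MIRIADNA.'))
```

In the question's code, a call such as `title = sanitize_title(image_info['titleNoFormatting'])` could stand in for the two chained `.replace()` calls.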