Downloading images with the Google Custom Search API

Time: 2014-04-04 15:12:53

Tags: python json google-app-engine google-custom-search

I have been using the Google Image API from Python, with the following code, to download the first 20 image results:

import os
import sys
import time
from urllib import FancyURLopener
import urllib2
import simplejson



searchTerm = "Cat"

# Replace spaces in the search term with '%20' so the request URL is valid
searchTerm = searchTerm.replace(' ','%20')



# Start FancyURLopener with defined version 
class MyOpener(FancyURLopener): 
    version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'
myopener = MyOpener()

# Set count to 0
count=0

for i in range(0, 4):
    # The start offset changes on each iteration to request a new page of 4 images
    url = ('https://ajax.googleapis.com/ajax/services/search/images?'
           + 'v=1.0&q=' + searchTerm + '&start=' + str(i * 4)
           + '&userip=MyIP&imgsz=xlarge|xxlarge|huge')
    print url
    request = urllib2.Request(url, None, {'Referer': 'testing'})
    response = urllib2.urlopen(request)

    # Parse the JSON response
    results = simplejson.load(response)
    data = results['responseData']
    dataInfo = data['results']

    # Iterate over the results and download each unescaped URL
    for myUrl in dataInfo:
        count = count + 1
        print myUrl['unescapedUrl']
        # NOTE: newpath and num are not defined in this snippet
        os.chdir(newpath)
        myopener.retrieve(myUrl['unescapedUrl'], str(num) + '-' + str(count))

        # Sleep for three seconds to prevent IP blocking from Google
        time.sleep(3)

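But now I want to do this with Google Custom Search instead, to get better results. I know I need to register for an API key, but I have not found any example as simple as the code I posted above. Can anyone help? I am really lost in the Google documentation.

Apparently the free API is limited to 100 requests per day. Is that correct?

Edit: this is where I am now, but it still does not work: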

import os
import sys
import time
from urllib import FancyURLopener
import urllib2
import simplejson
import cStringIO
import pprint


searchTerm="Cat"

# Start FancyURLopener with defined version 
class MyOpener(FancyURLopener): 
    version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'
myopener = MyOpener()


url='https://www.googleapis.com/customsearch/v1?key=API_KEY&cx=017576662512468239146:omuauf_lfve'+'&q='+searchTerm+'&searchType=image'+'&start=0'+'&imgSize=xlarge|xxlarge|huge'
print url
request = urllib2.Request(url, None, {'Referer': 'testing'})
response = urllib2.urlopen(request)

# Parse the JSON response (the module imported above is simplejson, not json)
data = simplejson.load(response)
pprint.PrettyPrinter(indent=4).pprint(data['items'][0])

2 Answers:

Answer 0: (score: 18)

You can use the Google APIs Client Library for Python for this.

Demo:

Here is an example (I have modified it):

from apiclient.discovery import build

service = build("customsearch", "v1",
               developerKey="** your developer key **")

res = service.cse().list(
    q='butterfly',
    cx=' ** your cx **',
    searchType='image',
    num=3,
    imgType='clipart',
    fileType='png',
    safe= 'off'
).execute()

# print() with a single argument behaves the same on Python 2 and Python 3
if 'items' not in res:
    print('No result !!\nres is: {}'.format(res))
else:
    for item in res['items']:
        print('{}:\n\t{}'.format(item['title'], item['link']))

Output:

Clipart - Butterfly:
        http://openclipart.org/image/800px/svg_to_png/3965/jonata_Butterfly.png
Animal, Butterfly, Insect, Nature - Free image - 158831:
        http://pixabay.com/static/uploads/photo/2013/07/13/11/51/animal-158831_640.png
Clipart - Monarch Butterfly:
        http://openclipart.org/image/800px/svg_to_png/110023/Monarch_Butterfly_by_Merlin2525.png
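The demo only prints the links, while the question is about saving the images. A minimal sketch of that last step follows, assuming Python 3 and a `res['items']` list shaped like the one above (each item carrying a `link` key). The helper names `filename_for`, `fetch_bytes` and `download_items` are my own and not part of any Google library.

```python
import os
import urllib.request
from urllib.parse import urlparse


def filename_for(index, url, default_ext=".jpg"):
    """Build a local file name such as '003.png' from a result index and its URL."""
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    return "{:03d}{}".format(index, ext or default_ext)


def fetch_bytes(url, timeout=10):
    """Download one URL, sending a browser-like User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def download_items(items, dest_dir, fetch=fetch_bytes):
    """Save every item['link'] into dest_dir and return the paths written."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for index, item in enumerate(items, start=1):
        path = os.path.join(dest_dir, filename_for(index, item["link"]))
        try:
            payload = fetch(item["link"])
        except OSError as error:  # URLError and socket timeouts are OSError subclasses
            print("skipping {}: {}".format(item["link"], error))
            continue
        with open(path, "wb") as handle:
            handle.write(payload)
        saved.append(path)
    return saved
```

With the demo's result this would be called as `download_items(res['items'], 'butterflies')`. The `fetch` argument is there only so the network call can be swapped out, for instance when testing.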

Yes, the free version has a limit, and you can monitor your usage from the Google Developers Console.
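The quota matters because, as far as I understand the API, one request returns at most 10 results and the `start` index counts from 1. Fetching 20 images therefore costs two requests out of the daily allowance. Here is a small sketch of how the `start`/`num` pairs could be computed; `page_requests` is a hypothetical helper of mine, not something the client library provides.

```python
def page_requests(total, page_size=10, first_index=1):
    """Return (start, num) pairs that together cover `total` results.

    Assumes the API caps `num` at 10 and uses a 1-based `start`.
    """
    if not 1 <= page_size <= 10:
        raise ValueError("page_size must be between 1 and 10")
    pages = []
    start = first_index
    remaining = total
    while remaining > 0:
        num = min(page_size, remaining)
        pages.append((start, num))
        start += num
        remaining -= num
    return pages


# Each pair would be passed to one call such as:
#   service.cse().list(q='butterfly', cx='...', searchType='image',
#                      start=start, num=num).execute()
for start, num in page_requests(20):
    print("request results {} to {}".format(start, start + num - 1))
```

Each pair is one billable query, so `len(page_requests(n))` gives the number of requests a download of `n` images would consume.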

Note:

Go to Custom Search Engine, select your custom search engine, and in the Basics tab set the Image search option to ON. In the Sites to search section, select the "Search the entire web but emphasize included sites" option.
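Once the engine is set up this way, the plain REST request attempted in the question's edit should also be possible without the client library. The sketch below assumes Python 3 and reuses the endpoint and parameter names from the question's URL. `build_image_search_url` and `fetch_json` are hypothetical helper names. Two details reflect my understanding of the API rather than anything confirmed in this thread: `start` begins at 1, not 0, and `imgSize` takes a single value rather than a `|`-separated list.

```python
import json
import urllib.request
from urllib.parse import urlencode

ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_image_search_url(api_key, cx, query, start=1, num=10, img_size=None):
    """Assemble a Custom Search image query URL with properly escaped parameters."""
    params = {
        "key": api_key,
        "cx": cx,
        "q": query,
        "searchType": "image",
        "start": start,  # assumed to be 1-based
        "num": num,      # assumed to be capped at 10 per request
    }
    if img_size is not None:
        params["imgSize"] = img_size  # assumed to be one value, e.g. "xlarge"
    return ENDPOINT + "?" + urlencode(params)


def fetch_json(url, timeout=10):
    """Perform the GET request and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder credentials: replace both before actually calling fetch_json(url).
    url = build_image_search_url("API_KEY", "YOUR_CX", "Cat", img_size="xlarge")
    print(url)
```

Using `urlencode` instead of string concatenation also removes the need to replace spaces with `%20` by hand, as the first script in the question does.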


Answer 1: (score: 0)

I have used search APIs for downloading images to build image datasets, so maybe you should take a look at these:

  1. https://rapidapi.com/contextualwebsearch/api/web-search?endpoint=5b864ca4e4b085e3f407ecca

  2. https://github.com/hardikvasa/webb/blob/master/docs/Documentation.md

Judging by the documentation, I prefer the second one, which is thoroughly documented.