TypeError when retrieving the Google index for keywords

Asked: 2013-10-06 14:02:42

Tags: python indexing

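We found this code through Google. It is supposed to give us the Google index for keywords. The problem is that it works for a while and then gives us this error: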

./g1.py size hassize
Traceback (most recent call last):
  File "./g1.py", line 22, in <module>
    n2 = int(gsearch(args[0]+" "+args[1])['cursor']['estimatedResultCount'])
TypeError: 'NoneType' object is unsubscriptable

Code:

#!/usr/bin/env python

import math,sys
import json
import urllib

def gsearch(searchfor):
  query = urllib.urlencode({'q': searchfor})
  url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % query
  search_response = urllib.urlopen(url)
  search_results = search_response.read()
  results = json.loads(search_results)
  data = results['responseData']
  return data

args = sys.argv[1:]
m = 45000000000
if len(args) != 2:
  print "need two words as arguments"
  sys.exit(1)  # a bare `exit` is never called, so the script would keep running
n0 = int(gsearch(args[0])['cursor']['estimatedResultCount'])
n1 = int(gsearch(args[1])['cursor']['estimatedResultCount'])
n2 = int(gsearch(args[0]+" "+args[1])['cursor']['estimatedResultCount'])

1 answer:

Answer 0 (score: 1)

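I ran the code a few times and found that some queries return an object with data['responseData'] = None. That causes the error you reported: gsearch returns None, and subscripting None with ['cursor'] raises the TypeError. To find out why this happens only some of the time, I printed the whole data object whenever data['responseData'] = None and found the following: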

{u'responseData': None,
 u'responseDetails': u'Suspected Terms of Service Abuse.
                       Please see http://code.google.com/apis/errors',
 u'responseStatus': 403}

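Some of your requests appear to be returning an HTTP 403 Forbidden status code, so those queries are not fulfilled (hence no data). I would read the page on Google's Terms of Service to see whether you can work out why some of your requests might be violating the terms.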
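Until the cause is sorted out, the script can at least fail with the API's own message instead of a TypeError. The following is a minimal sketch, not a definitive fix: estimated_count and SearchError are hypothetical names that are not part of the original script, and the sample response is the 403 reply shown above, rebuilt by hand so that nothing here touches the network.

```python
import json


class SearchError(Exception):
    """Hypothetical exception for responses that carry no usable data."""


def estimated_count(results):
    """Return estimatedResultCount as an int from a decoded API response.

    `results` is the full dict produced by json.loads, not just the
    'responseData' part, so the status and details remain available.
    """
    data = results.get('responseData')
    if data is None:
        # Report what the API said instead of failing on None['cursor'].
        raise SearchError('%s: %s' % (results.get('responseStatus'),
                                      results.get('responseDetails')))
    return int(data['cursor']['estimatedResultCount'])


# The rejected response from the answer above, as JSON text.
rejected = json.loads(
    '{"responseData": null,'
    ' "responseDetails": "Suspected Terms of Service Abuse.'
    ' Please see http://code.google.com/apis/errors",'
    ' "responseStatus": 403}'
)

try:
    estimated_count(rejected)
except SearchError as err:
    print('search failed: %s' % err)
```

To use it with the script in the question, gsearch would have to return results rather than results['responseData'], and each of the three lookups would go through estimated_count. Whether to retry, wait, or give up after a SearchError depends on what the Terms of Service allow.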