Question

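I have been trying to use xgoogle to search the internet for PDFs. The problem is that when I search for "Medicine:pdf", the first page xgoogle returns is not the first page Google returns when I actually use Google in a browser. I don't know what is wrong. Here is my code: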

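    from xgoogle.search import GoogleSearch, SearchError

    searchfor = "Medicine:pdf"  # the query from the question
    results = []  # defined up front so the loop below works even if the search fails
    try:
        gs = GoogleSearch(searchfor)
        gs.results_per_page = 100
        page = 0
        while page < 2:
            gs.page = page
            results += gs.get_results()
            page += 1
    except SearchError as e:
        print("Search failed: %s" % e)
    for res in results:
        print(res.desc)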

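If I run the same query on the Google website itself, the first result Google shows me is:
Title: Medicine - British Council
Description: Medical training in the UK has a long history and a tradition of excellence... medical leaders from around the world have received their medical education
URL: http://www.britishcouncil.org/learning-infosheets-medicine.pdf

But if I use my Python xgoogle search, I get:

Python output
Desc: UCM175757.pdf
Title: Medicines in My Home: Presentation for Students - Food and Drug ...
URL: http://www.fda.gov/downloads/Drugs/ResourcesForYou/Consumers/BuyingUsingMedicineSafely/UnderstandingOver-the-CounterMedicines/UCM175757.pdf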

Answer 1

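I have noticed that xgoogle and Google in a browser return different results. I don't know why, but you could try the Google Custom Search API instead. It may give you results closer to what the browser shows, and there is no risk of being banned by Google (if you use xgoogle many times in a short period, it returns an error instead of search results).

First, you have to register with Google and enable Custom Search to get a key and a cx: https://www.google.com/cse/all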

The API request format is:

    https://www.googleapis.com/customsearch/v1?key=yourkey&cx=yourcx&alt=json&q=yourquery

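customsearch is the Google service you want to use; in your case I think that is Custom Search

v1 is the version of the API

yourkey and yourcx are provided by Google; you can find them on the dashboard

yourquery is the term you want to search for, in your case "Medicine:pdf"

json is the return format

An example that fetches the first 3 pages of Google Custom Search results: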
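The pieces above can also be assembled with the standard library instead of by hand. This is only a minimal Python 3 sketch: `build_search_url` is a hypothetical helper name, the key and cx values are placeholders, and `num` and `start` are the paging parameters used in the example further down. `urlencode` takes care of escaping the colon and any spaces in the query.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/customsearch/v1"

def build_search_url(key, cx, query, start=1, num=10):
    """Return a Custom Search request URL (hypothetical helper)."""
    params = {
        "key": key,      # API key from the dashboard
        "cx": cx,        # search engine id from the dashboard
        "alt": "json",   # return format
        "q": query,      # search term, escaped by urlencode
        "num": num,      # results per page
        "start": start,  # 1-based index of the first result
    }
    return BASE_URL + "?" + urlencode(params)

# Placeholder credentials; substitute your own key and cx.
url = build_search_url("yourkey", "yourcx", "Medicine:pdf")
print(url)
```

Building the query string from a dict keeps each parameter on its own line, which makes it harder to drop an `&` or forget to escape the query.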

    # Python 2 (urllib2 does not exist in Python 3)
    import urllib
    import urllib2
    import simplejson

    def googleAPICall():
        userInput = urllib.quote("global warming")
        KEY = "##################"  # get yours
        CX = "###################"  # get yours
        pages = []
        for i in range(0, 3):
            index = i * 10 + 1  # start is 1-based: 1, 11, 21
            # same host as in the request format given above
            url = ('https://www.googleapis.com/customsearch/v1?'
                   'key=%s'
                   '&cx=%s'
                   '&alt=json'
                   '&q=%s'
                   '&num=10'
                   '&start=%d') % (KEY, CX, userInput, index)
            request = urllib2.Request(url)
            response = urllib2.urlopen(request)
            results = simplejson.load(response)
            pages.append(results)  # keep every page instead of only the last one
        return pages
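To compare the API output with what the browser shows, it helps to pull the title, snippet and link out of each decoded response. Below is a minimal Python 3 sketch, assuming each response carries its hits in an `items` list with `title`, `snippet` and `link` fields. `page_starts` and `extract_results` are hypothetical helper names, and the sample dictionary is hand-written for illustration (it reuses the British Council result from the question), not real API output.

```python
def page_starts(pages, per_page=10):
    """Return the 1-based start index for each page: 1, 11, 21, ..."""
    return [page * per_page + 1 for page in range(pages)]

def extract_results(response):
    """Return (title, snippet, link) tuples from one decoded JSON response.

    Assumes hits live under the "items" key; defaults to an empty list
    when the key is missing.
    """
    return [
        (item.get("title", ""), item.get("snippet", ""), item.get("link", ""))
        for item in response.get("items", [])
    ]

# Hand-written stand-in for a decoded response (not real API output).
sample = {
    "items": [
        {
            "title": "Medicine - British Council",
            "snippet": "...",
            "link": "http://www.britishcouncil.org/learning-infosheets-medicine.pdf",
        }
    ]
}

print(page_starts(3))
for title, snippet, link in extract_results(sample):
    print(title, link)
```

Keeping the extraction separate from the HTTP call means it can be checked against a saved response without spending any API quota.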
