How to retrieve Google URLs from a search query

Time: 2014-12-12 01:33:45

Tags: python search term

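I am trying to write a Python script that takes a search term or query, searches Google for it, and returns 5 URLs from the results.

I spent a lot of time trying to get PyGoogle to work, but then found out that Google no longer supports the SOAP API for search and no longer issues new license keys. In short, PyGoogle is pretty much dead at this point.

So my question is: what is the most concise/simplest way to do this?

I would like to do this entirely in Python.

Thanks for your help.

3 answers: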

Answer 0: (score: 1)

Use BeautifulSoup and requests to grab a link from the Google search results:

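import requests
from bs4 import BeautifulSoup
from urllib.parse import quote_plus

keyword = "Facebook"  # enter your keyword here
# URL-encode the keyword so spaces and special characters survive
search = "https://www.google.co.uk/search?sclient=psy-ab&client=ubuntu&hs=k5b&channel=fs&biw=1366&bih=648&noj=1&q=" + quote_plus(keyword)
r = requests.get(search)
soup = BeautifulSoup(r.text, "html.parser")
container = soup.find("div", {"id": "search"})
# The page layout can change, so guard against a missing container or <cite>
cite = container.find("cite") if container else None
print(cite.text if cite else "No result found")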
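
The snippet above prints only the first result, while the question asks for five. Here is a minimal sketch of collecting several, assuming the result URLs still sit inside <cite> tags as in the code above (Google's markup changes over time, so treat that as an assumption). It uses the standard library's html.parser instead of BeautifulSoup so it runs without extra installs; `CiteCollector`, `first_cites` and the sample HTML are made up for illustration:

```python
from html.parser import HTMLParser


class CiteCollector(HTMLParser):
    """Collects the text of every <cite> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_cite = False
        self.cites = []

    def handle_starttag(self, tag, attrs):
        if tag == "cite":
            self.in_cite = True
            self.cites.append("")  # start a new entry

    def handle_endtag(self, tag):
        if tag == "cite":
            self.in_cite = False

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) is appended to the current entry
        if self.in_cite:
            self.cites[-1] += data


def first_cites(html, limit=5):
    """Return the text of at most `limit` <cite> elements found in `html`."""
    parser = CiteCollector()
    parser.feed(html)
    parser.close()
    return [c.strip() for c in parser.cites if c.strip()][:limit]


# Made-up sample markup standing in for a downloaded results page
sample = "<div id='search'>" + "".join(
    "<cite>https://example.com/%d</cite>" % i for i in range(7)
) + "</div>"
print(first_cites(sample))
```

With a real page you would pass `r.text` from the request instead of `sample`.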

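Answer 1: (score: 0)

What problems did you have with pygoogle? I know it is no longer supported, but I have used that project on many occasions, and it works fine for the trivial task you describe.

Your question did make me curious, though, so I went to Google and typed in "python google search". Bam, found this repository. Installed it with pip, and within 5 minutes of browsing their documentation I had what you asked for: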

import google
for url in google.search("red sox", num=5, stop=1):
    print(url)

Maybe try a little harder next time, okay?
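
The exact meaning of `num` and `stop` may differ between versions of that library (not verified here), so one way to be sure you get at most five URLs is to cap the iterator yourself. A small sketch, where `fake_search` is a made-up stand-in for any search call that yields URLs and `top_urls` is a hypothetical helper:

```python
from itertools import islice


def fake_search(query):
    """Hypothetical stand-in for a search function that yields result URLs."""
    for i in range(20):
        yield "https://example.com/%s/%d" % (query.replace(" ", "-"), i)


def top_urls(results, limit=5):
    """Take at most `limit` items from any iterable of URLs."""
    return list(islice(results, limit))


for url in top_urls(fake_search("red sox")):
    print(url)
```

Because `islice` stops pulling from the generator once the limit is reached, a lazy search generator would not be asked for more results than needed.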

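Answer 2: (score: 0)

The xgoogle library can do the same thing.

I tried something similar for the top 10 links, which also counts the occurrences of the search word in each linked page. I have added the code snippet for your reference: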

# Note: Python 2 code (print statement, raw_input, urllib.urlopen)
import urllib
# Import the GoogleSearch and SearchError classes from xgoogle/search.py
from xgoogle.search import GoogleSearch, SearchError

my_dict = {}
print "Enter the word to be searched : "
# Read user input
yourword = raw_input()
try:
    # Run a Google search for the keyword
    gs = GoogleSearch(yourword)
    gs.results_per_page = 80
    # Fetch the search results
    results = gs.get_results()
    # Loop through the results to get each link and its content
    for res in results:
        # URL of this result
        parsedurl = res.url.encode("utf8")
        # Download the content of that web page
        source = urllib.urlopen(parsedurl).read()
        # Count occurrences of the entered keyword in the page and
        # store the count in the dictionary, keyed by URL
        my_dict[parsedurl] = source.count(yourword)
except SearchError, e:
    print "Search failed: %s" % e
print my_dict

# Print the URLs ordered by keyword count, highest first
for key in sorted(my_dict, key=my_dict.get, reverse=True):
    print key, my_dict[key]
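
The counting and ranking part of this answer does not depend on xgoogle, so it can be tried on its own. A minimal Python 3 sketch, assuming the page contents have already been downloaded into a dict; `rank_by_keyword` and the sample pages are invented for illustration:

```python
def rank_by_keyword(pages, keyword):
    """Return (url, count) pairs sorted by keyword count, highest first.

    `pages` maps each URL to the text of that page (assumed already fetched).
    """
    counts = {url: text.count(keyword) for url, text in pages.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Invented sample data standing in for downloaded pages
pages = {
    "https://example.com/a": "red sox win; red sox fans celebrate",
    "https://example.com/b": "no match here",
    "https://example.com/c": "red sox",
}
for url, count in rank_by_keyword(pages, "red sox"):
    print(url, count)
```

Note that `str.count` is case-sensitive and counts non-overlapping matches, so lowercasing both the text and the keyword first may be worthwhile depending on what you want to measure.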