Question

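I am trying to write a Python script that displays the links of the top 5 Google results for a given search query.

I am using Beautiful Soup. After inspecting Google's HTML, I found that the search result links sit inside <div class="r"> tags, in the href attribute of the <a> elements.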

import bs4, requests

mySearch=input()
address='http://www.google.com/search?q='+mySearch
googleRes=requests.get(address)

googleSoup=bs4.BeautifulSoup(googleRes.text)
linkBlocks=googleSoup.select('div.r a')

However, the list linkBlocks is empty instead of being filled with the search result links. How can I get the search result links into linkBlocks?

Answer 1

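Send a browser-like User-Agent header with the request. Without one, Google tends to return different markup to the script than it shows in a browser, which is likely why the 'div.r a' selector matches nothing.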

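import bs4, requests

# Browser-like User-Agent so Google serves the same markup a browser would get
headers = {'User-Agent':
       'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}
mySearch = "beautifulsoup"
address = 'http://www.google.com/search?q=' + mySearch
googleRes = requests.get(address, headers=headers)
# Name the parser explicitly to avoid the "no parser specified" warning
googleSoup = bs4.BeautifulSoup(googleRes.text, 'html.parser')
# Select every <a> nested inside <div class="r">
linkBlocks = googleSoup.select('div.r a')
print(linkBlocks)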
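
To get from the list of <a> tags to the five links the question asks for, something like the following sketch may help. It is not part of the original answer: the helper names build_search_url and first_links, and the sample HTML, are made up for illustration. It also assumes the 'div.r' class from the question still matches; Google changes its markup regularly, so the selector may need adjusting.

```python
from urllib.parse import urlencode

import bs4


def build_search_url(query):
    """Build a search URL, escaping spaces and special characters in the query."""
    return 'https://www.google.com/search?' + urlencode({'q': query})


def first_links(html, selector='div.r a[href]', limit=5):
    """Return the href values of the first `limit` anchors matching `selector`.

    The default selector is the one from the question (an assumption about
    Google's markup), narrowed to anchors that actually carry an href.
    """
    soup = bs4.BeautifulSoup(html, 'html.parser')
    return [a['href'] for a in soup.select(selector)[:limit]]


# Hypothetical stand-in for googleRes.text, so the sketch runs offline
sample_html = '''
<div class="r"><a href="https://example.com/1">One</a></div>
<div class="r"><a href="https://example.com/2">Two</a></div>
<div class="x"><a href="https://example.com/other">Other</a></div>
'''

print(build_search_url('beautiful soup'))
for link in first_links(sample_html):
    print(link)
```

In the real script, first_links(googleRes.text) would take the place of the sample HTML, with the request still sent using the User-Agent header shown above.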
