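How to return results in English in Google search

Time: 2019-04-12 13:25:24

Tags: python beautifulsoup python-requests

I'm trying to search Google for some products, but the language of the results Google returns depends on the proxy. I tried to fix this by sending 'accept-language': 'en-US,en;q=0.9' in the headers, but it still doesn't work.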

import requests
from bs4 import BeautifulSoup
products = ["Majestic Pet Stairs Steps", "Ball Jars Wide Mouth Lids 12/Pack", "LED Duck Color Changing Floating Speaker"]
for product in products:
    headers = {
        'authority': 'www.google.com',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36',
        'accept-language': 'en-US,en;q=0.9',
    }
    url = 'https://google.com/search?q={}'.format(product)
    PROXY = None
    res = requests.get(url, headers=headers, proxies=PROXY)
    if res.status_code != 200:
        print("bad proxy")
        break
    soup = BeautifulSoup(res.text, "lxml")
    print(soup.title.text)

What I want is to always get the results in English, regardless of the proxy.

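3 Answers:

Answer 0: (score: 1)

They provide an API for search: https://developers.google.com/custom-search/v1/overview

If you run a lot of automated queries through web scraping, they will most likely start serving CAPTCHAs or block you.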
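
As a rough illustration of the API route, here is a minimal sketch that builds a Custom Search JSON API request with the language set explicitly, so the result language should not depend on the proxy. The helper names build_cse_url and search_titles are made up for this example, and the API key and search engine ID are placeholders you would have to create yourself. It uses only the standard library; the same parameters could equally be passed to requests.get via params.

```python
import json
import urllib.parse
import urllib.request

# Endpoint of the Custom Search JSON API.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_cse_url(query, api_key, engine_id):
    """Build a request URL that asks for English results (hypothetical helper)."""
    params = {
        "key": api_key,   # placeholder: your own API key
        "cx": engine_id,  # placeholder: your own search engine ID
        "q": query,
        "hl": "en",       # interface language
        "lr": "lang_en",  # restrict results to documents written in English
        "gl": "us",       # country hint for the results
    }
    return CSE_ENDPOINT + "?" + urllib.parse.urlencode(params)


def search_titles(query, api_key, engine_id):
    """Run the query and return the titles of the returned items."""
    url = build_cse_url(query, api_key, engine_id)
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    # "items" is absent when there are no results, hence the default.
    return [item["title"] for item in data.get("items", [])]
```

Calling search_titles("Majestic Pet Stairs Steps", "YOUR_API_KEY", "YOUR_ENGINE_ID") would perform a real request, so it needs valid credentials and counts against the API quota.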

Answer 1: (score: 1)

I have a handy library that I use for searching. Here is a snippet from my application:

Install it with pip install google.

from googlesearch import search

# tag, intitle and SITE.page_size are defined elsewhere in the answerer's application
results = list(search(str(tag) + ' ' + str(intitle), domains=['stackoverflow.com'], stop=SITE.page_size))

Answer 2: (score: 0)

Have you tried putting the uule=location, hl=en and lr=lang_en parameters in the request URL?

response = requests.get('https://google.com/search?q=FUS RO DAH&hl=en')
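
A query typed straight into the URL works for simple text, but a product name containing characters such as / or & can change the meaning of the query string. The sketch below percent-encodes the query with the standard library before appending the language parameters. The function name google_search_url and its default values are assumptions made for this example, not part of any library.

```python
from urllib.parse import urlencode


def google_search_url(query, hl="en", gl="us", lr="lang_en"):
    """Return a Google search URL with the query safely encoded (hypothetical helper)."""
    params = {
        "q": query,
        "hl": hl,  # the language to use for the Google search
        "gl": gl,  # the country to use for the Google search
        "lr": lr,  # language(s) to limit the search to
    }
    # urlencode turns spaces into '+' and escapes characters such as '/' and '&'.
    return "https://www.google.com/search?" + urlencode(params)


print(google_search_url("Ball Jars Wide Mouth Lids 12/Pack"))
# https://www.google.com/search?q=Ball+Jars+Wide+Mouth+Lids+12%2FPack&hl=en&gl=us&lr=lang_en
```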

Or use a params dict:

params = {
    'q': 'FUS RO DAH',
    'hl': 'en',       # the language to use for the Google search
    'gl': 'us',       # the country to use for the Google search
    'lr': 'lang_en',  # one or multiple languages to limit the search to
    'uule': 'w+CAIQICIGQnJhemls',  # encoded location to use for the search (here: Brazil)
}
Full example:

import requests
from bs4 import BeautifulSoup

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36',
}

products = ["Majestic Pet Stairs Steps", "Ball Jars Wide Mouth Lids 12/Pack", "LED Duck Color Changing Floating Speaker"]

for product in products:
    params = {
        'q': product,
        'hl': 'en',       # the language to use for the Google search
        'gl': 'us',       # the country to use for the Google search
        'lr': 'lang_en',  # limit the search to English
    }
    # requests URL-encodes the params dict, so spaces and '/' in product names are safe
    html = requests.get('https://www.google.com/search', headers=headers, params=params)
    soup = BeautifulSoup(html.text, 'html.parser')
    print(soup)
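
The uule value used in this answer looks opaque, but it appears to follow a simple pattern. As far as I know Google does not document it, so treat the following as an assumption based on the commonly described, reverse-engineered scheme: a fixed prefix, one character that encodes the length of the location name, and the Base64 of the name. The function name encode_uule is made up for this sketch, and I have not verified how names whose Base64 form needs padding should be handled.

```python
import base64

# Alphabet from which the length character is taken (assumed, not documented).
_LENGTH_CHARS = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_"
)


def encode_uule(canonical_name):
    """Encode a location name into a uule value (hypothetical helper)."""
    raw = canonical_name.encode("utf-8")
    if len(raw) >= len(_LENGTH_CHARS):
        raise ValueError("location name is too long for this scheme")
    # Fixed prefix + length character + Base64 of the name.
    encoded = base64.b64encode(raw).decode("ascii")
    return "w+CAIQICI" + _LENGTH_CHARS[len(raw)] + encoded


print(encode_uule("Brazil"))  # w+CAIQICIGQnJhemls
```

The output for "Brazil" matches the value used in the params dict earlier in this answer, which is the only case checked here.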

Alternatively, you can use the Google Search Engine Results API from SerpApi to do the same thing. It's a paid API with a free trial of 5,000 searches. Check out the playground.

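from serpapi import GoogleSearch

params = {
  "api_key": "YOUR_API_KEY",
  "engine": "google",
  "q": "spotlight 29 casino address",
  "google_domain": "google.com.br",
  "gl": "br",
  "hl": "pt",  # language of the search; use "en" for English results
  "uule": "w+CAIQICIGQnJhemls",  # can't be used together with location
}

search = GoogleSearch(params)
results = search.get_dict()  # parsed JSON response as a dict

Disclaimer: I work for SerpApi.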