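I need to scrape Google Shopping, for example this link: https://www.google.com/?gfe_rd=cr&ei=BtcRWeX_D8aAsAHDgZ2QAw#q=hooker+furniture+5183-75300&tbm=shop
But the server's response contains no items. I can't see the item details even in Google Chrome's source viewer. Which request will return the detailed data for all items?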
Answer 0 (score: 1)
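You can do this with the beautifulsoup and requests libraries. selenium isn't needed, because everything you need is in the HTML source. View it with Ctrl+U before deciding which tool to scrape it with. Also, make sure you send a user-agent header; lists of user-agent strings are easy to find. Full example code: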
from bs4 import BeautifulSoup
import requests
import json

# Send a browser-like User-Agent so Google treats the script as a "real" user browser.
headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

response = requests.get(
    'https://www.google.com/search?q=minecraft+toys&tbm=shop',
    headers=headers).text

# The 'lxml' parser needs the lxml package installed; it does not need to be imported.
soup = BeautifulSoup(response, 'lxml')

data = []

# These class names are generated by Google and may change over time.
for container in soup.find_all('div', class_='sh-dgr__content'):
    title = container.find('h4', class_='A2sOrd')
    price = container.find('span', class_='a8Pemb')
    supplier = container.find('div', class_='aULzUe IuHnof')

    # Skip a card if an expected element is missing, instead of raising AttributeError.
    if title is None or price is None or supplier is None:
        continue

    data.append({
        "Title": title.text,
        "Price": price.text,
        "Supplier": supplier.text,
    })

print(json.dumps(data, indent=2, ensure_ascii=False))
Partial output:
[
  {
    "Title": "Lego Minecraft The Creeper Mine Building Set",
    "Price": "$63.99",
    "Supplier": "Walmart - Elevate Service Online"
  },
  {
    "Title": "LEGO Minecraft The Mountain Cave (21137)",
    "Price": "$139.95",
    "Supplier": "Game Yore"
  },
  {
    "Title": "Lego Minecraft The Nether Portal Set",
    "Price": "$92.36",
    "Supplier": "eBay - davesworkshop"
  },
  {
    "Title": "Lego Minecraft Toy, The Pig House",
    "Price": "$49.95",
    "Supplier": "Walmart - Sheen Empire"
  }
]
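One likely reason the link in the question returned a page with no items: its search terms sit after the `#`, in the URL fragment, and a fragment is never sent to the server. The sketch below uses only the standard library to split the question's URL and rebuild the query in the `/search?q=...&tbm=shop` form used in the request above. The helper name `shopping_url` is hypothetical, introduced only for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

QUESTION_URL = (
    "https://www.google.com/?gfe_rd=cr&ei=BtcRWeX_D8aAsAHDgZ2QAw"
    "#q=hooker+furniture+5183-75300&tbm=shop"
)


def shopping_url(query):
    """Build a search URL with q and tbm in the query string (hypothetical helper)."""
    return "https://www.google.com/search?" + urlencode({"q": query, "tbm": "shop"})


parts = urlsplit(QUESTION_URL)

# The query string is what the server receives; here it carries no search terms.
sent_to_server = parse_qs(parts.query)

# The fragment stays in the browser; this is where q and tbm ended up.
kept_in_browser = parse_qs(parts.fragment)

# Move the search terms into the query string so the server can see them.
rebuilt = shopping_url(kept_in_browser["q"][0])

print(sent_to_server)
print(kept_in_browser)
print(rebuilt)
```

Passing the rebuilt URL to `requests.get` together with the `headers` dictionary would then follow the same pattern as the example above.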
Alternatively, you can use SerpApi:
from serpapi import GoogleSearch
import os

params = {
    "engine": "google",
    "q": "minecraft toys",
    "tbm": "shop",
    "api_key": os.getenv("API_KEY"),
}

search = GoogleSearch(params)
results = search.get_dict()

for result in results['shopping_results']:
    print(f"Title: {result['title']}\nPrice: {result['price']}\nSupplier: {result['source']}\n")
Partial output:
Title: Lego Minecraft The Creeper Mine Building Set4.8104
Price: $79.99
Supplier: Target
Title: LEGO Minecraft The Mountain Cave (21137)4.732
Price: $139.95
Supplier: Game Yore
Title: Lego Minecraft The Nether Portal Set4.787More options
Price: $92.36
Supplier: eBay - davesworkshop
Title: Lego Minecraft Toy, The Pig House4.850
Price: $43.99 $49.99
Supplier: Best Buy
Title: Lego 21160 Minecraft The Illager Raid4.9203
Price: $47.99
Supplier: Target
Title: Minecraft Kids Craft-A-Block Figures Assortment
Price: $12.00
Supplier: Selfridges
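The loop above indexes `shopping_results`, `title`, `price` and `source` directly, so a response that lacks any of those keys would raise `KeyError`. A minimal sketch of a more defensive version follows. The helper `format_results` and the sample dictionary are illustrative assumptions: the sample is hand-written, loosely based on two items from the output above, and no request is made.

```python
def format_results(results):
    """Format shopping results, tolerating missing keys (hypothetical helper)."""
    lines = []
    # .get() with a default avoids KeyError when a key is absent.
    for item in results.get("shopping_results", []):
        title = item.get("title", "n/a")
        price = item.get("price", "n/a")
        source = item.get("source", "n/a")
        lines.append(f"Title: {title}\nPrice: {price}\nSupplier: {source}")
    return lines


# Hand-written sample shaped like the dictionary returned by search.get_dict().
# The second item deliberately has no "price" key.
sample = {
    "shopping_results": [
        {
            "title": "LEGO Minecraft The Mountain Cave (21137)",
            "price": "$139.95",
            "source": "Game Yore",
        },
        {
            "title": "Minecraft Kids Craft-A-Block Figures Assortment",
            "source": "Selfridges",
        },
    ]
}

formatted = format_results(sample)
print("\n\n".join(formatted))
```

With real data, `format_results(search.get_dict())` would take the place of the sample call.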
Disclaimer: I work for SerpApi.