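I am trying to scrape the number of reviews of a place from Google Maps using Python. For example, the restaurant Pike's Landing (see the Google Maps URL below) has 162 reviews. I want to extract this number in Python.
URL: https://www.google.com/maps?cid=15423079754231040967
I am not well versed in HTML, but I wrote the following code from some basic examples on the internet. When I run it, all I get is an empty variable. I would appreciate it if you could tell me what I am doing wrong here.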
from urllib.request import urlopen
from bs4 import BeautifulSoup

quote_page = 'https://www.google.com/maps?cid=15423079754231040967'
page = urlopen(quote_page)
soup = BeautifulSoup(page, 'html.parser')
price_box = soup.find_all('button', attrs={'class': 'widget-pane-link'})
print(price_box.text)
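Answer 0 (score: 0)
Without an API, this is hard to do in pure Python. This is what I ended up with (note that I appended &hl=en to the end of the URL to get results in English rather than in my language):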
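import re
import requests
from ast import literal_eval

urls = [
    'https://www.google.com/maps?cid=15423079754231040967&hl=en',
    'https://www.google.com/maps?cid=16168151796978303235&hl=en']

for url in urls:
    # Find each escaped array in the page source that starts with a link
    # and contains an "N reviews" label.
    for g in re.findall(r'\[\\"http.*?\d+ reviews?.*?]', requests.get(url).text):
        # Map null to None and unescape the quotes so the array
        # can be parsed as a Python literal.
        data = literal_eval(g.replace('null', 'None').replace('\\"', '"'))
        # data[0] is the link (with unicode escapes decoded), data[1] the label.
        print(bytes(data[0], 'utf-8').decode('unicode_escape'))
        print(data[1])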
Prints:
http://www.google.com/search?q=Pike's+Landing,+4438+Airport+Way,+Fairbanks,+AK+99709,+USA&ludocid=15423079754231040967#lrd=0x51325b1733fa71bf:0xd609c9524d75cbc7,1
469 reviews
http://www.google.com/search?q=Sequoia+TreeScape,+Newmarket,+ON+L3Y+8R5,+Canada&ludocid=16168151796978303235#lrd=0x882ad2157062b6c3:0xe060d065957c4103,1
42 reviews
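The question asks for the number itself, while the output above is a label such as "469 reviews". A minimal sketch of turning such a label into an integer follows. The helper name parse_review_count is hypothetical, and the handling of thousands separators such as "1,234 reviews" is an assumption about how larger counts might be formatted.

```python
import re


def parse_review_count(label):
    """Return the integer in a label like '469 reviews', or None if absent."""
    # Assumption: the count may contain commas as thousands separators.
    match = re.search(r"(\d[\d,]*)\s+reviews?", label)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


for label in ["469 reviews", "42 reviews", "1 review"]:
    print(parse_review_count(label))
```

In the loop from the answer, this would be applied to data[1].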
Answer 1 (score: 0)
You need to look at the page source and parse the window.APP_INITIALIZATION_STATE variable block. You will find all the data you need there.
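As a rough sketch of that idea, the snippet below pulls the text assigned to window.APP_INITIALIZATION_STATE out of a page source and searches it for review labels. The page layout is undocumented and may change, so the end-of-assignment rule used here (the next ";window." statement or the closing script tag) is an assumption. The helper names and the sample string are made up for illustration and do not come from a real response.

```python
import re

MARKER = "window.APP_INITIALIZATION_STATE="


def extract_init_state(html):
    """Return the raw text assigned to the state variable, or None."""
    start = html.find(MARKER)
    if start == -1:
        return None
    start += len(MARKER)
    # Assumption: the assignment ends at the next ";window." statement,
    # or failing that at the closing </script> tag.
    candidates = [html.find(";window.", start), html.find("</script>", start)]
    candidates = [i for i in candidates if i != -1]
    end = min(candidates) if candidates else len(html)
    return html[start:end]


def find_review_labels(state_text):
    """Return every 'N reviews' style label found in the state text."""
    return re.findall(r"\d[\d,]* reviews?", state_text)


# Synthetic page source, for illustration only.
sample = ('<script>window.APP_INITIALIZATION_STATE='
          '[[["x","469 reviews"]]];window.APP_FLAGS=[];</script>')
state = extract_init_state(sample)
print(find_review_labels(state))
```

With a real page, html would be the text of the HTTP response for the Maps URL.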
Alternatively, you can use the Google Maps Reviews API from SerpApi.
Example JSON output:
"place_results": {
  "title": "Pike's Landing",
  "data_id": "0x51325b1733fa71bf:0xd609c9524d75cbc7",
  "reviews_link": "https://serpapi.com/search.json?engine=google_maps_reviews&hl=en&place_id=0x51325b1733fa71bf%3A0xd609c9524d75cbc7",
  "gps_coordinates": {
    "latitude": 64.8299557,
    "longitude": -147.8488774
  },
  "place_id_search": "https://serpapi.com/search.json?data=%214m5%213m4%211s0x51325b1733fa71bf%3A0xd609c9524d75cbc7%218m2%213d64.8299557%214d-147.8488774&engine=google_maps&google_domain=google.com&hl=en&type=place",
  "thumbnail": "https://lh5.googleusercontent.com/p/AF1QipNtwheOCQ97QFrUNIwKYUoAPiV81rpiW5cIiQco=w152-h86-k-no",
  "rating": 3.9,
  "reviews": 839,
  "price": "$$",
  "type": [
    "American restaurant"
  ],
  "description": "Burgers, seafood, steak & river views. Pub fare alongside steak & seafood, served in a dining room with river views & a waterfront patio.",
  "service_options": {
    "dine_in": true,
    "curbside_pickup": true,
    "delivery": false
  }
}
Code to integrate:
import os
from serpapi import GoogleSearch
params = {
"engine": "google_maps",
"type": "search",
"q": "pike's landing",
"ll": "@40.7455096,-74.0083012,14z",
"google_domain": "google.com",
"api_key": os.getenv("API_KEY"),
}
search = GoogleSearch(params)
results = search.get_dict()
reviews = results["place_results"]["reviews"]
print(reviews)
Output:
839
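The direct lookup results["place_results"]["reviews"] raises KeyError when either key is missing from the response. A defensive sketch follows, assuming the response has the shape shown in the example JSON. The helper name get_review_count is hypothetical, and the dictionary below is a trimmed copy of that example rather than a live API response.

```python
def get_review_count(results):
    """Return the review count from a result dict, or None if it is missing."""
    place = results.get("place_results")
    if not isinstance(place, dict):
        return None
    return place.get("reviews")


# Trimmed copy of the example JSON, for illustration only.
example = {
    "place_results": {
        "title": "Pike's Landing",
        "rating": 3.9,
        "reviews": 839,
    }
}

print(get_review_count(example))
print(get_review_count({}))
```

In the integration code, this would be called as get_review_count(search.get_dict()).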
Disclaimer: I work for SerpApi.