Scrolling down a Google Maps web page

Time: 2019-06-26 14:38:51

Tags: python selenium web-scraping

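I am trying to collect reviews from a Google Maps web page, but I cannot find a way to scroll down the page so that all the reviews load.

I am using Python 3 and the Selenium package.

I found various ways to scroll down social media pages (such as Facebook or IG), but that code does not work on Google Maps. I also tried sending the End key to the body tag, which did not work either.

If anyone can help, thanks in advance.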

3 answers:

Answer 0: (score: 1)

You can use Selenium's execute_script to scroll to the bottom of the page.

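from selenium import webdriver

# self.browserProfile is assumed to be a ChromeOptions instance defined elsewhere in your class
browser = webdriver.Chrome('chromedriver.exe', options=self.browserProfile)
browser.get('your url here')
browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")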

If that does not scroll the right area, you can click the div that contains the reviews and then run window.scrollTo.

# Look the div up on every call so a stale element reference is never reused
review_box = lambda: self.browser.find_element_by_xpath("xpath to div")
review_box().click()
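If window.scrollTo still moves the wrong area, one alternative is to scroll the reviews div itself. This is not part of the answer above, only a minimal sketch under some assumptions: `driver` is a Selenium WebDriver, `container` is the WebElement for the reviews div, and the names `scroll_to_bottom`, `pause`, and `max_rounds` are hypothetical. It keeps setting the container's scrollTop until its scrollHeight stops growing.

```python
import time


def scroll_to_bottom(driver, container, pause=1.0, max_rounds=50):
    """Scroll `container` down until its scrollHeight stops growing.

    Returns the number of scroll rounds performed.
    """
    last_height = driver.execute_script(
        "return arguments[0].scrollHeight;", container
    )
    rounds = 0
    while rounds < max_rounds:
        # Jump to the current bottom of the scrollable element.
        driver.execute_script(
            "arguments[0].scrollTop = arguments[0].scrollHeight;", container
        )
        rounds += 1
        time.sleep(pause)  # give the page time to load more reviews
        new_height = driver.execute_script(
            "return arguments[0].scrollHeight;", container
        )
        if new_height == last_height:
            break  # nothing new was loaded, so we are at the end
        last_height = new_height
    return rounds
```

The `max_rounds` cap is there so that a page that keeps growing cannot loop forever. The pause probably needs tuning for your connection speed.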

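Hope this helps! :)

Answer 1: (score: 0)

You can also get all the reviews without browser automation.

All you need is the data_id (it looks like this: 0x89c259a61c75684f:0x79d31adb123348d2; you can get it from the Maps URL of a place or from the page source).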


After that, you just make a request to: https://www.google.com/async/reviewDialog?hl=en&async=feature_id:0x89c259a61c75684f:0x79d31adb123348d2,sort_by:,next_page_token:,associated_topic:,_fmt:pc

There you will find all the review data along with a next_page_token, which lets you query the next 10 reviews.

In this case, the next_page_token is: EgIICg

So the request URL for the next 10 reviews would be: https://www.google.com/async/reviewDialog?hl=en&async=feature_id:0x89c259a61c75684f:0x79d31adb123348d2,sort_by:,next_page_token:EgIICg,associated_topic:,_fmt:pc
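The two URLs above differ only in the next_page_token field, so building them can be sketched as a small helper. The function name `build_reviews_url` and its parameters are hypothetical. Fetching the response and extracting the next token are left out, because the response format is not described here.

```python
REVIEWS_URL = "https://www.google.com/async/reviewDialog"


def build_reviews_url(data_id, next_page_token="", hl="en"):
    """Build the review request URL for a place and an optional page token."""
    # The async parameter is a comma-separated list of key:value pairs;
    # sort_by and associated_topic are left empty, as in the URLs above.
    async_param = (
        f"feature_id:{data_id},sort_by:,next_page_token:{next_page_token},"
        "associated_topic:,_fmt:pc"
    )
    return f"{REVIEWS_URL}?hl={hl}&async={async_param}"
```

Calling it with only the data_id gives the first-page URL. Passing the token from the previous response (for example `EgIICg`) gives the URL for the next 10 reviews.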

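You can also use a third-party solution such as SerpApi. It is a paid API with a free trial. We handle proxies, solve CAPTCHAs, and parse all the rich structured data for you.

Example Python code (also available in other libraries):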

from serpapi import GoogleSearch

params = {
  "api_key": "secret_api_key",
  "engine": "google_maps_reviews",
  "hl": "en",
  "data_id": "0x89c259a61c75684f:0x79d31adb123348d2"
}

search = GoogleSearch(params)
results = search.get_dict()

Example JSON output:

"place_info": {
  "title": "Stumptown Coffee Roasters",
  "address": "18 W 29th St, New York, NY",
  "rating": 4.6,
  "reviews": 1343
},
"reviews": [
  {
    "user": {
      "name": "Julie Fowler",
      "link": "https://www.google.com/maps/contrib/117619864295803167167?hl=en-US&sa=X&ved=2ahUKEwj53uvVmuDxAhVkGFkFHYUJCDcQvvQBegQIARAy",
      "thumbnail": "https://lh3.googleusercontent.com/a-/AOh14GjzGpT8uNmp89FcD1OZ8nPouEOUwTLUJ4npewsY=s40-c-c0x00000000-cc-rp-mo-br100",
      "reviews": 4
    },
    "rating": 5,
    "date": "4 days ago",
    "snippet": "We popped by one day for a coffee, and the next day for a full-on breakfast. I loved the selection of coffee, teas, and non-alcoholic drinks. Friendly service topped off a great morning! They also have an excellent choice of coffee and a warm place to sit and visit. We will come back and recommend this place to my friends."
  },
  {
    "user": {
      "name": "Aida Penn",
      "link": "https://www.google.com/maps/contrib/100457332315069730904?hl=en-US&sa=X&ved=2ahUKEwj53uvVmuDxAhVkGFkFHYUJCDcQvvQBegQIARA-",
      "thumbnail": "https://lh3.googleusercontent.com/a-/AOh14GjfRec0fNDPIVR5F-68IFjJTLTGd_QYmXE7j5J8=s40-c-c0x00000000-cc-rp-mo-br100",
      "reviews": 1
    },
    "rating": 5,
    "date": "3 days ago",
    "snippet": "We had an early morning tour, and this was one of the only places open before 8 am. We went as soon as they opened to order cake and coffee to go. The staff were also very accommodating and nice even early in the morning. The coffee we got was good, but I wish we had time to go back and try more of the coffee here. I highly recommend this place!"
  },
  {
    "user": {
      "name": "Rhona Warren",
      "link": "https://www.google.com/maps/contrib/106431868819623834724?hl=en-US&sa=X&ved=2ahUKEwj53uvVmuDxAhVkGFkFHYUJCDcQvvQBegQIARBL",
      "thumbnail": "https://lh3.googleusercontent.com/a-/AOh14Gip2mBMR7-oIfh9z6JK1xr2O6SvDd-je_zFuZsq=s40-c-c0x00000000-cc-rp-mo-br100",
      "reviews": 1
    },
    "rating": 5,
    "date": "3 days ago",
    "snippet": "I stopped in for coffee then decided to order an espresso Oreo milkshake. Very delicious, and the prices were very reasonable. The staff was friendly, and there is nice outdoor patio seating. Also, the food was deliciously rich in flavor accompanied by terrific coffee and fresh fruit smoothies. I strongly recommend this place!"
  },
  ...
]
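Assuming `results` has the shape of the sample output above (a "reviews" list whose items carry "user", "rating", and "date"), pulling out a few fields per review might look like the sketch below. The helper name `summarize_reviews` is hypothetical and not part of any library.

```python
def summarize_reviews(results):
    """Return a (name, rating, date) tuple for each review in a results dict."""
    return [
        (review["user"]["name"], review["rating"], review["date"])
        for review in results.get("reviews", [])
    ]
```

A results dict without a "reviews" key yields an empty list instead of raising an error.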

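See the documentation for more details.

Disclaimer: I work at SerpApi.

Answer 2: (score: -1)

One way to scroll down with Selenium is to click a specific element on the page. If you do the following...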

reviews_divs = driver.find_elements_by_class_name('section-review')
reviews_divs[-1].click()

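...it clicks the last review and therefore scrolls to it. Nice, but not quite what you want, because the page does not load new results...

However, if you click the first div after the section-review divs, new reviews load correctly: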

driver.find_element_by_class_name('section-loading').click()

Edit

The code above works on the page of the place you want to test, after you have clicked "Show all reviews":

driver.find_element_by_css_selector("button[class*='__button-text']").click()
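The click on the loading div can be repeated until no new reviews show up. The following is only a rough sketch, not the answerer's code. It assumes the class names section-review and section-loading from this answer are still valid, that the reviews panel is already open, and that `driver` offers the Selenium 3 style `find_elements_by_class_name` used above (newer Selenium releases use `find_elements(By.CLASS_NAME, ...)` instead). The names `load_all_reviews`, `pause`, and `max_clicks` are hypothetical.

```python
import time


def load_all_reviews(driver, pause=1.0, max_clicks=100):
    """Click the loading div until no new reviews appear.

    Returns the number of review elements found at the end.
    """
    count = len(driver.find_elements_by_class_name("section-review"))
    for _ in range(max_clicks):
        loaders = driver.find_elements_by_class_name("section-loading")
        if not loaders:
            break  # no loading div left, so there is nothing more to fetch
        loaders[0].click()
        time.sleep(pause)  # wait for the next batch of reviews to render
        new_count = len(driver.find_elements_by_class_name("section-review"))
        if new_count == count:
            break  # the click loaded nothing new
        count = new_count
    return count
```

Using `find_elements` (plural) for the loading div avoids an exception when the div is gone, since it returns an empty list instead.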