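What I need: to count the number of reviews under an extension in the Chrome Web Store, across all languages. What I did: I tried BeautifulSoup to extract a specific tag. I went through the page's HTML and found a tag for the comments:
I tried this code: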
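from bs4 import BeautifulSoup
import requests
# url is the extension's Chrome Web Store detail page (not shown in the question)
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html5lib')
# Look for the div elements carrying the comment classes
comments = soup.find_all('div', class_='ba-bc-Xb ba-ua-zl-Xb')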
But print(comments) shows an empty list.
At this point I'm stuck, and I've found that I still need to deal with two problems:
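Answer 0 (score: 0)
You can use selenium to perform the task: wait for the page to change, then extract the review count from the PaginationMessage. I tested this with a couple of links. You may need to add error handling for items with no reviews. There also appears to be some POST XHR activity that returns the reviews as a JSON string, which you may want to explore.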
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
url = 'https://chrome.google.com/webstore/detail/evernote-web-clipper/pioclpoplcdbaefihamjohnefbikjilc?hl=en/'
#url = 'https://chrome.google.com/webstore/detail/https-everywhere/gcbommkclmclpchllfjekcdonpmejbdp?hl=en/'
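# Start Chrome and open the extension page
d = webdriver.Chrome()
d.get(url)
# Wait for the element with id ':21' to become visible, then click it
WebDriverWait(d, 5).until(EC.visibility_of_element_located((By.ID, ':21'))).click()
# Press and hold on the dropdown control once it is visible
ActionChains(d).click_and_hold(WebDriverWait(d, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.h-z-Ba-ca.ga-dd-Va.g-aa-ca')))).perform()
# Collect the visible dropdown options and click the second one
languageSelection = WebDriverWait(d, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, '.g-aa-ca-ma-x-L')))
languageSelection[1].click()
# Read the pagination message; its last whitespace-separated token is the count
s = WebDriverWait(d, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.Aa.dc-tf + span'))).text
print(s.split()[-1])
# Close the browser
d.quit()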
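For the error handling mentioned above, one possible approach is to parse the pagination text defensively instead of calling s.split()[-1] directly. The sketch below is only an illustration: parse_review_count is a hypothetical helper name, and the sample strings in the comments are assumed formats, not text confirmed from the store.

```python
import re
from typing import Optional


def parse_review_count(message: str) -> Optional[int]:
    """Return the last whitespace-separated token of `message` as an int.

    Returns None when the message is empty or its last token has no digits.
    The message format is an assumption (for example "1 - 10 of 1,234").
    """
    tokens = message.split()
    if not tokens:
        return None
    # Drop thousands separators and any other non-digit characters
    digits = re.sub(r"[^\d]", "", tokens[-1])
    return int(digits) if digits else None
```

In the script above this would replace print(s.split()[-1]) with print(parse_review_count(s)). Separately, WebDriverWait(...).until(...) raises selenium.common.exceptions.TimeoutException when the element never appears, so wrapping the final wait in try/except and treating a timeout as "no reviews" may cover extensions without any.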
Answer 1 (score: 0)
Try
dfr$date <- dfr[cbind(1:nrow(dfr), dfr$year - 2013)]