我目前正在做一个关于足球运动员的项目。我正在尝试对足球运动员进行一些公开评论,以进行情绪分析。但是我似乎无法刮擦评论。任何帮助将非常感激。这是我似乎无法做的评论部分。奇怪的是,我让它工作了,但是随后它停止了,我似乎再也没有收到任何评论。我要抓取的网站是:https://sofifa.com/player/192985/kevin-de-bruyne/200025/
likes = []
dislikes = []
follows = []
comments = []
driver_path = '/Users/niallmcnulty/Desktop/GeneralAssembly/Lessons/DSI11-lessons/week05/day2_web_scraping_and_apis/web_scraping/selenium-examples/chromedriver'
driver = webdriver.Chrome(executable_path=driver_path)
# i = 0
for url in tqdm_notebook(urls):
driver.get(url)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
sleep(0.2)
soup1 = BeautifulSoup(driver.page_source,'lxml')
try:
dislike = soup1.find('button', attrs = {'class':'bp3-button bp3-minimal bp3-intent-danger dislike-btn need-sign-in'}).find('span',{'class':'count'}).text.strip()
dislikes.append(dislike)
except:
pass
try:
like = soup1.find('button', attrs = {'class':'bp3-button bp3-minimal bp3-intent-success like-btn need-sign-in'}).find('span',{'class':'count'}).text.strip()
likes.append(like)
except:
pass
try:
follow = soup1.find('button', attrs = {'class':'bp3-button bp3-minimal follow-btn need-sign-in'}).find('span',{'class':'count'}).text.strip()
follows.append(follow)
except:
pass
try:
comment = soup1.find_all('p').text[0:10]
comments.append(comment)
except:
pass
# i += 1
# if i % 5 == 0:
# sentiment = pd.DataFrame({"dislikes":dislikes,"likes":likes,"follows":follows,"comments":comments})
# sentiment.to_csv('/Users/niallmcnulty/Desktop/GeneralAssembly/Lessons/DSI11-lessons/projects/cap-csv/sentiment.csv')
sentiment_final = pd.DataFrame({"dislikes":dislikes,"likes":likes,"follows":follows,"comments":comments})
# df_sent = pd.merge(df, sentiment, left_index=True, right_index=True)
答案 0 :(得分:0)
评论部分是动态加载的。您可以尝试使用驱动程序捕获它,
try:
comment_elements = driver.find_elements_by_tag_name('p')
for comment in comment_elements:
comments.append(comment.text)
except:
pass
print(Comments)