How do I get the next page's reviews with Selenium?

Time: 2019-10-01 13:46:21

Tags: python selenium web-scraping beautifulsoup web-crawler

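I am trying to scrape 10 or more pages of reviews from https://www.innisfree.com/kr/ko/ProductReviewList.do

However, when I move to the next page and try to get the reviews on the new page, I still get only the reviews from the first page.

I use driver.execute_script("goPage(2)") and time.sleep(5), but my code only gives me the first page's reviews.

(I did not use a for loop, only so I could check whether the results differ between page 1 and page 2. I imported BeautifulSoup and Selenium.)

Here is my code: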

url = "https://www.innisfree.com/kr/ko/ProductReviewList.do"
chromedriver = r'C:\Users\hhm\Downloads\chromedriver_win32\chromedriver.exe'
driver = webdriver.Chrome(chromedriver)
driver.get(url)

print("this is page 1")
driver.execute_script("goPage(1)")
nTypes = soup.select('.reviewList ul .newType div[class^=reviewCon] .reviewConTxt')

for nType in nTypes:
    product = nType.select_one('.pdtName').text
    print(product)

print('\n')
print("this is page 2")
driver.execute_script("goPage(2)")
time.sleep(5)
nTypes = soup.select('.reviewList ul .newType div[class^=reviewCon] .reviewConTxt')

for nType in nTypes:
    product = nType.select_one('.pdtName').text
    print(product)

2 Answers:

Answer 0 (score: 0)

If the second page opens in a new window, you need to switch Selenium's control to that window before reading it.

Example:

# Opens a new tab
self.driver.execute_script("window.open()")

# Switch to the newly opened tab
self.driver.switch_to.window(self.driver.window_handles[1])

Sources:

How to switch to new window in Selenium for Python?

https://www.techbeamers.com/switch-between-windows-selenium-python/
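The example in this answer hard-codes index 1 of window_handles. A slightly more defensive variant, given here only as a minimal sketch, remembers the handles that existed before the new window opened and switches to whichever handle is new. The helper name switch_to_new_window and its known_handles parameter are hypothetical; only driver.window_handles and driver.switch_to.window come from Selenium.

```python
def switch_to_new_window(driver, known_handles):
    """Switch to the first window handle that is not in known_handles.

    Returns the new handle, or None if no new window has appeared.
    (Hypothetical helper, not part of Selenium.)
    """
    for handle in driver.window_handles:
        if handle not in known_handles:
            driver.switch_to.window(handle)
            return handle
    return None
```

A possible use: store before = set(driver.window_handles), trigger the action that opens the window, then call switch_to_new_window(driver, before). A None result means nothing new opened, which would suggest the page changed in place rather than in a new window.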

Answer 1 (score: 0)

Try the following code. You need to click each pagination link to move to the next page. With this you will get all 100 reviews.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import time

url = "https://www.innisfree.com/kr/ko/ProductReviewList.do"
chromedriver = r'C:\Users\hhm\Downloads\chromedriver_win32\chromedriver.exe'
# Selenium 3 style call; Selenium 4 takes the driver path through a Service object.
driver = webdriver.Chrome(chromedriver)
driver.get(url)

# i is the number of the next page to open; pages 1 to 10 are scraped.
for i in range(2, 12):
    time.sleep(2)  # give the current page's reviews time to load
    # Re-parse the page source on every pass so the soup reflects the current page.
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    nTypes = soup.select('.reviewList ul .newType div[class^=reviewCon] .reviewConTxt')
    for nType in nTypes:
        product = nType.select_one('.pdtName').text
        print(product)
    if i == 11:
        break  # page 10 was the last page to scrape
    # Wait until the link to page i is clickable, then click it through JavaScript.
    nextbutton = WebDriverWait(driver, 10).until(
        EC.element_to_be_clickable((By.XPATH, "//span[@class='num']/a[text()='" + str(i) + "']")))
    driver.execute_script("arguments[0].click();", nextbutton)
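One reason this loop returns different reviews per page is that it rebuilds the soup from driver.page_source on every iteration. A BeautifulSoup object is a parsed copy of the HTML string it was given, so a soup created once keeps showing page 1 no matter where the browser goes afterwards. The sketch below isolates that idea. The names collect_pages, extract, pause, and product_names are hypothetical, and it assumes that calling the site's goPage function (taken from the question) does load the requested page, which has not been verified here.

```python
import time


def collect_pages(driver, extract, last_page, pause=2.0):
    """Run extract() on a fresh page_source snapshot for pages 1..last_page."""
    results = []
    for page in range(1, last_page + 1):
        if page > 1:
            # goPage is the site's own pagination function, as used in the question.
            driver.execute_script("goPage(arguments[0])", page)
            time.sleep(pause)  # crude fixed wait for the new reviews to load
        # Read page_source again on every pass; an older soup is a static copy.
        results.extend(extract(driver.page_source))
    return results


def product_names(html):
    """Return the product names found in one page of review HTML."""
    # Imported here so collect_pages itself has no third-party dependency.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    reviews = soup.select(".reviewList ul .newType div[class^=reviewCon] .reviewConTxt")
    names = []
    for review in reviews:
        name = review.select_one(".pdtName")
        if name is not None:  # skip reviews that have no product name element
            names.append(name.get_text(strip=True))
    return names
```

With a real driver this could be called as collect_pages(driver, product_names, 10). If goPage turns out not to trigger navigation when called this way, clicking the pagination links as in the answer's code is the alternative.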