Question

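I am trying to scrape data from the public website http://www.seaaroundus.org using selenium-webdriver in Python. I am using the following code to scrape the values of the list options on this web page. The list sits inside a scroll box and is only partially visible. When I extract the text from the xpath, it returns only the first 11 items in the list. Is there a way to extract the text of all the items in the list? I tried iterating over the xpaths of the individual items, but they seem to repeat after every 11th item, so the loop breaks down. I have to do this for about 300 similar web pages. Any leads are much appreciated!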

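import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# raw string so the backslashes in the Windows path are not read as escapes
chrome_path = r"C:\Program Files (x86)\chromedriver.exe"
# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service(chrome_path))

#1 open website
driver.get("http://www.seaaroundus.org/data/#/eez/8/exploited-organisms")
time.sleep(5)  # crude fixed wait for the page to render

#xpath of where all the taxa names are listed
x_path = """//*[@id="exploited-organisms"]/sau-taxon-grid/div[2]/div[1]"""

#printing the element's text only prints the first 11 items
print(driver.find_element(By.XPATH, x_path).text)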

Answer 1

As @Florent B. correctly suggested, you can get the data you need simply by making direct HTTP requests to the API.
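A minimal sketch of that idea follows, using only the standard library. The endpoint in API_URL is a hypothetical placeholder, not the site's confirmed API address: the real request URL has to be read from the browser's developer tools (Network tab) while the page loads. The field names "data" and "common_name" are likewise assumptions about the JSON layout and will probably need adjusting.

```python
import json
from urllib.request import urlopen

# ASSUMPTION: hypothetical placeholder, not the confirmed endpoint.
# Replace it with the request URL seen in the browser's Network tab.
API_URL = "http://api.example.org/eez/8/exploited-organisms"


def fetch_json(url, timeout=30):
    """Download a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def taxon_names(payload, list_key="data", name_key="common_name"):
    """Return every name found in the payload, in order.

    list_key and name_key are assumed field names; rows that lack
    name_key are skipped instead of raising an error.
    """
    rows = payload.get(list_key, [])
    return [row[name_key] for row in rows if name_key in row]


# Made-up payload in the assumed shape, used only to show the parsing step.
sample = {
    "data": [
        {"common_name": "Atlantic cod"},
        {"common_name": "Haddock"},
        {"scientific_name": "Gadus morhua"},  # no common_name, so skipped
    ]
}
names = taxon_names(sample)

# With a real endpoint the call would look like:
# names = taxon_names(fetch_json(API_URL))
```

Because the response contains the whole list rather than only the rows currently rendered in the scroll box, the 11-item limit does not come into play, and looping over the roughly 300 pages reduces to changing the identifier in the URL.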

Scraping all text from a partially visible scroll box using Selenium in Python
