Question

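I came across a website that I would like to get some data from. But with my limited Python knowledge, the site seems to be unscrapable. When I use driver.find_element_by_xpath, I often run into timeout exceptions.

With the code I provide below, I want to click the first result and go to the new page. On the new page, I want to scrape the product title and the pack sizes. But no matter what I try, I cannot get Python to click the right thing for me, let alone scrape the data. Can anyone help?

The output I want is:

Tris(triphenylphosphine)rhodium(I) chloride, 98% 190420010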
1 GR 87.60
5 GR 367.50

This is my code so far:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "http://www.acros.com/"
cas = "14694-95-2"  # need to select for the appropriate one

driver = webdriver.Firefox()
driver.get(url)

country = driver.find_element_by_name("ddlLand")
for option in country.find_elements_by_tag_name("option"):
    if option.text == "United States":
        option.click()
driver.find_element_by_css_selector("input[type = submit]").click() 

choice = driver.find_element_by_name("_ctl1:DesktopThreePanes1:ThreePanes:_ctl4:ddlType")
for option in choice.find_elements_by_tag_name("option"):
    if option.text == "CAS registry number":
        option.click()

inputElement = driver.find_element_by_id("_ctl1_DesktopThreePanes1_ThreePanes__ctl4_tbSearchString")
inputElement.send_keys(cas)
driver.find_element_by_id("_ctl1_DesktopThreePanes1_ThreePanes__ctl4_btnGo").click()

Answer 1

The code you provided works fine, in that it directs the Firefox instance to http://www.acros.com/DesktopModules/Acros_Search_Results/Acros_Search_Results.aspx?search_type=CAS&SearchString=14694-95-2, which displays the search results.
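Since the search ends up at a plain query-string URL, the same address can probably be built straight from the CAS number. This is a minimal sketch, assuming the site accepts direct navigation to this URL without the country-selection step (not verified here); `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Base address copied from the search-results URL quoted above.
SEARCH_RESULTS_URL = (
    "http://www.acros.com/DesktopModules/"
    "Acros_Search_Results/Acros_Search_Results.aspx"
)


def build_search_url(cas):
    """Return the search-results URL for a CAS registry number."""
    # Parameter names are taken from the quoted URL; urlencode escapes
    # any characters that are not safe in a query string.
    query = urlencode({"search_type": "CAS", "SearchString": cas})
    return SEARCH_RESULTS_URL + "?" + query


print(build_search_url("14694-95-2"))
```

If the assumption holds, `driver.get(build_search_url(cas))` could stand in for the form-filling steps.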

If you locate the iframe element on that page:

<iframe id="searchAllFrame" allowtransparency="" background-color="transparent" frameborder="0" width="1577" height="3000" scrolling="auto" src="http://newsearch.chemexper.com/misc/hosted/acrosPlugin/center.shtml?query=14694-95-2&amp;searchType=CAS&amp;currency=&amp;country=NULL&amp;language=EN&amp;forGroupNames=AcrosOrganics,FisherSci,MaybridgeBB,BioReagents,FisherLCMS&amp;server=www.acros.com"></iframe>

and switch to that frame with driver.switch_to.frame, then I think the data you want should be scrapable from there, for example:

driver.switch_to.frame(driver.find_element_by_xpath("//iframe[@id='searchAllFrame']"))

You can then continue to use the driver as usual to find elements inside that iframe. (I think switch_to_frame works similarly, but it is deprecated.)

(I can't seem to find a decent documentation link for switch_to; this one wasn't much help.)
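The timeouts in the question may come from looking for elements before the iframe has loaded, or from searching the outer page instead of the frame. Below is a minimal sketch of waiting for the frame and then switching into it. `switch_to_frame_when_ready` is a hypothetical helper, the default timeout and polling interval are assumptions, and the hand-rolled polling loop stands in for Selenium's own `WebDriverWait` combined with `expected_conditions.frame_to_be_available_and_switch_to_it`.

```python
import time


def switch_to_frame_when_ready(driver, frame_id, timeout=10.0, poll=0.5):
    """Poll until an iframe with the given id exists, then switch into it.

    `driver` is any Selenium WebDriver. "id" is the locator strategy
    string that selenium.webdriver.common.by.By.ID stands for.
    """
    deadline = time.monotonic() + timeout
    while True:
        # find_elements returns an empty list instead of raising when
        # nothing matches, which keeps the polling loop simple.
        frames = driver.find_elements("id", frame_id)
        if frames:
            driver.switch_to.frame(frames[0])
            return frames[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "iframe %r did not appear within %.1f s" % (frame_id, timeout)
            )
        time.sleep(poll)
```

After `switch_to_frame_when_ready(driver, "searchAllFrame")`, element lookups apply inside the frame; `driver.switch_to.default_content()` returns to the outer page.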

python selenium webscraping - unable to get data
