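Getting the nth occurrence of an element with XPath in Selenium while waiting

Date: 2016-03-16 15:01:05

Tags: python selenium xpath

I am using Selenium to scrape a large static web page. I know in advance how many <a> elements will appear on the page. Because the page is very large, I want to make sure it has fully loaded before I try to scrape it. My solution is to wait until the last <a> element has loaded. I tried using presence_of_element_located, as follows: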

driver.get(url)
try:
    WebDriverWait(driver, 500).until(EC.presence_of_element_located(driver.find_elements_by_xpath('//*[@title="View recipe"]')[count]))
except TimeoutException:

But it raises an error:

Traceback (most recent call last):


File "/home/noname365/siteCrawler/test.py", line 28, in <module>
    WebDriverWait(driver, 500).until(EC.presence_of_element_located(driver.find_elements_by_xpath('//*[@title="View recipe on foodily.com"]')[10 -1]))
  File "/home/noname365/virtualenvs/env35/lib/python3.5/site-packages/selenium/webdriver/support/wait.py", line 71, in until
    value = method(self._driver)
  File "/home/noname365/virtualenvs/env35/lib/python3.5/site-packages/selenium/webdriver/support/expected_conditions.py", line 59, in __call__
    return _find_element(driver, self.locator)
  File "/home/noname365/virtualenvs/env35/lib/python3.5/site-packages/selenium/webdriver/support/expected_conditions.py", line 274, in _find_element
    return driver.find_element(*by)
TypeError: find_element() argument after * must be a sequence, not WebElement

What am I doing wrong here?

2 answers:

Answer 0 (score: 2)

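presence_of_element_located(), like the other expected conditions, takes a single argument: a locator tuple whose first item is the locator type and whose second item is the locator value. The code in the question passes a WebElement instead, which is why find_element(*by) raises the TypeError. Pass the tuple: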

from selenium.webdriver.common.by import By
EC.presence_of_element_located((By.XPATH, '//*[@title="View recipe"]'))
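
To wait for the nth match specifically, one option is to put the position into the XPath itself, so the wait keeps polling until that element exists. The following is a minimal sketch, not a definitive implementation: the helper names nth_match_xpath and wait_for_nth_match and the 30-second default timeout are hypothetical and do not come from the question. Note that XPath positions are 1-based, and that the parentheses matter: (//a)[10] selects the tenth match in document order, while //a[10] selects every <a> that is the tenth <a> child of its parent.

```python
def nth_match_xpath(xpath, n):
    """Return an XPath selecting the n-th (1-based) match of `xpath` in document order."""
    if n < 1:
        raise ValueError("XPath positions are 1-based, so n must be >= 1")
    # Wrapping the expression in parentheses applies [n] to the whole result set.
    return "({})[{}]".format(xpath, n)


def wait_for_nth_match(driver, xpath, n, timeout=30):
    """Block until the n-th match of `xpath` is present, then return that element.

    Hypothetical helper; `timeout` (seconds) is an assumed default.
    """
    # Imported here so nth_match_xpath stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, nth_match_xpath(xpath, n))
    # The locator tuple is re-evaluated on every poll, unlike a WebElement
    # looked up once before the wait starts.
    return WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located(locator)
    )
```

With the names from the question, the call would look like wait_for_nth_match(driver, '//*[@title="View recipe"]', count), wrapped in the same try/except TimeoutException.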

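Answer 1 (score: 0)

Why not wait for the page to finish loading? The C# code below should look similar in Python, and the JavaScript part should be exactly the same.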

protected void WaitForDocumentReadyStateComplete()
{
    try
    {
        new WebDriverWait(target.Driver, TimeSpan.FromSeconds(DefaultTimeoutInSeconds)).Until(
            d => ((IJavaScriptExecutor) d).ExecuteScript("return document.readyState").Equals("complete"));
        // Safari (Mac) sometimes hangs for 30 seconds then throws WebDriverTimeoutException => can safely be ignored
    }
    catch (Exception)
    {
        if (!target.IsSafari)
        {
            // MSIE (Win) sometimes throws "UnexpectedJavaScriptError" => Workaround: wait maximum time
            Thread.Sleep(DefaultTimeoutInSeconds * 1000);
        }
    }
}
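
A rough Python counterpart of the readyState check above might look like the sketch below. It only covers the wait itself and leaves out the Safari and MSIE workarounds; the function names and the 30-second default timeout are hypothetical, not part of the original answer.

```python
def document_is_complete(driver):
    """Return True once the browser reports that the document has finished loading."""
    return driver.execute_script("return document.readyState") == "complete"


def wait_for_document_ready(driver, timeout=30):
    """Poll document.readyState until it is 'complete' or `timeout` seconds pass.

    Hypothetical helper; raises selenium's TimeoutException when the wait expires.
    """
    # Imported here so document_is_complete stays usable without Selenium installed.
    from selenium.webdriver.support.ui import WebDriverWait

    # until() accepts any callable that takes the driver and returns a truthy value.
    return WebDriverWait(driver, timeout).until(document_is_complete)
```

This only tells you that the initial document has loaded. For the static page in the question that may be enough, but it says nothing about elements added later by scripts.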

Or you can wait for the expected URL (this may not work if you reload the page and the URL does not change):

new WebDriverWait(target.Driver, TimeSpan.FromSeconds(DefaultTimeoutInSeconds)).Until(
    ExpectedConditions.UrlMatches(MatchUrl));
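
The URL wait can be sketched in Python with a small predicate of your own. The names current_url_matches and wait_for_url and the default timeout are hypothetical; recent releases of the Selenium Python bindings also appear to ship a comparable expected_conditions.url_matches, so check your installed version before rolling your own.

```python
import re


def current_url_matches(pattern):
    """Build a wait condition that is true when `pattern` is found in the current URL."""
    compiled = re.compile(pattern)

    def _condition(driver):
        # re.search matches anywhere in the URL, not only at the start.
        return compiled.search(driver.current_url) is not None

    return _condition


def wait_for_url(driver, pattern, timeout=30):
    """Block until the driver's current URL matches `pattern` (hypothetical helper)."""
    # Imported here so current_url_matches stays usable without Selenium installed.
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(current_url_matches(pattern))
```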