Find the nth element by class name using Selenium Python

Time: 2015-05-11 21:07:51

Tags: python ajax angularjs selenium selector

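I just started using Selenium yesterday to help scrape some data, and I'm having a hard time wrapping my head around its selector engine. I know lxml, BeautifulSoup, jQuery and Sizzle have similar engines. What I'm trying to do is: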

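  1. Wait up to 10 seconds for the page to load completely
  2. Make sure ten or more span.eN elements are present (two load with the initial page, more are injected afterwards)
  3. Then hand the page over to BeautifulSoup
  4. Start processing the data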

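I'm struggling to find a Selenium condition that either locates the nth element or locates specific text that exists only in the nth element. I keep getting errors (timeouts, NoSuchElement, etc.).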

        url = "http://someajaxiandomain.com/that-injects-html-after-pageload.aspx"
        wd = webdriver.Chrome()
        wd.implicitly_wait(10)
        wd.get(url)
        # what I've tried
        # .find_element_by_xpath("//span[@class='eN'][10]"))
        # .until(EC.text_to_be_present_in_element(By.CSS_SELECTOR, "css=span[class='eN']:contains('foo')"))
    

1 Answer:

Answer 0: (score: 3)

You need to get familiar with the concepts of Explicit Waits and Expected Conditions.
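As a rough illustration of what an explicit wait does, here is a simplified pure-Python stand-in for the polling loop. It is not Selenium's actual WebDriverWait code: `poll_until` and `FakePage` are hypothetical names used only in this sketch, and the real WebDriverWait additionally ignores NoSuchElementException by default and raises its own TimeoutException when time runs out.

```python
import time


def poll_until(condition, driver, timeout=10.0, poll_frequency=0.5):
    """Call condition(driver) repeatedly until it returns a truthy value.

    Simplified stand-in for an explicit wait (hypothetical helper,
    not part of Selenium). Raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition was not met within %.1fs" % timeout)
        time.sleep(poll_frequency)


class FakePage(object):
    """Hypothetical stand-in for a driver: one more element appears per poll."""

    def __init__(self):
        self.element_count = 0

    def find_elements(self, by, value):
        self.element_count += 1
        return ["<span.eN #%d>" % i for i in range(self.element_count)]


page = FakePage()
found = poll_until(
    lambda d: len(d.find_elements("css selector", "span.eN")) >= 3,
    page,
    timeout=2.0,
    poll_frequency=0.01,
)
print(found)  # the truthy value returned by the condition
```

The condition is just a callable that receives the driver, which is why both a lambda and a class with `__call__` work as Expected Conditions.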

In your case, you can write a custom Expected Condition that waits until the number of elements found by a locator is at least n:

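from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.support import expected_conditions as EC

class wait_for_n_elements_to_be_present(object):
    """Expected Condition: at least `count` elements match `locator`."""
    def __init__(self, locator, count):
        self.locator = locator  # (By.<strategy>, value) tuple
        self.count = count

    def __call__(self, driver):
        try:
            # Public API instead of the private EC._find_elements helper;
            # find_elements returns an empty list when nothing matches.
            elements = driver.find_elements(*self.locator)
            return len(elements) >= self.count
        except StaleElementReferenceException:
            return False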

Usage:

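from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

n = 10  # specify how many elements to wait for

wait = WebDriverWait(driver, 10)  # driver is the WebDriver instance (wd in the question)
wait.until(wait_for_n_elements_to_be_present((By.CSS_SELECTOR, 'span.eN'), n))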
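If the goal is the nth element itself rather than just a count, the same pattern can return it, because `WebDriverWait.until` hands back whatever truthy value the condition returns. The following is a minimal sketch under that assumption. `nth_element_present` and `FakeDriver` are hypothetical names; the fake driver only stands in for a real WebDriver so that the snippet runs without a browser, and the string "css selector" is the value of `By.CSS_SELECTOR`.

```python
class nth_element_present(object):
    """Condition returning the nth (1-based) match, or False if fewer exist."""

    def __init__(self, locator, n):
        self.locator = locator  # (strategy, value) tuple, e.g. (By.CSS_SELECTOR, 'span.eN')
        self.n = n

    def __call__(self, driver):
        elements = driver.find_elements(*self.locator)
        if len(elements) >= self.n:
            return elements[self.n - 1]
        return False


class FakeDriver(object):
    """Hypothetical stand-in for a WebDriver, used only to run the sketch."""

    def __init__(self, elements):
        self.elements = elements

    def find_elements(self, by, value):
        return list(self.elements)


condition = nth_element_present(("css selector", "span.eN"), 3)
too_few = condition(FakeDriver(["a", "b"]))
third = condition(FakeDriver(["a", "b", "c", "d"]))
print(too_few, third)
```

With a real browser this would be used as `element = WebDriverWait(driver, 10).until(nth_element_present((By.CSS_SELECTOR, 'span.eN'), 10))`.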

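You could probably also just use one of the built-in Expected Conditions, such as presence_of_element_located or visibility_of_element_located, and wait for a single span.eN element to be present or visible, for example: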

wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'span.eN')))
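A note on the XPath attempt from the question: `//span[@class='eN'][10]` applies the index per parent, so it matches a span only if it is the tenth matching child of its own parent. Wrapping the path in parentheses indexes the combined result in document order instead. The sketch below builds such an expression; `nth_by_class_xpath` is a hypothetical helper, and `@class='eN'` only matches when the class attribute is exactly that value.

```python
def nth_by_class_xpath(tag, class_name, n):
    """Build an XPath for the nth (1-based) element in document order.

    The parentheses make [n] apply to the whole node set rather than
    to each parent's children separately.
    """
    if n < 1:
        raise ValueError("XPath positions are 1-based")
    return "(//%s[@class='%s'])[%d]" % (tag, class_name, n)


xpath = nth_by_class_xpath("span", "eN", 10)
print(xpath)  # (//span[@class='eN'])[10]
```

The resulting string can be passed to `driver.find_element(By.XPATH, xpath)` once the wait above has confirmed that enough elements exist.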