Question

I have a list that is loaded dynamically via AJAX. Initially, while it is loading, its markup looks like this:

<ul><li class="last"><a class="loading" href="#"><ins>&nbsp;</ins>Загрузка...</a></li></ul>

Once the list has loaded, all of its li and a elements are replaced. There is always more than one li. Like this:

<ul class="ltr">
<li id="t_b_68" class="closed" rel="simple">
<a id="t_a_68" href="javascript:void(0)">Category 1</a>
</li>
<li id="t_b_64" class="closed" rel="simple">
<a id="t_a_64" href="javascript:void(0)">Category 2</a>
</li>
...

I need to check whether the list has loaded, so I check whether it contains more than one li.

So far I have tried:

1) A custom wait condition

class more_than_one(object):
    def __init__(self, selector):
        self.selector = selector

    def __call__(self, driver):
        elements = driver.find_elements_by_css_selector(self.selector)
        # True once more than one element matches the selector
        return len(elements) > 1

...

from selenium.common.exceptions import TimeoutException

try:
    query = WebDriverWait(driver, 30).until(more_than_one('li'))
except TimeoutException:
    print("Bad crap")
else:
    # Then load ready list
    pass

2) A custom function based on find_elements_by

import time

def wait_for_several_elements(driver, selector, min_amount, limit=60):
    """
    Wait until more than <min_amount> elements match <selector>,
    giving up after <limit> seconds.
    """
    step = 1   # polling interval in seconds
    current_wait = 0
    while current_wait < limit:
        print("Waiting... " + str(current_wait))
        try:
            query = driver.find_elements_by_css_selector(selector)
        except Exception:
            query = []   # treat a failed lookup as "nothing found yet"
        if len(query) > min_amount:
            print("Found!")
            return True
        time.sleep(step)
        current_wait += step

    return False

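This does not work because the driver (the current element passed to this function) gets lost in the DOM. The UL does not change, but for some reason Selenium can no longer find it.

3) An explicit fixed wait (sleeping for a set time). This is bad because some lists load instantly and others take more than 10 seconds. With this technique I would have to wait the maximum time on every occurrence, which is very bad for my case.

4) I also cannot get waiting for child elements via XPATH to work properly. This one only waits for the ul to appear.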

try:
    print "Going to nested list..."
    #time.sleep(WAIT_TIME)
    query = WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.XPATH, './/ul')))
    nested_list = child.find_element_by_css_selector('ul')

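Please tell me the right way to make sure that several child elements have been loaded for a specified element.

P.S. All of these checks and searches should be relative to the current element.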

Answer 1

Here is how I solved the problem of waiting until a certain number of posts had finished loading via AJAX:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# create a new Chrome session
driver = webdriver.Chrome()

# navigate to your web app.
driver.get("http://my.local.web")

# get the "see more" button
seemore_button = driver.find_element(By.ID, "seemoreID")

# click it to trigger the AJAX request that loads more posts
seemore_button.click()

# wait up to 30 seconds until the AJAX-loaded posts are visible
WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "post")))

# get the list of posts
listpost = driver.find_elements(By.CLASS_NAME, "post")

Answer 2

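(1) You did not mention what error you get with it.

(2) You mention

...because the driver (the current element passed to this function)...

I assume that this is actually a WebElement. In that case, instead of passing the object itself to your method, pass only the selector that finds that WebElement (the ul in your case). If "the driver gets lost in the DOM", re-locating it inside the while current_wait < limit: loop may alleviate the problem.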

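(3) Yes, time.sleep() will only get you so far.

(4) Since the dynamically loaded li elements carry class=closed, instead of (By.XPATH, './/ul') you could try (By.CSS_SELECTOR, 'ul > li.closed') (more details on CSS selectors are in the linked reference).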
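The suggestion in (2), passing a selector instead of a stored WebElement, can be sketched as a custom wait condition. This is only an illustration under stated assumptions: the class name more_children_than and its parameters are hypothetical, and the locator strategy is written as the plain string "css selector" (the value of By.CSS_SELECTOR) so that the snippet has no imports of its own.

```python
CSS_SELECTOR = "css selector"  # same string value as selenium's By.CSS_SELECTOR


class more_children_than(object):
    """Wait condition: truthy once more than `minimum` children match.

    The parent is addressed by selector rather than by a stored WebElement,
    so every poll looks it up afresh instead of reusing a stale reference.
    """

    def __init__(self, parent_selector, child_selector, minimum):
        # One combined selector keeps the search relative to the parent
        self.selector = "%s > %s" % (parent_selector, child_selector)
        self.minimum = minimum

    def __call__(self, driver):
        children = driver.find_elements(CSS_SELECTOR, self.selector)
        # Return the elements themselves so WebDriverWait.until() hands them back
        return children if len(children) > self.minimum else False
```

With a real driver this would presumably be used as WebDriverWait(driver, 30).until(more_children_than('ul.ltr', 'li', 1)), which returns the matched li elements or raises TimeoutException.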

Answer 3

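Keeping in mind the comments from Mr.E. and Arran, I traverse the list entirely with CSS selectors. The tricky parts were my own list's structure and markup (changing classes and so on), and building the required selectors at runtime and keeping them in memory during the traversal.

I wait for several elements by searching for anything that is not in the loading state. You can also use the ":nth-child" selector: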

#in for loop with enumerate for i    
selector.append(' > li:nth-child(%i)' % (i + 1))  # identify child <li> by its order pos
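As a small illustration of how such a selector path grows, the sketch below joins a list of selector parts and appends an :nth-child step for one child. The helper name child_selector and the sample paths are hypothetical; only the 'ul.ltr' root and the ' > li:nth-child(%i)' pattern come from the answer.

```python
def child_selector(parts, index):
    """Build the CSS selector for the index-th (0-based) <li> of the list at `parts`."""
    # :nth-child() counts from 1, hence index + 1
    return ''.join(parts) + ' > li:nth-child(%d)' % (index + 1)


top_level = ['ul.ltr']
nested = ['ul.ltr', ' > li:nth-child(2)', ' > ul']

print(child_selector(top_level, 0))  # ul.ltr > li:nth-child(1)
print(child_selector(nested, 2))     # ul.ltr > li:nth-child(2) > ul > li:nth-child(3)
```

Because each step names a position instead of holding an element reference, the path can be re-evaluated after the DOM has been rebuilt.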

Here is my heavily commented solution, as an example:

def parse_crippled_shifted_list(driver, frame, selector, level=1, parent_id=0, path=None):
    """
    Traversal of html list of special structure (you can't know if element has sub list unless you enter it).
    Supports start from remembered list element.

    Nested lists have classes "closed" and "last closed" when closed and "open" and "last open" when opened (on <li>).
    Elements themselves have classes "leaf" and "last leaf" in both cases.
    Nested lists situate in <li> element as <ul> list. Each <ul> appears after clicking <a> in each <li>.
    If you click <a> of leaf, page in another frame will load.

    driver - WebDriver; frame - frame of the list; selector - selector to current list (<ul>);
    level - level of depth, just for console output formatting, parent_id - id of parent category (in DB),
    path - remained path in categories (ORM objects) to target category to start with.
    """

    # Add current level list elements
    # This method selects all but loading. Just what is needed to exclude.
    selector.append(' > li > a:not([class=loading])')

    # Wait for child list to load
    try:
        query = WebDriverWait(driver, WAIT_LONG_TIME).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, ''.join(selector))))

    except TimeoutException:
        print("%s timed out" % ''.join(selector))

    else:
        # List is loaded
        del selector[-1]  # selector correction: delete last part aimed to get loaded content
        selector.append(' > li')

        children = driver.find_elements(By.CSS_SELECTOR, ''.join(selector))  # fetch list elements

        # Walk the whole list
        for i, child in enumerate(children):

            del selector[-1]  # delete non-unique li tag selector
            if selector[-1] != ' > ul' and selector[-1] != 'ul.ltr':
                del selector[-1]

            selector.append(' > li:nth-child(%i)' % (i + 1))  # identify child <li> by its order pos
            selector.append(' > a')  # add 'li > a' reference to click

            child_link = driver.find_element(By.CSS_SELECTOR, ''.join(selector))

            # If we parse freely further (no need to start from remembered position)
            if not path:
                # Open child
                try:
                    double_click(driver, child_link)
                except InvalidElementStateException as exc:
                    print("\n\nERROR\n", exc.msg, '\n\n')
                else:
                    # Determine its type
                    del selector[-1]  # delete changed and already useless link reference
                    # If <li> is category, it would have <ul> as child now and class="open"
                    # Check by class is priority, because <li> exists for sure.
                    current_li = driver.find_element(By.CSS_SELECTOR, ''.join(selector))

                    # Category case - BRANCH
                    if current_li.get_attribute('class') == 'open' or current_li.get_attribute('class') == 'last open':
                        new_parent_id = process_category_case(child_link, parent_id, level)  # add category to DB
                        selector.append(' > ul')  # forward to nested list
                        # Wait for nested list to load
                        try:
                            query = WebDriverWait(driver, WAIT_LONG_TIME).until(
                                EC.presence_of_all_elements_located((By.CSS_SELECTOR, ''.join(selector))))

                        except TimeoutException:
                            print("\t" * level, "%s timed out (%i secs). Failed to load nested list." %
                                  (''.join(selector), WAIT_LONG_TIME))
                        # Parse nested list
                        else:
                            parse_crippled_shifted_list(driver, frame, selector, level + 1, new_parent_id)

                    # Page case - LEAF
                    elif current_li.get_attribute('class') == 'leaf' or current_li.get_attribute('class') == 'last leaf':
                        process_page_case(driver, child_link, level)
                    else:
                        raise Exception('Damn! Alien class: %s' % current_li.get_attribute('class'))

            # If it's required to continue from specified category
            else:
                # Check if it's required category
                if child_link.text == path[0].name:
                    # Open required category
                    try:
                        double_click(driver, child_link)

                    except InvalidElementStateException as exc:
                        print("\n\nERROR\n", exc.msg, '\n\n')

                    else:
                        # This element of list must be always category (have nested list)
                        del selector[-1]  # delete changed and already useless link reference
                        # If <li> is category, it would have <ul> as child now and class="open"
                        # Check by class is priority, because <li> exists for sure.
                        current_li = driver.find_element(By.CSS_SELECTOR, ''.join(selector))

                        # Category case - BRANCH
                        if current_li.get_attribute('class') == 'open' or current_li.get_attribute('class') == 'last open':
                            selector.append(' > ul')  # forward to nested list
                            # Wait for nested list to load
                            try:
                                query = WebDriverWait(driver, WAIT_LONG_TIME).until(
                                    EC.presence_of_all_elements_located((By.CSS_SELECTOR, ''.join(selector))))

                            except TimeoutException:
                                print("\t" * level, "%s timed out (%i secs). Failed to load nested list." %
                                      (''.join(selector), WAIT_LONG_TIME))
                            # Process this nested list
                            else:
                                last = path.pop(0)
                                if len(path) > 0:  # If more to parse
                                    print("\t" * level, "Going deeper to: %s" % ''.join(selector))
                                    parse_crippled_shifted_list(driver, frame, selector, level + 1,
                                                                parent_id=last.id, path=path)
                                else:  # Current is required
                                    print("\t" * level, "Returning target category: ", ''.join(selector))
                                    path = None
                                    parse_crippled_shifted_list(driver, frame, selector, level + 1, last.id, path=None)

                        # Page case - LEAF
                        elif current_li.get_attribute('class') == 'leaf':
                            pass
                else:
                    print("dummy")

        del selector[-2:]

Answer 4

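The elements in question are AJAX elements.

Now, to locate all of the desired elements and build a list from them, the simplest approach is to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following Locator Strategies:

Using CSS_SELECTOR: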

elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "ul.ltr li[id^='t_b_'] > a[id^='t_a_'][href]")))

Using XPATH:

elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//ul[@class='ltr']//li[starts-with(@id, 't_b_')]/a[starts-with(@id, 't_a_') and starts-with(., 'Category')]")))

Note: you have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

If your use case is to wait for a certain number of elements to load, e.g. 10 elements, you can use a lambda function as follows:

Using >:

myLength = 9
WebDriverWait(driver, 20).until(lambda driver: len(driver.find_elements(By.XPATH, "//ul[@class='ltr']//li[starts-with(@id, 't_b_')]/a[starts-with(@id, 't_a_') and starts-with(., 'Category')]")) > int(myLength))

Using ==:

myLength = 10
WebDriverWait(driver, 20).until(lambda driver: len(driver.find_elements(By.XPATH, "//ul[@class='ltr']//li[starts-with(@id, 't_b_')]/a[starts-with(@id, 't_a_') and starts-with(., 'Category')]")) == int(myLength))

You can find a relevant discussion in How to wait for number of elements to be loaded using Selenium and Python

Answer 5

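I created AllEc, which basically piggybacks on the WebDriverWait.until logic.

It keeps waiting until either the timeout occurs or all of the elements have been found.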

from typing import Callable
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.common.exceptions import StaleElementReferenceException

class AllEc(object):
    def __init__(self, *args: Callable, description: str = None):
        self.ecs = args
        self.description = description

    def __call__(self, driver):
        try:
            for fn in self.ecs:
                if not fn(driver):
                    return False
            return True
        except StaleElementReferenceException:
            return False

# usage example:
wait = WebDriverWait(driver, timeout)
ec1 = EC.invisibility_of_element_located(locator1)
ec2 = EC.invisibility_of_element_located(locator2)
ec3 = EC.invisibility_of_element_located(locator3)

all_ec = AllEc(ec1, ec2, ec3, description="Required elements to show page has loaded.") 
found_elements = wait.until(all_ec, "Could not find all expected elements")

Alternatively, I created AnyEc to look for multiple elements but return on the first one found.

class AnyEc(object):
    """
    Use with WebDriverWait to combine expected_conditions in an OR.

    Example usage:

        >>> wait = WebDriverWait(driver, 30)
        >>> either = AnyEc(expectedcondition1, expectedcondition2, expectedcondition3, etc...)
        >>> found = wait.until(either, "Cannot find any of the expected conditions")
    """

    def __init__(self, *args: Callable, description: str = None):
        self.ecs = args
        self.description = description

    def __iter__(self):
        return self.ecs.__iter__()

    def __call__(self, driver):
        for fn in self.ecs:
            try:
                rt = fn(driver)
                if rt:
                    return rt
            except TypeError as exc:
                raise exc
            except Exception as exc:
                # print(exc)
                pass

    def __repr__(self):
        return " ".join(f"{e!r}," for e in self.ecs)

    def __str__(self):
        return f"{self.description!s}"

either = AnyEc(ec1, ec2, ec3)
found_element = wait.until(either, "Could not find any of the expected elements")

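Finally, where possible, you can try waiting for the Ajax activity to finish. This is not useful in every case, for example when Ajax is always active. It works when the Ajax request runs and then completes. Some Ajax libraries also do not set the active property, so double-check that you can rely on it.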

def is_ajax_complete(driver):
    # jQuery.active is the number of AJAX requests currently in flight
    rt = driver.execute_script("return jQuery.active")
    return rt == 0

wait.until(lambda driver: is_ajax_complete(driver), "Ajax did not finish")
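Since the answer warns that the active property cannot always be relied on, a slightly more defensive variant might first check that jQuery exists on the page at all. The sketch below is an assumption-laden illustration, not part of the original answer: the function name is_jquery_idle and the choice to raise RuntimeError are hypothetical.

```python
def is_jquery_idle(driver):
    """Return True when jQuery reports no AJAX requests in flight."""
    # A missing jQuery comes back as JavaScript null, i.e. Python None
    active = driver.execute_script(
        "return (typeof jQuery === 'undefined') ? null : jQuery.active;")
    if active is None:
        raise RuntimeError("jQuery is not loaded on this page")
    return active == 0
```

Raising instead of returning False makes a page without jQuery fail immediately rather than after the full wait timeout. It could be passed to wait.until(is_jquery_idle, "Ajax did not finish") in the same way as the function above.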

Python Selenium: waiting for several elements to load
