Wait-until does not wait for the element to load, so the data output is incorrect

Date: 2017-12-26 10:40:29

Tags: python selenium xpath web-scraping

Why do I get the output I want when I add time.sleep(2), but fewer results when I instead wait until a specific XPath is present?

Output with time.sleep(2) (the desired output):

Adelaide Utd
Tottenham
Dundee Fc
 ...

Count: 145 names

With time.sleep removed:

Adelaide Utd
Tottenham
Dundee Fc
 ...

Count: 119 names

I have added:

clickMe = wait(driver, 13).until(EC.element_to_be_clickable(
    (By.CSS_SELECTOR,
     "#page-container > div:nth-child(4) > div > "
     "div.ubet-sports-section-page > div > div:nth-child(2) > div > div > "
     "div:nth-child(1) > div > div > div.page-title-new > h1")))

since this element appears on all pages.

It still collects far fewer names. How can I fix this?
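A likely cause (my reading of the symptoms, not confirmed against the site): `element_to_be_clickable` on the page title succeeds as soon as that single header exists, which can be well before the name rows finish rendering, so the scrape starts early; `time.sleep(2)` just happens to outlast the rendering. One common workaround is a custom wait condition that succeeds only once the number of matched elements stops changing between polls. The `CountStable` class below is my own sketch of that idea (the name is made up); it works because `WebDriverWait.until` accepts any callable that takes the driver.

```python
# Hedged sketch (my own helper, not part of Selenium): a wait condition
# that only succeeds once the number of elements matching a locator is
# non-zero and unchanged between two successive polls.

class CountStable:
    """Callable condition: the match count must repeat before we return."""

    def __init__(self, locator):
        self.locator = locator   # e.g. (By.CSS_SELECTOR, "... > span")
        self.last_count = -1

    def __call__(self, driver):
        elements = driver.find_elements(*self.locator)
        if elements and len(elements) == self.last_count:
            return elements      # stable: until() returns the elements
        self.last_count = len(elements)
        return False             # not stable yet: WebDriverWait keeps polling
```

With this, the header wait could be replaced by something like `langs0 = wait(driver, 13, poll_frequency=1).until(CountStable((By.CSS_SELECTOR, "... > span")))`, which hands back the name elements themselves once their count has settled.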

The script:

import csv

from selenium import webdriver
from selenium.common.exceptions import (StaleElementReferenceException,
                                        TimeoutException)
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait


driver = webdriver.Chrome()
driver.maximize_window()

driver.get('https://ubet.com/sports/soccer')

# Wait for the sport selector, then collect its options.
clickMe = wait(driver, 10).until(EC.element_to_be_clickable(
    (By.XPATH, '//select[./option="Soccer"]/option')))
options = driver.find_elements_by_xpath('//select[./option="Soccer"]/option')

for index in range(len(options)):
    try:
        try:
            zz = wait(driver, 10).until(EC.element_to_be_clickable(
                (By.XPATH, '(//select/optgroup/option)[%s]' % str(index + 1))))
            zz.click()
        except StaleElementReferenceException:
            pass

        # Wait for the page title, which appears on every page.
        clickMe = wait(driver, 10).until(EC.element_to_be_clickable(
            (By.CSS_SELECTOR,
             "#page-container > div:nth-child(4) > div > "
             "div.ubet-sports-section-page > div > div:nth-child(2) > div > "
             "div > div:nth-child(1) > div > div > div.page-title-new > h1")))

        langs0 = driver.find_elements_by_css_selector(
            "div > div > div > div > div > div > div > div > "
            "div.row.collapse > div > div > div:nth-child(2) > div > div > "
            "div > div > div > div.row.small-collapse.medium-collapse > "
            "div:nth-child(1) > div > div > div > div.lbl-offer > span")
        langs0_text = []
        for lang in langs0:
            try:
                langs0_text.append(lang.text)
            except StaleElementReferenceException:
                pass

        directory = 'C:\\A.csv'
        with open(directory, 'a', newline='', encoding="utf-8") as outfile:
            writer = csv.writer(outfile)
            for row in zip(langs0_text):
                writer.writerow(row)
    except StaleElementReferenceException:
        pass

If you cannot access the page, you will need a VPN.

Update...

Maybe that element loads before the other elements do. So what if we change the wait to target the scraped data itself (though not all pages have data to scrape)?

Adding:

try:
    clickMe = wait(driver, 13).until(EC.element_to_be_clickable(
        (By.CSS_SELECTOR,
         "div > div > div > div > div > div > div > div > "
         "div.row.collapse > div > div > div:nth-child(2) > div > div > "
         "div > div > div > div.row.small-collapse.medium-collapse > "
         "div:nth-child(3) > div > div > div > div.lbl-offer > span")))
except TimeoutException as ex:
    pass

The same problem remains.
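For a page that legitimately has no offers, a timeout on the data wait is expected rather than an error, so the try/except above swallows it, but the wait still burns the full 13 seconds on every empty page. The behaviour being relied on can be sketched in plain Python with no Selenium at all (names here are my own, not a library API): poll a condition for a bounded time and fall back to an empty list instead of raising.

```python
import time

def poll_until(condition, timeout=13.0, interval=0.5):
    """Poll `condition` (a no-argument callable) until it returns a
    truthy value or `timeout` seconds pass; on timeout return an empty
    list, so pages with no offers yield no rows instead of an exception."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            return []
        time.sleep(interval)
```

In the script this would wrap the element lookup, e.g. `langs0 = poll_until(lambda: driver.find_elements_by_css_selector(selector))`, with `selector` being the team-names selector defined below.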

Manual steps:

  1. Open the page https://ubet.com/sports/soccer/nexttoplay/

  2. Click the next-to-play games under the competition selector.

  3. Click the first element in the dropdown (for me that is English Premier).

  4. Wait for the page to load by waiting until the 'team names' are present. 'Team names' is defined below.

  5. Scrape the 'team names' data.

  6. Scrape the 'team names' data for every element in the dropdown. The next one for me is England FA Cup. We want to keep scraping the 'team names' data until we have gone through the entire dropdown.

  7. Team names =

    div > div > div > div > div > div > div > div > div.row.collapse > div > div > div:nth-child(2) > div > div > div > div > div > div.row.small-collapse.medium-collapse > div:nth-child(1) > div > div > div > div.lbl-offer > span 
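The loop in steps 2-6 can be sketched abstractly. The stale-element errors in the script suggest that option elements found before a click are invalidated once the page updates, so re-fetching the option list on every pass avoids holding stale references. `iterate_options` below is my own illustration in plain Python (`find_options` stands in for the Selenium option lookup, `handle` for the click-wait-scrape body of the loop):

```python
def iterate_options(find_options, handle):
    """Steps 2-6 sketched abstractly: re-fetch the option list on every
    pass (so a page update cannot leave us holding a stale element),
    process one option per pass, and stop once all have been visited.
    Returns the number of options handled."""
    visited = 0
    while True:
        options = find_options()      # fresh lookup each iteration
        if visited >= len(options):
            return visited
        handle(options[visited])      # click + wait + scrape in the script
        visited += 1
```

In the real script, `find_options` would be `lambda: driver.find_elements_by_xpath('//select/optgroup/option')` and `handle` would click the option, wait for the team names, and append them to the CSV.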
    

0 answers