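Why do I get the output I want when I add time.sleep(2), but fewer results when I instead wait until a specific XPath is present?
Output with time.sleep(2) (this is the desired output):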
Adelaide Utd
Tottenham
Dundee Fc
...
Count: 145 names
Output with time.sleep removed:
Adelaide Utd
Tottenham
Dundee Fc
...
Count: 119 names
I have added:
clickMe = wait(driver, 13).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ("#page-container > div:nth-child(4) > div > div.ubet-sports-section-page > div > div:nth-child(2) > div > div > div:nth-child(1) > div > div > div.page-title-new > h1"))))
since this element appears on all pages.
With that wait in place I get noticeably fewer results. How can I fix this?
Script:
import csv
import time  # only needed for the time.sleep(2) variant

from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException, TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

driver = webdriver.Chrome()
driver.set_window_size(1024, 600)
driver.maximize_window()
driver.get('https://ubet.com/sports/soccer')

# Wait for the competition dropdown, then count its entries.
wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//select[./option="Soccer"]/option')))
options = driver.find_elements(By.XPATH, '//select[./option="Soccer"]/option')

for index in range(len(options)):
    try:
        try:
            # Re-locate the option on every pass instead of reusing an old reference.
            zz = wait(driver, 10).until(
                EC.element_to_be_clickable((By.XPATH, '(//select/optgroup/option)[%s]' % str(index + 1))))
            zz.click()
        except StaleElementReferenceException:
            pass
        # Wait for the page title, which appears on every page.
        clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#page-container > div:nth-child(4) > div > div.ubet-sports-section-page > div > div:nth-child(2) > div > div > div:nth-child(1) > div > div > div.page-title-new > h1")))
        # Collect the team names currently on the page.
        langs0 = driver.find_elements(
            By.CSS_SELECTOR,
            "div > div > div > div > div > div > div > div > div.row.collapse > div > div > div:nth-child(2) > div > div > div > div > div > div.row.small-collapse.medium-collapse > div:nth-child(1) > div > div > div > div.lbl-offer > span")
        langs0_text = []
        for lang in langs0:
            try:
                langs0_text.append(lang.text)
            except StaleElementReferenceException:
                pass
        # Append one team name per row to the CSV file.
        directory = 'C:\\A.csv'
        with open(directory, 'a', newline='', encoding="utf-8") as outfile:
            writer = csv.writer(outfile)
            for row in zip(langs0_text):
                writer.writerow(row)
    except StaleElementReferenceException:
        pass
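You will need a VPN if you cannot access the page.
Update:
Perhaps that element loads before the others. So I changed the wait to target the data being scraped instead (not every page has data to scrape).
Added: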
try:
    clickMe = wait(driver, 13).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ("div > div > div > div > div > div > div > div > div.row.collapse > div > div > div:nth-child(2) > div > div > div > div > div > div.row.small-collapse.medium-collapse > div:nth-child(3) > div > div > div > div.lbl-offer > span"))))
except TimeoutException as ex:
    pass
The same problem persists.
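One thing I suspect, without having confirmed it, is that the wait returns as soon as the first matching span exists while the rest of the list is still rendering. Below is a minimal sketch of a custom wait condition that only succeeds once the number of matches has stopped changing between two consecutive polls. `ElementCountStable` is a name I made up for illustration; the sketch relies only on `WebDriverWait.until` accepting any callable that takes the driver.

```python
class ElementCountStable:
    """Wait condition (hypothetical helper, not part of Selenium).

    Succeeds once the selector matches at least one element and the
    number of matches is the same as on the previous poll.
    """

    def __init__(self, css_selector):
        self.css_selector = css_selector
        self._last_count = None  # match count seen on the previous poll

    def __call__(self, driver):
        # "css selector" is the string value of selenium's By.CSS_SELECTOR.
        elements = driver.find_elements("css selector", self.css_selector)
        count = len(elements)
        stable = count > 0 and count == self._last_count
        self._last_count = count
        # WebDriverWait keeps polling for as long as the condition returns a falsy value.
        return elements if stable else False


# Intended use with the script above (raises TimeoutException on pages without data):
#   names = wait(driver, 13).until(ElementCountStable(team_name_selector))
#   langs0_text = [name.text for name in names]
```

This would not help if the previous competition's names are still on the page when the wait starts, since their count is already stable by then.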
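Manual steps:
Click "Next to Play" under "Select Competition".
Click the first element in the dropdown (for me it is the English Premier League).
Wait for the page to load by waiting until 'Team name' is present. 'Team name' is defined below.
Scrape the 'Team name' data.
Scrape the 'Team name' data for every element in the dropdown. The next one for me is the English FA Cup. We want to keep scraping 'Team name' until we have gone through the whole dropdown.
Team name =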
div > div > div > div > div > div > div > div > div.row.collapse > div > div > div:nth-child(2) > div > div > div > div > div > div.row.small-collapse.medium-collapse > div:nth-child(1) > div > div > div > div.lbl-offer > span
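The manual steps above can be sketched roughly as follows. This is only a sketch under assumptions: `wait_for`, `scrape_competition` and `scrape_all` are hypothetical helpers, I use a plain polling loop in place of `WebDriverWait` so the sketch depends on nothing but the driver object, and I assume that names identical to the previous competition's mean the old content is still being shown.

```python
import time

# XPath template from the script and the 'Team name' selector defined above.
OPTION_XPATH = "(//select/optgroup/option)[%d]"
TEAM_NAME_CSS = (
    "div > div > div > div > div > div > div > div > div.row.collapse > div > div > "
    "div:nth-child(2) > div > div > div > div > div > "
    "div.row.small-collapse.medium-collapse > div:nth-child(1) > div > div > div > "
    "div.lbl-offer > span"
)


def wait_for(condition, timeout=13.0, poll=0.5, ignored=()):
    """Poll condition() until it is truthy; return [] if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # e.g. the page re-rendered mid-read; poll again
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll)


def scrape_competition(driver, index, previous, **wait_args):
    """Select dropdown entry `index` (1-based) and return its team names."""
    driver.find_element("xpath", OPTION_XPATH % index).click()

    def fresh_names():
        spans = driver.find_elements("css selector", TEAM_NAME_CSS)
        names = [span.text for span in spans]
        # Assumption: an unchanged list means the previous page is still displayed.
        return names if names and names != previous else None

    return wait_for(fresh_names, **wait_args)


def scrape_all(driver, option_count, **wait_args):
    """Walk the whole dropdown and collect (index, names) pairs."""
    collected, previous = [], []
    for index in range(1, option_count + 1):
        names = scrape_competition(driver, index, previous, **wait_args)
        collected.append((index, names))
        previous = names or previous  # keep the last non-empty page for comparison
    return collected
```

With a real driver I would call something like `scrape_all(driver, len(options), ignored=(StaleElementReferenceException,))`. A competition with no data costs the full timeout and comes back as an empty list.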