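I am trying to click the tabs on a web page, as shown below. Unfortunately, only some of the tabs get clicked, even though the XPath looks correct when I inspect it in Chrome. I can only assume that not all tabs are clicked because the full XPath is not being used.
However, I have tried changing the XPath from: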
//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"]
to:
//div[@class='KambiBC-event-groups-list']//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"]
for:
clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH,'(//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"])[%s]' % str(index + 1))))
However, the problem persists. I have also tried using CSS:
#KambiBC-contentWrapper__bottom > div > div > div > div > div.KambiBC-quick-browse-container.KambiBC-quick-browse-container--list-only-mode > div.KambiBC-quick-browse__list.KambiBC-delay-scroll--disabled > div > div.KambiBC-time-ordered-list-container > div.KambiBC-time-ordered-list-content > div > div > div.KambiBC-collapsible-container.KambiBC-mod-event-group-container > header
However, this keeps giving me errors, for:
clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR,'("#KambiBC-contentWrapper__bottom > div > div > div > div > div.KambiBC-quick-browse-container.KambiBC-quick-browse-container--list-only-mode > div.KambiBC-quick-browse__list.KambiBC-delay-scroll > div > div.KambiBC-time-ordered-list-container > div.KambiBC-time-ordered-list-content > div > div > div > header")[%s]' % str(index + 1))))
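Note that I want to click every tab that is not already open. I cannot seem to make a CSS selector specific enough to find those elements, because I do not think it lets me narrow down by class in this case.
Is there a way around this problem of not everything being clicked?
Note also that I am using the following to loop over the indexes: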
indexes = [index for index in range(len(options))]
shuffle(indexes)
for index in indexes:
Is there a more elegant way to do this with a single for loop?
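One possible single-loop form, as a sketch only: `random.sample` over a `range` returns every index exactly once in random order, so the separate list comprehension and `shuffle` call are not needed. The `shuffled_indexes` helper name and the stand-in `options` list are assumptions made for illustration; on the real page `options` would be the list of tab elements.

```python
import random


def shuffled_indexes(count):
    """Return every index in range(count) exactly once, in random order."""
    return random.sample(range(count), count)


# Stand-in for the list of tab elements found on the page (hypothetical data).
options = ["tab-a", "tab-b", "tab-c", "tab-d"]

visited = []
for index in shuffled_indexes(len(options)):
    # XPath positions are 1-based, hence index + 1 as in the question's code.
    visited.append(index + 1)
```

Each position from 1 to `len(options)` ends up in `visited` once; only the order varies between runs.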
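Answer 0 (score: 4)
This loops through every match in each league, one by one, collecting all the relevant data as it goes. You can collect more data for each match by prefixing each query with a dot (.) and running it relative to the match, via match.find_element_by_xpath('.//your-query-here'). Let me know if this works!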
import csv
import os
import time

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

# Note: the find_element(s)_by_xpath helpers used below were removed in
# Selenium 4.3; on newer versions use find_element(s)(By.XPATH, ...) instead.

driver = webdriver.Chrome()
driver.set_window_size(1024, 600)
driver.maximize_window()

# Start from a clean output file.
try:
    os.remove('vtg121.csv')
except OSError:
    pass

driver.get('https://www.unibet.com.au/betting#filter/football')
time.sleep(1)

# Wait until at least one league container is clickable.
clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH,
    ('//div[@class="KambiBC-collapsible-container '
     'KambiBC-mod-event-group-container"]'))))
time.sleep(0)

# League containers that are already expanded, and those that are not.
xp_opened = '//div[contains(@class, "KambiBC-expanded")]'
xp_unopened = '//div[@class="KambiBC-collapsible-container ' \
              'KambiBC-mod-event-group-container" ' \
              'and not(contains(@class, "KambiBC-expanded"))]'

# Queries relative to a league or a match. They are defined once, before both
# loops, so the second loop still works when no league is open at the start.
xp_matches = './/li[contains(@class,"KambiBC-event-item")]'
xp_ln = './/span[@class="KambiBC-mod-event-group-header__main-title"]'
xp_team1_name = './/button[@class="KambiBC-mod-outcome"][1]//' \
                'span[@class="KambiBC-mod-outcome__label"]'
xp_team1_odds = './/button[@class="KambiBC-mod-outcome"][1]//' \
                'span[@class="KambiBC-mod-outcome__odds"]'
xp_team2_name = './/button[@class="KambiBC-mod-outcome"][3]//' \
                'span[@class="KambiBC-mod-outcome__label"]'
xp_team2_odds = './/button[@class="KambiBC-mod-outcome"][3]//' \
                'span[@class="KambiBC-mod-outcome__odds"]'

opened = driver.find_elements_by_xpath(xp_opened)
unopened = driver.find_elements_by_xpath(xp_unopened)

data = []

# Leagues that are already open: read them directly.
for league in opened:
    matches = league.find_elements_by_xpath(xp_matches)
    try:
        # League name
        ln = league.find_element_by_xpath(xp_ln).text.strip()
    except:
        ln = None
    print(ln)
    for match in matches:
        # Get all the data per 'match group'
        try:
            team1_name = match.find_element_by_xpath(xp_team1_name).text
        except:
            team1_name = None
        try:
            team1_odds = match.find_element_by_xpath(xp_team1_odds).text
        except:
            team1_odds = None
        try:
            team2_name = match.find_element_by_xpath(xp_team2_name).text
        except:
            team2_name = None
        try:
            team2_odds = match.find_element_by_xpath(xp_team2_odds).text
        except:
            team2_odds = None
        data.append([ln, team1_name, team1_odds, team2_name, team2_odds])

# Leagues that are still closed: click to expand, then read them the same way.
for league in unopened:
    league.click()
    time.sleep(0.5)
    matches = league.find_elements_by_xpath(xp_matches)
    try:
        ln = league.find_element_by_xpath(xp_ln).text.strip()
    except:
        ln = None
    print(ln)
    for match in matches:
        try:
            team1_name = match.find_element_by_xpath(xp_team1_name).text
        except:
            team1_name = None
        try:
            team1_odds = match.find_element_by_xpath(xp_team1_odds).text
        except:
            team1_odds = None
        try:
            team2_name = match.find_element_by_xpath(xp_team2_name).text
        except:
            team2_name = None
        try:
            team2_odds = match.find_element_by_xpath(xp_team2_odds).text
        except:
            team2_odds = None
        data.append([ln, team1_name, team1_odds, team2_name, team2_odds])

# Write one CSV row per match.
with open('vtg121.csv', 'a', newline='', encoding="utf-8") as outfile:
    writer = csv.writer(outfile)
    for row in data:
        writer.writerow(row)
        print(row)
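The repeated try/except blocks in the code above could be folded into one small helper. The following is only a sketch under stated assumptions: `text_or_none`, `match_row`, `XP_OUTCOME` and `FakeElement` are hypothetical names that do not appear in the answer, and `FakeElement` merely stands in for a Selenium WebElement so the example runs without a browser. The string "xpath" is the value of Selenium's `By.XPATH`.

```python
# Template for the four per-match queries used in the answer above.
XP_OUTCOME = ('.//button[@class="KambiBC-mod-outcome"][{pos}]//'
              'span[@class="KambiBC-mod-outcome__{field}"]')


def text_or_none(element, xpath):
    """Return the text of the first node under `element` matching `xpath`, or None."""
    try:
        return element.find_element("xpath", xpath).text
    except Exception:  # deliberately broad, mirroring the bare `except` above
        return None


def match_row(league_name, match):
    """Build one CSV row: league, team 1 name and odds, team 2 name and odds."""
    row = [league_name]
    for pos in (1, 3):  # outcome buttons 1 and 3 hold the two teams
        for field in ("label", "odds"):
            row.append(text_or_none(match, XP_OUTCOME.format(pos=pos, field=field)))
    return row


class FakeElement:
    """Stand-in for a Selenium WebElement (an assumption for this demo only)."""

    def __init__(self, text="", children=None):
        self.text = text
        self.children = children or {}

    def find_element(self, by, value):
        return self.children[value]  # raises KeyError when nothing matches


# Demo match with the team 2 odds missing, to show the None fallback.
demo_match = FakeElement(children={
    XP_OUTCOME.format(pos=1, field="label"): FakeElement("Team A"),
    XP_OUTCOME.format(pos=1, field="odds"): FakeElement("1.50"),
    XP_OUTCOME.format(pos=3, field="label"): FakeElement("Team B"),
})
row = match_row("Demo League", demo_match)
```

With a real driver, both loops could then call `data.append(match_row(ln, match))` for each match instead of repeating the four try/except blocks.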
Answer 1 (score: 1)
OP's code without extra imports
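The error occurs because the XPath positions of the tabs on the site are not contiguous; there are gaps. For example, right now I cannot find
//*[@id="KambiBC-contentWrapper__bottom"]/div/div/div/div/div[3]/div[1]/div/div[3]/div[2]/div/div/div[2]/header
A while ago, before a game went live, I could not find
//*[@id="KambiBC-contentWrapper__bottom"]/div/div/div/div/div[3]/div[1]/div/div[3]/div[2]/div/div/div[1]/header
When I talk about the index, I mean the number in the last div[...] step of the paths above (2 in the first path, 1 in the second).
When the game went live, the tab suddenly changed its index from 2 to 1. (Only that last number changes.) In both cases there is a gap: either 1 cannot be found or 2 cannot be found.
My guess is that the gap exists because another, non-clickable element sits in between. See the image below.
[image: league]
The League button is what causes the gap. Whenever the code reaches the index occupied by League, it times out. Because the League button and the other tabs swap the positions of League and the live games, the index changes whenever the positions change. (I think this is why I first could not find the XPath with index 1, and later could not find the one with index 2.)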
Below is part of the OP's code. At the end you can see str(index + 1).
indexes = [index for index in range(len(options))]
shuffle(indexes)  # the OP uses shuffle from random; indexes 0 and 1 are still included
path = '(//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"])'
for index in indexes:
    # Some indexes are missing because of the League button, so nothing
    # can be found at such an index and the wait times out.
    clickMe = wait(driver, 10).until(
        EC.element_to_be_clickable((By.XPATH, path + '[%s]' % str(index + 1))))
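Try catching the timeout exception to skip the index occupied by League. You can also keep a counter so that only one timeout exception is tolerated per page. If a second timeout occurs, you know that something other than the League button has gone wrong, and you should stop.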
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time

driver = webdriver.Firefox()
driver.set_window_size(1024, 600)
driver.maximize_window()
wait = WebDriverWait

driver.get('https://www.unibet.com.au/betting#filter/football')
time.sleep(5)

options = driver.find_elements_by_xpath("""//*[@id="KambiBC-contentWrapper__bottom"]/div/div/div/div/div[3]/div[1]/div/div[3]/div[2]/div/div/div""")
print("Total tabs that we want to open is {}".format(len(options)))

indexes = [index for index in range(len(options))]
for index in indexes:
    print(index)
    try:
        clickMe = wait(driver, 5).until(EC.presence_of_element_located((By.XPATH,
            """//*[@id="KambiBC-contentWrapper__bottom"]/div/div/div/div/div[3]/div[1]/div/div[3]/div[2]/div/div/div[{}]/header""".format(str(index+1)))))
        clickMe.click()
    except TimeoutException as ex:
        print("catch you! {}".format(index))
        pass
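The code above catches every timeout; the counter idea mentioned earlier is not shown. A minimal sketch of it follows, with the browser interaction abstracted away so it runs on its own. The names `click_tabs`, `TooManyTimeouts`, `FakeTimeout` and `fake_click` are all hypothetical, and the default of one tolerated timeout is simply the answer's suggestion, not a verified property of the page.

```python
class TooManyTimeouts(RuntimeError):
    """Raised when more positions time out than the expected gap allows."""


def click_tabs(count, click_tab, timeout_error, max_timeouts=1):
    """Call click_tab(position) for positions 1..count.

    Up to `max_timeouts` timeouts are skipped (the gap left by the League
    button); one more than that raises TooManyTimeouts.
    Returns the list of skipped positions.
    """
    skipped = []
    for position in range(1, count + 1):
        try:
            click_tab(position)
        except timeout_error:
            skipped.append(position)
            if len(skipped) > max_timeouts:
                raise TooManyTimeouts(
                    "timed out at positions {}".format(skipped))
    return skipped


class FakeTimeout(Exception):
    """Stand-in for selenium's TimeoutException (demo only)."""


clicked = []


def fake_click(position):
    # Pretend position 2 is occupied by the non-clickable League element.
    if position == 2:
        raise FakeTimeout(position)
    clicked.append(position)


skipped = click_tabs(4, fake_click, FakeTimeout)
```

With Selenium, `timeout_error` would be `TimeoutException` and `click_tab` would wrap the `wait(driver, 5).until(...)` call and the `click()` from the code above.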