Loop through and click the anchor tags in a table, but using a CSS selector (Python, Selenium)

Asked: 2018-01-31 05:21:27

Tags: python selenium web-scraping css-selectors

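Is there a way to click only specific anchor tags, selecting them by their href BUT also by a specific CSS selector/text/tag/etc.?

This is the page I am working with: http://www.oddsportal.com/soccer/germany/bundesliga/

Note: the table is table#tournamentTable.table-main, the one in the middle of the page.

There is a table, and I want to click every link that is displayed in it. (I already know there is something called "is_displayed", but I haven't got as far as that problem yet, so it is not the main issue.)

The problem is that I haven't figured out how to click only the links I want, i.e. the ones that share one particular CSS selector.

Here is my code so far (I point out where I am struggling):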

#Load Modules
from selenium import webdriver
from time import sleep
from random import randint

#Set profile
profile = webdriver.FirefoxProfile()
profile.set_preference("Set Profile")

#Open driver
driver = webdriver.Firefox(profile)
driver.get("http://www.oddsportal.com/soccer/germany/bundesliga/")


#Find the elements (FIRST IDEA)
list_links=driver.find_elements_by_css_selector(".name.table-participant")
"""
These ones are not clickable BUT this is the list that I want.
"""
#Print
for i in list_links:
    print(i.text)


#(SECOND IDEA)
list_links=driver.find_elements_by_partial_link_text('-')
"""
Pretty close I guess, but there are still some 
elements that I don't want
"""
#Print
for i in list_links:
    print(i.text)


#(THIRD IDEA(S)) None of them works
list_links=driver.find_elements_by_xpath("//a[contains(text(),'/bundesliga/')]")
list_links=driver.find_element_by_xpath('//a[@href=soccer/germany/bundesliga/]')
list_links=driver.find_elements_by_css_selector(".name.table-participant[class='href']")
"""
I guess I'm pretty close but I can't get the click!
"""
#Print
for i in list_links:
    print(i.text)


#End

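By the way, I came across this: selenium click on anchor tag inside table td

I know it has something to do with XPath, but how do I combine an XPath that includes a specific CSS selector with clicking the anchor tag?

Any ideas?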

1 Answer:

Answer 0: (score: 1)

You can use the following line to get all the team pairs:

 matches = [pair.text for pair in driver.find_elements_by_css_selector("td>a[href^='/soccer/germany/bundesliga/']") if pair.text]

Output:

['FC Koln - Dortmund', 'Freiburg - Bayer Leverkusen', 'Hertha Berlin - Hoffenheim',
 'Mainz - Bayern Munich', 'Schalke - Werder Bremen', 'Wolfsburg - Stuttgart',
 'B. Monchengladbach - RB Leipzig', 'Augsburg - Eintracht Frankfurt',
 'Hamburger SV - Hannover', 'RB Leipzig - Augsburg', 'Bayer Leverkusen - Hertha Berlin',
 'Dortmund - Hamburger SV', 'Eintracht Frankfurt - FC Koln', 'Hannover - Freiburg',
 'Hoffenheim - Mainz', 'Bayern Munich - Schalke', 'Stuttgart - B. Monchengladbach',
 'Werder Bremen - Wolfsburg']
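
Since the question asks about XPath, the same selection can probably be written as an XPath expression as well. The sketch below is an assumption rather than something tested against the live page: it scopes the search to the table id given in the question (tournamentTable) and uses starts-with() where the CSS selector uses ^=. The helper names anchors_xpath and find_match_anchors are made up for illustration.

```python
def anchors_xpath(table_id, href_prefix):
    """Build an XPath matching <a> children of <td> cells inside one table.

    Only anchors whose href attribute starts with href_prefix are matched.
    """
    return "//table[@id='{0}']//td/a[starts-with(@href, '{1}')]".format(
        table_id, href_prefix
    )


# Table id and href prefix are taken from the question and the answer above.
MATCH_XPATH = anchors_xpath("tournamentTable", "/soccer/germany/bundesliga/")


def find_match_anchors(driver, xpath=MATCH_XPATH):
    """Return the anchor elements matching the XPath.

    "xpath" is the string value of selenium's By.XPATH, so no selenium
    import is needed inside this helper.
    """
    return driver.find_elements("xpath", xpath)
```

With a live driver that has already loaded the page, find_match_anchors(driver) should return the same kind of element list as the CSS selector, so the list comprehension above can be applied to it unchanged.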

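There are also some hidden links in the table. The text property returns an empty string for hidden elements, which is why the line above filters on pair.text. If you want those links as well, read textContent instead: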

matches = [pair.get_attribute('textContent') for pair in driver.find_elements_by_css_selector("td>a[href^='/soccer/germany/bundesliga/']")]
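
The lines above only read the link text, while the question asks how to click each link. One possible approach, sketched here under assumptions, is to collect the href of every matching anchor first and then open each URL with driver.get(), because element references become stale once the browser navigates away from the listing page. The names collect_match_urls, visit_all, BASE_URL and MATCH_SELECTOR are hypothetical; the generic find_elements(by, value) call is used because it exists in both older and newer Selenium releases.

```python
from urllib.parse import urljoin

BASE_URL = "http://www.oddsportal.com/soccer/germany/bundesliga/"
MATCH_SELECTOR = "td>a[href^='/soccer/germany/bundesliga/']"


def collect_match_urls(driver, selector=MATCH_SELECTOR, base_url=BASE_URL,
                       include_hidden=False):
    """Return de-duplicated absolute URLs of the anchors matching selector."""
    urls = []
    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    for anchor in driver.find_elements("css selector", selector):
        href = anchor.get_attribute("href")
        if not href:
            continue
        # Skip hidden anchors unless the caller asks for them.
        if not include_hidden and not anchor.is_displayed():
            continue
        # urljoin resolves relative hrefs and leaves absolute ones unchanged.
        url = urljoin(base_url, href)
        if url not in urls:
            urls.append(url)
    return urls


def visit_all(driver, urls):
    """Open each URL in turn instead of clicking and navigating back."""
    for url in urls:
        driver.get(url)
        # Scrape the match page here before moving on to the next URL.
```

A typical run would load BASE_URL with driver.get(), call collect_match_urls(driver) once, and pass the result to visit_all(driver, urls). Collecting the URLs before navigating means no element has to be located again after each page load.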