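I am trying to scrape the URLs of the individual players from this website.
I already tried doing this with bs4, but every time I tried to find the table it returned only []. I then switched to lxml.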
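# In Python 3, urlopen lives in urllib.request
from urllib.request import urlopen
from lxml import etree

url = "https://www.espn.com/soccer/team/squad/_/id/359/arsenal"
# Parse the raw HTML returned by the server
tree = etree.HTML(urlopen(url).read())
# Long absolute XPath down to the span in the first table cell
table = tree.xpath('//*[@id="fittPageContainer"]/div[2]/div[5]/div[1]/div/article/div/section/div[5]/section/table/tbody/tr/td[1]/div/table/tbody/tr[1]/td/span')
print(table)  # prints []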
I expected some kind of output that I could use to get the links, but the code just returns empty square brackets.
Answer 0: (score: 0)
I think this is what you want.
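from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.service import Service

# Selenium 4 takes the geckodriver path through a Service object
# (older releases accepted executable_path directly on webdriver.Firefox).
driver = webdriver.Firefox(service=Service(executable_path=r'C:\files\geckodriver.exe'))
driver.set_page_load_timeout(30)
driver.get("https://www.espn.com/soccer/team/squad/_/id/359/arsenal")

# Collect every anchor that has an href attribute and print its target
elems = driver.find_elements(By.XPATH, "//a[@href]")
for elem in elems:
    print(elem.get_attribute("href"))

driver.quit()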
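The loop above prints every link on the page, not only the player ones. A minimal sketch of narrowing the list down could look like the following. It assumes that player pages live under a path starting with /soccer/player/, which is a guess about the site's URL layout and should be checked against the printed output. The helper name filter_player_links is hypothetical.

```python
from urllib.parse import urlparse

# Assumption: player profile URLs have a path that starts with this prefix.
# Adjust it after inspecting the hrefs that the Selenium loop prints.
PLAYER_PATH_PREFIX = "/soccer/player/"


def filter_player_links(hrefs, prefix=PLAYER_PATH_PREFIX):
    """Return the unique hrefs whose URL path starts with prefix, in first-seen order."""
    seen = set()
    result = []
    for href in hrefs:
        if not href:  # get_attribute("href") can return None
            continue
        if urlparse(href).path.startswith(prefix) and href not in seen:
            seen.add(href)
            result.append(href)
    return result
```

With the elems list from the answer, it might be used as player_links = filter_player_links(e.get_attribute("href") for e in elems). Comparing on the parsed path rather than the whole string keeps the check independent of the scheme and host.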