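I've written a script in Python with Selenium to parse certain fields from a table on a webpage. The fields I'm interested in are under the headers Home and Handicap. I can get the content under Home, but I can't get the content under Handicap. How can I get it?
This is my attempt so far: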
import time
from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
driver.get("http://info.nowgoal.com/en/League/2018-2019/36.html")
time.sleep(3)  # intentional delay to let the webpage load its content
soup = BeautifulSoup(driver.page_source, "lxml")
for items in soup.select('table#Table3 tr'):
    name = items.find_all("td")[2].text
    # stat = items.find_all("td")[5].text  # this is not working
    print(name)
driver.quit()
Answer 0 (score: 2)
The first two rows are just headers. To get the values, you need to iterate over all rows except the first two:
for items in soup.select('table#Table3 tr')[2:]:  # skip the two header rows
    cells = items.find_all("td")
    name = cells[2].text
    stat_ft = cells[5].text
    stat_ht = cells[6].text
    print(name, stat_ft, stat_ht)
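
If you'd rather not hard-code the number of header rows, one alternative is to skip any row that has too few cells. Here is a minimal sketch of that idea, assuming each row has already been reduced to a list of cell strings (for example with [td.get_text(strip=True) for td in tr.find_all("td")]). The function name extract_fields and the sample rows are made up for illustration; only the column positions 2, 5 and 6 come from the code above.

```python
# Column positions taken from the answer above; everything else is illustrative.
HOME_IDX, FT_IDX, HT_IDX = 2, 5, 6


def extract_fields(rows):
    """Yield (home, handicap_ft, handicap_ht) for rows with enough cells."""
    for cells in rows:
        if len(cells) <= HT_IDX:
            continue  # header or otherwise short row: nothing to read at index 6
        yield cells[HOME_IDX], cells[FT_IDX], cells[HT_IDX]


# Invented sample data standing in for the parsed table rows.
sample_rows = [
    ["Round", "Time", "Home", "Score", "Away", "Handicap"],  # header row
    ["FT", "HT"],                                            # sub-header row
    ["1", "08-10", "Team A", "2-1", "Team B", "0.5", "0.25"],
    ["1", "08-11", "Team C", "0-0", "Team D", "-1", "-0.5"],
]

results = list(extract_fields(sample_rows))
for home, stat_ft, stat_ht in results:
    print(home, stat_ft, stat_ht)
```

The length check also protects against an IndexError if the page later adds or removes a header row, at the cost of silently skipping any data row that happens to be short.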