I wrote a script to parse the data from the first table on a website, using XPath to parse the table. By the way, I didn't use the "tr" tag; even without it I can still see the results in the console when printing. When I run my script, the data is scraped but gets written to the csv file as a single line. I can't figure out what mistake I'm making. Any input on this would be highly appreciated. Here is what I tried:
import csv
import requests
from lxml import html
url="https://fantasy.premierleague.com/player-list/"
response = requests.get(url).text
outfile=open('Data_tab.csv','w', newline='')
writer=csv.writer(outfile)
writer.writerow(["Player","Team","Points","Cost"])
tree = html.fromstring(response)
for titles in tree.xpath("//table[@class='ism-table']")[0]:
    # tab_r = titles.xpath('.//tr/text()')
    tab_d = titles.xpath('.//td/text()')
    writer.writerow(tab_d)
Answer (score: 1)
You probably want to add another loop level so that each table row is handled in turn. Iterating over the table element itself only yields its direct children (such as thead and tbody), so your .//td expression gathers every cell's text into one list, and writerow then emits all of it as a single CSV row.
Try this:
for titles in tree.xpath("//table[@class='ism-table']")[0]:
    for row in titles.xpath('./tr'):
        tab_d = row.xpath('./td/text()')
        writer.writerow(tab_d)
Or perhaps this:
table = tree.xpath("//table[@class='ism-table']")[0]
for row in table.xpath('.//tr'):
    items = row.xpath('./td/text()')
    writer.writerow(items)
Or you could let the first XPath expression find the rows for you:
rows = tree.xpath("(.//table[@class='ism-table'])[1]//tr")
for row in rows:
    items = row.xpath('./td/text()')
    writer.writerow(items)