I am trying to save the table on this website: https://www.valuewalk.com/2019/01/top-10-most-obese-countries-oecd-who/
It does print, but it is not saved as a CSV. Can anyone offer some suggestions?
import csv
import requests
from bs4 import BeautifulSoup
#Request webpage content
result = requests.get('https://www.valuewalk.com/2019/01/top-10-most-obese-countries-oecd-who/')
#Save content in var
src = result.content
#Parse the content with BeautifulSoup
soup = BeautifulSoup(src,'lxml')
#look for table
tbl = soup.findAll('ol')
tbl2 = tbl[1]
#Get text out of table
tbltxt = tbl2.get_text()
#Open CSV
file = open('obesecountries.csv','w')
writer = csv.writer(file)
#Put data into csv
for row in tbltxt:
    writer.writerow(row)
I found the HTML table I want to extract and stripped the HTML tags. It prints, but it does not save/write in CSV format.
Answer 0: (score: 2)
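tbltxt is a string, not a list, so the loop iterates over its individual characters. You should iterate over the <li> elements instead. The argument to writerow() should be a list, not a string.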
for li in tbl2.findAll('li'):
    rowtext = li.get_text()
    writer.writerow([rowtext])
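To make the string-versus-list point concrete, here is a minimal sketch that needs no network access. The sample text is a made-up placeholder standing in for whatever tbl2.get_text() returns, and io.StringIO is used in place of a real file only so the result can be inspected directly.

```python
import csv
import io

# Hypothetical stand-in for the text returned by tbl2.get_text()
tbltxt = "Alpha\nBeta Land\nGamma"

# Iterating over a string yields one character at a time,
# so every character lands on its own CSV row.
wrong = io.StringIO()
writer = csv.writer(wrong)
for row in tbltxt:
    writer.writerow(row)

# Wrapping each entry in a one-element list writes it as a single field.
right = io.StringIO()
writer = csv.writer(right)
for name in tbltxt.splitlines():
    writer.writerow([name])

wrong_rows = wrong.getvalue().splitlines()
right_rows = right.getvalue().splitlines()
print(wrong_rows[:5])
print(right_rows)
```

The first loop mirrors the loop in the question, and the second mirrors the fix of passing a list to writerow().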
Answer 1: (score: 1)
#Open CSV (the with block closes the file when done)
with open('obesecountries.csv','w',newline='') as file:
    writer = csv.writer(file)
    #look for table
    tbl = soup.findAll('ol')
    #Put data into csv
    for li in tbl[1].findAll('li'):
        # get the text from each item in the second list
        txt = [li.get_text()]
        #Write the text as a one-column row
        writer.writerow(txt)
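Putting the pieces together, the following is a rough end-to-end sketch rather than a definitive solution. It assumes the data lives in the second <ol> on the page, as in the question. The inline HTML is invented sample markup so the example runs offline, and the built-in 'html.parser' is swapped in for 'lxml' to avoid an extra dependency. The output goes to a temporary directory, which is also an illustrative choice.

```python
import csv
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page structure may differ.
html = """
<ol><li>skip me</li></ol>
<ol><li>Alpha</li><li>Beta Land</li><li>Gamma</li></ol>
"""

soup = BeautifulSoup(html, "html.parser")
second_list = soup.find_all("ol")[1]

path = os.path.join(tempfile.mkdtemp(), "obesecountries.csv")

# The with block closes (and therefore flushes) the file on exit.
# newline="" is what the csv module documentation recommends for
# file objects handed to csv.writer.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    for li in second_list.find_all("li"):
        # One list item per row, wrapped in a list so it is one field.
        writer.writerow([li.get_text(strip=True)])

# Read the file back to confirm what was written.
with open(path, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))
print(rows)
```

Using a with block matters here because the snippets above never close the file explicitly, and buffered rows are only guaranteed to reach disk once the file is flushed or closed.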