My question is very simple, but as a Python beginner I still can't find an answer.
I am using the following code to extract some data from the web:
from bs4 import BeautifulSoup
import urllib2
teams = ("http://walterfootball.com/fantasycheatsheet/2015/traditional")
page = urllib2.urlopen(teams)
soup = BeautifulSoup(page, "html.parser")
f = open('output.txt', 'w')
nfl = soup.findAll('li', "player")
lines = [span.get_text(strip=True) for span in nfl]
lines = str(lines)
f.write(lines)
f.close()
But the output is quite messy.
Is there an elegant way to get a result like this?
1. Eddie Lacy, RB, Green Bay Packers. Bye: 7 $60
2. LeVeon Bell, RB, Pittsburgh Steelers. Bye: 11 $60
3. Marshawn Lynch, RB, Seattle Seahawks. Bye: 9 $59
...
Answer 0 (score: 1)
Just use str.join on the list, and .rstrip("+") to strip off the trailing +:
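nfl = soup.findAll('li', "player")
# Build a list rather than a generator so that lines can be iterated
# again later when writing the file.
lines = ["{}. {}\n".format(ind, span.get_text(strip=True).rstrip("+"))
         for ind, span in enumerate(nfl, 1)]
print("".join(lines))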
Which will give you:
1. Eddie Lacy, RB, Green Bay Packers. Bye: 7$60
2. LeVeon Bell, RB, Pittsburgh Steelers. Bye: 11$60
3. Marshawn Lynch, RB, Seattle Seahawks. Bye: 9$59
4. Adrian Peterson, RB, Minnesota Vikings. Bye: 5$59
5. Jamaal Charles, RB, Kansas City Chiefs. Bye: 9$54
..................
To separate the price we can split, or use re.sub to add a space before the dollar sign, and then write each line:
import re
with open('output.txt', 'w') as f:
    for line in lines:
        # Insert a space before the trailing "$<digits>" price.
        line = re.sub(r"(\$\d+)$", r" \1", line, count=1)
        f.write(line)
Now the output is:
1. Eddie Lacy, RB, Green Bay Packers. Bye: 7 $60
2. LeVeon Bell, RB, Pittsburgh Steelers. Bye: 11 $60
3. Marshawn Lynch, RB, Seattle Seahawks. Bye: 9 $59
4. Adrian Peterson, RB, Minnesota Vikings. Bye: 5 $59
5. Jamaal Charles, RB, Kansas City Chiefs. Bye: 9 $54
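To see the substitution in isolation, here is a minimal sketch that runs on hard-coded sample strings instead of the scraped page. The helper name add_space_before_price is made up for illustration, and the sample lines are copied from the output shown earlier.

```python
import re

def add_space_before_price(line):
    # Hypothetical helper: insert one space before a trailing "$<digits>".
    # "$" in the pattern also matches just before a final newline,
    # so lines that end in "\n" are handled too.
    return re.sub(r"(\$\d+)$", r" \1", line, count=1)

samples = [
    "1. Eddie Lacy, RB, Green Bay Packers. Bye: 7$60\n",
    "2. LeVeon Bell, RB, Pittsburgh Steelers. Bye: 11$60\n",
]
fixed = [add_space_before_price(s) for s in samples]
print("".join(fixed), end="")
```

A line without a trailing price is returned unchanged, since re.sub leaves the string alone when the pattern does not match.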
You can also use str.rsplit to split once on $ and rejoin with a space:
with open('output.txt', 'w') as f:
    for line in lines:
        line, p = line.rsplit("$", 1)
        f.write("{} ${}".format(line, p))
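The rsplit variant can be tried the same way on a single sample string. This is only a sketch: split_price is a hypothetical helper name, and the input is one of the lines from the earlier output.

```python
def split_price(line):
    # Hypothetical helper: split once at the last "$" and rejoin with
    # a space in front of the price.
    head, price = line.rsplit("$", 1)
    return "{} ${}".format(head, price)

result = split_price("3. Marshawn Lynch, RB, Seattle Seahawks. Bye: 9$59\n")
print(result, end="")
```

Unlike the re.sub version, this raises ValueError on a line that contains no $ at all, because the tuple unpacking then receives a single element. If some entries might lack a price, the regex approach is the more forgiving of the two.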
Answer 1 (score: 0)
Iterate over the list lines and write each line:
for num, line in enumerate(lines, 1):
    f.write('{}. {}\n'.format(num, line))
enumerate is used to get the (num, line) pairs.
Use a with statement instead of closing the file object manually:
with open('output.txt', 'w') as f:
    for num, line in enumerate(lines, 1):
        f.write('{}. {}\n'.format(num, line))
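As a rough end-to-end check of this approach without hitting the network, the sketch below writes two hard-coded entries to a temporary file and reads them back. The players list and the temporary path are assumptions made for the example. In the real script, lines must still be the list of strings from the question, before it is passed through str().

```python
import os
import tempfile

# Sample entries standing in for the scraped text (assumed data).
players = [
    "Eddie Lacy, RB, Green Bay Packers. Bye: 7 $60",
    "LeVeon Bell, RB, Pittsburgh Steelers. Bye: 11 $60",
]

# Write to a throwaway directory so the example leaves no output.txt behind.
path = os.path.join(tempfile.mkdtemp(), "output.txt")
with open(path, "w") as f:
    # enumerate(..., 1) starts the numbering at 1 instead of 0.
    for num, line in enumerate(players, 1):
        f.write("{}. {}\n".format(num, line))

# The file is closed automatically when the with block ends.
with open(path) as f:
    written = f.read()
print(written, end="")
```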