I am trying to get this script:
from BeautifulSoup import BeautifulSoup
import sys, re, urllib2
import codecs
html_str = urllib2.urlopen(URL).read()
soup = BeautifulSoup(html_str)
for row in soup.findAll("tr"):
    # Write every cell (td or th) of the row, each followed by '|'
    for col in row.findAll(re.compile("td|th")):
        sys.stdout.write((col.string if col.string else '') + '|')
    print  # Newline
to send its output to a text file.
Answer 0 (score: 4)
The simplest way? (If you are on *nix):
python file.py > filename.txt
Code-wise:
from BeautifulSoup import BeautifulSoup
import sys, re, urllib2
import codecs
html_str = urllib2.urlopen(URL).read()
soup = BeautifulSoup(html_str)
file = open('file.txt', 'w')
for row in soup.findAll("tr"):
    for col in row.findAll(re.compile("td|th")):
        file.write((col.string if col.string else '') + '|')
    file.write('\n')  # Newline after each row, matching the print in the question
file.close()
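For what it's worth, the file-writing part could also be sketched in Python 3 with a `with` block, so the file is closed even if a write fails. This is only a sketch under assumptions: `write_rows`, `sample_rows` and `demo_path` are hypothetical names that are not part of the script above, and the rows are assumed to be already extracted into lists of strings (with `None` for an empty cell), so the HTML parsing step is left out.

```python
import os
import tempfile


def write_rows(rows, path, sep="|"):
    """Write each row as one line, with `sep` after every cell."""
    # The `with` block closes the file even if a write raises.
    with open(path, "w", encoding="utf-8") as out:
        for row in rows:
            for cell in row:
                # Treat a missing cell (None) as an empty string,
                # like `col.string if col.string else ''` above.
                out.write((cell if cell else "") + sep)
            out.write("\n")  # one line per table row


# Hypothetical sample data standing in for the scraped table cells.
sample_rows = [["Name", "Age"], ["Alice", None]]
demo_path = os.path.join(tempfile.mkdtemp(), "file.txt")
write_rows(sample_rows, demo_path)
```

Passing an explicit `encoding` is a design choice here: scraped pages often contain non-ASCII text, and relying on the platform default encoding can fail on some systems.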