Question

from HTMLParser import HTMLParser

from urllib import urlopen

class Spider(HTMLParser):

        def __init__(self, url):
                HTMLParser.__init__(self)
                req = urlopen(url)
                self.feed(req.read())

        def handle_starttag(self, tag, attrs):
                if tag == 'a' and attrs:
                        print "Found link => %s" % attrs[0][1]

Spider('http://stackoverflow.com/questions/tagged/python')

Answer 1

python spider.py > output.html

Answer 2

Put this at the top of your script:

import sys
sys.stdout = open('output.html', 'w')

This redirects everything the script writes to standard output (including print statements) to the file 'output.html'.
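
A hedged sketch of the same idea in Python 3 (the question's code is Python 2): contextlib.redirect_stdout swaps sys.stdout only inside a with block, so the redirect ends when the block does. The report_links helper and the in-memory io.StringIO buffer are assumptions made for illustration; for a real run you would pass a file opened with open('output.html', 'w') instead of the buffer.

```python
import contextlib
import io


def report_links(links):
    """Print one line per link, the way the Spider in the question does."""
    for link in links:
        print("Found link => %s" % link)


# Hypothetical stand-in for a real output file such as open('output.html', 'w').
buffer = io.StringIO()

# Everything printed inside this block goes to `buffer` instead of the console.
with contextlib.redirect_stdout(buffer):
    report_links(["/questions/1", "/questions/2"])

captured = buffer.getvalue()
```

Compared with assigning to sys.stdout directly, the with block restores the original stream automatically, which helps if the script prints anything afterwards that should still reach the console.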

Answer 3

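I haven't used Spider at all, but is it printing HTML, or only the "Found link..." lines? If it only prints those lines, you could do something like

outfl = open('output.txt', 'w')

Then, instead of print, call outfl.write("Found link => %s\n" % attrs[0][1]).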

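If you need the output as HTML, you can always write out <html><head></head><body> first and </body></html> at the end. Also, use outfl = open('output.html', 'w') so the file name ends in .html instead of .txt.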
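
Here is a minimal sketch of that approach, assuming Python 3 (where the parser lives in html.parser rather than HTMLParser). The names LinkCollector, write_links_page and the sample markup are hypothetical, and the page is fed from a string rather than fetched with urlopen so the example runs offline. It also looks up href by name instead of using attrs[0][1], which is a deliberate change from the question's code, since href is not always the first attribute.

```python
import html
import os
import tempfile
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag instead of printing it."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs; look up href by name.
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def write_links_page(links, path):
    """Write the links as a small HTML document."""
    with open(path, "w", encoding="utf-8") as outfl:
        outfl.write("<html><head></head><body>\n")
        for link in links:
            safe = html.escape(link, quote=True)
            outfl.write('<p>Found link =&gt; <a href="%s">%s</a></p>\n' % (safe, safe))
        outfl.write("</body></html>\n")


# Hypothetical sample markup standing in for the downloaded page.
sample = '<a href="/questions/1">one</a><a name="x">no href</a><a href="/tags/python">two</a>'
collector = LinkCollector()
collector.feed(sample)

# Write to a temporary directory so the sketch does not clobber real files.
out_path = os.path.join(tempfile.mkdtemp(), "output.html")
write_links_page(collector.links, out_path)
```

Opening the file in a with block closes it even if writing fails partway, so the output is flushed to disk without an explicit outfl.close().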

Am I completely missing the question here? If you want a better answer, you should describe the problem more clearly.

How do I write the output of this code to an HTML file?

3 Answers: