Question

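I have everything I need for the very simple parser below. I want to write a method that takes the URL of an HTML page (for example, http://www.dictionary.com/browse/example) as a parameter and uses this parser to show me all the data it encounters. I don't need anyone to hand me the solution, but suggestions would be much appreciated. Thanks.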

var hello = "hello<br>how\nare\nyou";
document.getElementById("hey").innerHTML = hello;

Answer 1

This is how I ended up extracting the data from a URL, in this case http://python.org/.

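from html.parser import HTMLParser
from urllib.request import urlopen

class MyHTMLParser(HTMLParser):
    # Called for each run of text found between tags.
    def handle_data(self, data):
        print("Encountered some data  :", data)

parser = MyHTMLParser()
# Download the page; the context manager closes the connection afterwards.
with urlopen('http://python.org/') as response:
    raw = response.read()
# feed() expects str, so decode the bytes first (UTF-8 assumed here).
parser.feed(raw.decode("utf-8"))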
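
Since the question asks for a method that takes the URL as a parameter, the same idea could be wrapped up roughly as follows. This is only a sketch, not the original poster's code: the names `DataCollector`, `extract_data_from_html`, and `extract_data_from_url` are made up for illustration, and skipping whitespace-only text and falling back to UTF-8 when the server declares no charset are assumptions of this example.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class DataCollector(HTMLParser):
    """Collects the text passed to handle_data instead of printing it."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Skip whitespace-only runs between tags (a choice made for this sketch).
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_data_from_html(html_text):
    """Feed an HTML string to the parser and return the collected text."""
    parser = DataCollector()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered by the parser
    return parser.chunks


def extract_data_from_url(url):
    """Download the page at `url` and return the text the parser encountered."""
    with urlopen(url) as response:
        # Prefer the charset from the Content-Type header; UTF-8 is a fallback guess.
        charset = response.headers.get_content_charset() or "utf-8"
        html_text = response.read().decode(charset, errors="replace")
    return extract_data_from_html(html_text)
```

Calling `extract_data_from_url("http://www.dictionary.com/browse/example")` and printing each item of the returned list would then show the data for the page from the question. Returning a list rather than printing inside `handle_data` keeps the parsing step testable without a network connection. Note that `html.parser` also passes the contents of `<script>` and `<style>` elements to `handle_data`, so those will appear in the output unless filtered separately.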

How do I use html.parser in Python to extract data from a specific HTML link?
