Can't use read() with html2text?

Time: 2015-12-27 00:01:01

Tags: python python-3.x urllib

I am writing a Python program that searches a web page for a word. However, when I try:

website = urllib.request.urlopen(url)
content = website.read()
website.close()
test = html2text.html2text(content)
print(test)

I get this error:

test = html2text.html2text(content)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/html2text/__init__.py", line 840, in html2text
return h.handle(html)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/html2text/__init__.py", line 129, in handle
self.feed(data)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/html2text/__init__.py", line 125, in feed
data = data.replace("</' + 'script>", "</ignore>")
TypeError: a bytes-like object is required, not 'str'

I'm new to Python, so I'm not sure how to handle this error. I'm on Python 3.5, Mac.

1 answer:

Answer 0: (score: 2)

read() returns bytes, while html2text expects a str. decode() the content using the charset sent in the Content-Type header (reference):

resource = urllib.request.urlopen(url)
content = resource.read()
charset = resource.headers.get_content_charset()
content = content.decode(charset)

Works for me (Python 3.5, Mac OS).
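
One caveat: get_content_charset() returns None when the server sends no charset, and decode(None) then raises a TypeError. A slightly more defensive sketch of the same idea is below. The helper names decode_body, fetch_text and page_to_text, the UTF-8 fallback, and errors="replace" are my own assumptions for illustration, not part of the snippet above, so adjust them to your pages.

```python
import urllib.request
from typing import Optional


def decode_body(raw: bytes, charset: Optional[str], fallback: str = "utf-8") -> str:
    """Decode response bytes, using `fallback` when no charset was sent."""
    # errors="replace" substitutes undecodable bytes instead of raising
    # UnicodeDecodeError (an assumption: lossy output is acceptable here).
    return raw.decode(charset or fallback, errors="replace")


def fetch_text(url: str) -> str:
    """Download `url` and return its body as a str rather than bytes."""
    with urllib.request.urlopen(url) as resource:
        raw = resource.read()  # bytes
        # None if the Content-Type header carries no charset parameter.
        charset = resource.headers.get_content_charset()
    return decode_body(raw, charset)


def page_to_text(url: str) -> str:
    """Convert the page at `url` to plain text with html2text."""
    # Third-party package; imported here so the helpers above still work
    # when html2text is not installed.
    import html2text
    return html2text.html2text(fetch_text(url))
```

With this, print(page_to_text(url)) takes the place of the read() and html2text() calls in the question, because html2text now receives a str.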