BeautifulSoup TypeError

Time: 2012-11-04 18:06:17

Tags: beautifulsoup

I don't understand this error. How do I make "content" writable?

from bs4 import BeautifulSoup

soup = BeautifulSoup(open("http://www.asdf.fi/asdf.html"))

content = soup.find(id="content") 

with open("test.html", "a") as myfile:
    myfile.write(content)

Error:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: expected a character buffer object

2 answers:

Answer 0 (score: 1)

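First, you can't open a web page with open(); it only opens local files. You need to use the urllib library to fetch the page (I actually use the mechanize library, which I find easier to use).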
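A minimal sketch of this first point, assuming Python 3, where the standard-library module is `urllib.request`. The helper name `fetch_html` and the UTF-8 fallback are assumptions for illustration and are not part of the original answer.

```python
from urllib.request import urlopen


def fetch_html(url, fallback_encoding="utf-8"):
    """Download a page and return its markup as text (hypothetical helper)."""
    with urlopen(url) as response:
        raw = response.read()  # bytes, not str
        # Use the charset from the Content-Type header when one is given.
        charset = response.headers.get_content_charset() or fallback_encoding
    return raw.decode(charset, errors="replace")


# Usage with the URL from the question (needs network access):
# html = fetch_html("http://www.asdf.fi/asdf.html")
# soup = BeautifulSoup(html, "html.parser")
```

The returned string can be passed to BeautifulSoup() in place of the open(...) call from the question.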

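Second, open() returns a file object, not a string. BeautifulSoup() accepts either an open file handle or a string, so for a local file you can read the contents explicitly with something like:
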
soup = BeautifulSoup(open(filename).read())

.read() reads the entire file and returns its contents as a string (a character buffer), which can then be passed to BeautifulSoup().

Answer 1 (score: 0)

OK, after some searching...

with open("test.html", "a") as myfile:
    myfile.write(content.encode('utf-8'))
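The underlying cause is that soup.find() returns a bs4 Tag object, while write() expects a string. The fix above targets Python 2, where str is a byte string. In Python 3, content.encode('utf-8') returns bytes, and writing bytes to a file opened in text mode raises a TypeError. A minimal Python 3 sketch, assuming a small inline HTML string in place of the real page, the built-in "html.parser", and a temporary output directory chosen only for illustration:

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Stand-in markup; the real page from the question is not reproduced here.
html = '<html><body><div id="content"><p>Hello</p></div></body></html>'
soup = BeautifulSoup(html, "html.parser")

content = soup.find(id="content")  # a bs4 Tag object, not a string

# Hypothetical output location so the sketch does not touch the working directory.
out_path = os.path.join(tempfile.mkdtemp(), "test.html")

# Open in text mode with an explicit encoding and write the Tag's markup as str.
with open(out_path, "a", encoding="utf-8") as myfile:
    myfile.write(str(content))

# Read the file back to confirm what was written.
with open(out_path, encoding="utf-8") as f:
    written = f.read()
```

Letting open() handle the encoding keeps the write call the same regardless of which characters the page contains.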