Question

I am trying to extract data from a URL, but I get this error when writing to a file, even though text is not empty.

My code:

def gettextonly(self, url):
        url = url

        html = urllib.urlopen(url).read()
        soup = BeautifulSoup(html)

        # kill all script and style elements
        for script in soup(["script", "style","a","<div id=\"bottom\" >"]):
            script.extract()    # rip it out

        text = soup.findAll(text=True)

        #print text
        fo = open('foo.txt', 'w')
        fo.seek(0, 2)
        if text:
            line =fo.writelines(text.encode('utf8'))
        fo.close()

Error:

in gettextonly
    line =fo.writelines(text.encode('utf8'))
AttributeError: 'ResultSet' object has no attribute 'encode'

Answer 1

soup.findAll(text=True) returns a ResultSet object, which is basically a list and has no encode attribute. You either meant to use .text instead:

text = soup.text

Or "join" the text nodes:

text = "".join(soup.findAll(text=True))
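
To tie the join back to the file write in the question, here is a minimal sketch, assuming Python 3. The helper name write_text_nodes is hypothetical, and its nodes parameter stands in for whatever soup.findAll(text=True) returns; any iterable of strings behaves the same way. Instead of calling .encode() on the result, it lets the file object handle the UTF-8 encoding.

```python
import io


def write_text_nodes(nodes, path):
    """Join an iterable of text nodes and write the result to path as UTF-8.

    `nodes` stands in for the ResultSet from soup.findAll(text=True);
    the function name is an illustrative assumption, not a library API.
    """
    # A ResultSet is list-like, so its items can be joined into one string.
    text = "".join(nodes)
    if text:
        # The file object encodes on write, so no explicit .encode() is needed.
        with io.open(path, "w", encoding="utf-8") as fo:
            fo.write(text)
    return text
```

With the question's code, this would presumably be called as write_text_nodes(soup.findAll(text=True), 'foo.txt'). Nothing is written when the joined text is empty, which mirrors the "if text:" check in the original.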

