How to write the output that bs4 prints in the terminal to a text file

Time: 2018-12-07 17:40:34

Tags: python-2.7 beautifulsoup html-parsing

I am using bs4 for the first time. If I run the following basic code:

from bs4 import BeautifulSoup
with open ('test.txt','r') as f:
    soup = BeautifulSoup(f)
    print f

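The output in the terminal is very clean and contains no HTML tags. When I tried to write it to a txt file instead, I was prompted to add a parser, so I added "html.parser". The result is not the same: the file is full of tags that I want to get rid of. How can I get the same clean output in the txt file?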

from bs4 import BeautifulSoup
with open ('test.txt','r') as f:
    soup = BeautifulSoup(f,'html.parser')
    with open ('test2.txt', 'w') as x:
        x.write(str(soup))

* EDIT: Here is a sample of what test2.txt contains when I run this code:

    each\u00a0row you want to accept.\n <li>At the top of the list, 
    under the <b>Batch Actions</b> drop-down arrow, 
    choose\u00a0<b>Accept Selected</b>.</li>\n <li>All the selected 
    transactions\u00a0move from the <b>For Review

But in the terminal I get:

    each\u00a0row you want to accept.\n At the top of the list, under 
    the Batch Actions drop-down arrow, choose\u00a0Accept Selected.\n 
    All the selected transactions\u00a0move from the For Review 
    tab\u00a0to the In QuickBooks 

1 Answer:

Answer 0: (score: 1)

Try using the .text attribute, which returns only the text of the document with the tags stripped out:

x.write(str(soup.text))
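
For context, here is a minimal end-to-end sketch of that suggestion. It is not the asker's exact script: the helper names html_to_text and convert_file are hypothetical, the input is assumed to be UTF-8, and io.open is used so the same code should run on both Python 2.7 and Python 3. str(soup) serializes the whole parse tree, tags included, whereas soup.get_text() (which is what .text returns) yields only the text nodes.

```python
import io

from bs4 import BeautifulSoup


def html_to_text(markup):
    """Return only the text of `markup`, with every tag removed (hypothetical helper)."""
    soup = BeautifulSoup(markup, 'html.parser')
    # get_text() is the method behind the .text attribute.
    return soup.get_text()


def convert_file(src_path, dst_path):
    """Read HTML from src_path and write its plain text to dst_path (hypothetical helper)."""
    # Assumption: both files are UTF-8 encoded.
    with io.open(src_path, 'r', encoding='utf-8') as f:
        text = html_to_text(f.read())
    with io.open(dst_path, 'w', encoding='utf-8') as x:
        x.write(text)


if __name__ == '__main__':
    sample = u'<li>choose <b>Accept Selected</b>.</li>'
    soup = BeautifulSoup(sample, 'html.parser')
    print(str(soup))        # serialized markup, tags kept
    print(soup.get_text())  # text only
```

With the file names from the question, the call would be convert_file('test.txt', 'test2.txt'). Writing the unicode text through io.open with an explicit encoding also avoids wrapping the result in str(), which on Python 2.7 can raise UnicodeEncodeError if the text contains real non-ASCII characters.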