Replace "\n" in an element's string with <br/> tags in BeautifulSoup

Time: 2015-12-07 10:00:19

Tags: python html tags beautifulsoup newline

I am creating a new tag and assigning it a string that contains a newline:

from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")

myTag = soup.new_tag("div")
myTag.string = "My text \n with a new line"

soup.insert(0, myTag)

The result is:

<div>My text 
 with a new line</div>

This is what I expected. However, the newline needs to become a <br> tag to render correctly in a browser.

How can I do this?

1 answer:

Answer 0: (score: 2)

I think it might be better to set the CSS white-space property to pre-wrap on that div:

  

pre-wrap: Whitespace is preserved by the browser. Text will wrap when necessary, and on line breaks.

An example:

<div style="white-space:pre-wrap"> Some \n text here </div>

The code to do this in BeautifulSoup:

myTag = soup.new_tag("div", style="white-space:pre-wrap")
myTag.string = "My text \n with a new line"
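
For reference, here is a self-contained version of that snippet that can be run to check what BeautifulSoup serializes. It is only a sketch: the names `my_tag` and `html` are introduced here for illustration. The newline character stays in the markup unchanged, and the inline style leaves it to the browser to show it as a line break.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")

# Keyword arguments passed to new_tag() become attributes of the new tag.
my_tag = soup.new_tag("div", style="white-space:pre-wrap")
my_tag.string = "My text \n with a new line"
soup.append(my_tag)

# The newline is kept verbatim inside the serialized <div>.
html = str(soup)
print(html)
```

No <br> tags are created with this approach, so it only helps when the consumer of the HTML honors CSS.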

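Simply replacing \n in the string does not work, because BeautifulSoup escapes characters such as < and > in strings by default, so a literal "<br>" in the text would come out as &lt;br&gt;. Another approach is to split the input string and build the tag structure from text nodes and <br> tags: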

from bs4 import BeautifulSoup

def replace_newline_with_br(s, soup):
    # Split on newlines and rebuild the text as: line, <br/>, line, ...
    lines = s.split('\n')
    div = soup.new_tag('div')
    div.append(lines[0])
    for line in lines[1:]:
        div.append(soup.new_tag('br'))
        div.append(line)
    soup.append(div)

mytext = "My text with a few \n newlines \n"
mytext2 = "Some other text \n with a few more \n newlines \n here"

# Name the parser explicitly so the result does not depend on what is installed.
soup = BeautifulSoup("", "html.parser")
replace_newline_with_br(mytext, soup)
replace_newline_with_br(mytext2, soup)
print(soup.prettify())

This prints:

<div>
 My text with a few
 <br/>
 newlines
 <br/>
</div>
<div>
 Some other text
 <br/>
 with a few more
 <br/>
 newlines
 <br/>
 here
</div>
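
The same idea can also be applied to a tag that already exists in a parsed document. The sketch below is one possible way to do that, not part of the original answer: `newlines_to_br` is a hypothetical helper name, and it assumes the tag contains only text, since `get_text()` flattens any nested tags. The first few lines also show the escaping behavior that makes a plain string replacement unsuitable.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<div>My text \n with a new line</div>", "html.parser")
div = soup.div

# Markup assigned as a string is treated as text and escaped on output.
escaped = soup.new_tag("div")
escaped.string = "My text <br/> with a new line"
print(escaped)

def newlines_to_br(tag, soup):
    """Hypothetical helper: rebuild the tag's text with <br/> between lines.

    Assumes the tag holds text only; nested tags would be flattened.
    """
    lines = tag.get_text().split("\n")
    tag.clear()  # remove the existing text node
    tag.append(lines[0])
    for line in lines[1:]:
        tag.append(soup.new_tag("br"))
        tag.append(line)

newlines_to_br(div, soup)
result = str(div)
print(result)
```

Unlike prettify(), str() does not reindent the output, so the whitespace around each <br/> is exactly what was in the original string.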