I am creating a new tag and assigning it a string that contains a newline:
from bs4 import BeautifulSoup
soup = BeautifulSoup("", "html.parser")
myTag = soup.new_tag("div")
myTag.string = "My text \n with a new line"
soup.insert(0, myTag)
The result is:
<div>My text
with a new line</div>
This is what I expected. However, the newline needs a <br> tag in order to render correctly in a browser.
How can I achieve this?
Answer 0 (score: 2)
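I think it may be better to set the CSS white-space property to pre-wrap on that div:
pre-wrap: Whitespace is preserved by the browser. Text will wrap when necessary, and on line breaks.
An example: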
<div style="white-space:pre-wrap"> Some \n text here </div>
Code to do this in BeautifulSoup:
myTag = soup.new_tag("div", style="white-space:pre-wrap")
myTag.string = "My text \n with a new line"
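If the div already exists in a parsed document, the same style can presumably be applied by setting the attribute on the tag. This is a minimal sketch; the input markup and the variable name `existing` are made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical input: a div that already contains a newline in its text.
soup = BeautifulSoup("<div>My text \n with a new line</div>", "html.parser")

# Tags support dictionary-style attribute access, so the style can be set in place.
existing = soup.div
existing["style"] = "white-space:pre-wrap"

# The newline stays in the text; the browser renders it because of pre-wrap.
print(soup)
```

The text itself is left untouched here, so nothing needs to be escaped or rebuilt.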
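Replacing \n with <br> directly does not seem to be straightforward, because BeautifulSoup escapes HTML entities in strings by default. An alternative is to split the input string and build the tag structure yourself from text and <br> tags: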
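from bs4 import BeautifulSoup

def replace_newline_with_br(s, soup):
    # Split on newlines, then rebuild the text inside a div
    # with a <br> tag between consecutive pieces.
    lines = s.split('\n')
    div = soup.new_tag('div')
    div.append(lines[0])
    for line in lines[1:]:
        div.append(soup.new_tag('br'))
        div.append(line)
    soup.append(div)

mytext = "My text with a few \n newlines \n"
mytext2 = "Some other text \n with a few more \n newlines \n here"

# Name the parser explicitly to avoid the "no parser specified" warning.
soup = BeautifulSoup("", "html.parser")
replace_newline_with_br(mytext, soup)
replace_newline_with_br(mytext2, soup)
print(soup.prettify())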
This prints:
<div>
My text with a few
<br/>
newlines
<br/>
</div>
<div>
Some other text
<br/>
with a few more
<br/>
newlines
<br/>
here
</div>
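To make the escaping behaviour concrete, here is a small sketch that first assigns markup as a plain string, then builds the same content from real tags. The helper name `text_to_tag` and its `name` parameter are hypothetical, not part of BeautifulSoup; it is a variant of the function above that returns the tag instead of appending it to the soup:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")

# Markup assigned as a string is treated as text, so it is escaped on output.
escaped = soup.new_tag("div")
escaped.string = "My text <br> with a new line"
escaped_html = str(escaped)


def text_to_tag(text, soup, name="div"):
    """Hypothetical helper: wrap text in a new tag, one <br> per newline."""
    tag = soup.new_tag(name)
    for i, line in enumerate(text.split("\n")):
        if i:
            # Every piece after the first is preceded by a real <br> tag.
            tag.append(soup.new_tag("br"))
        if line:
            # Skip empty pieces, such as the one after a trailing newline.
            tag.append(line)
    return tag


built = text_to_tag("My text \n with a new line", soup)
soup.append(built)

print(escaped_html)  # <div>My text &lt;br&gt; with a new line</div>
print(built)         # <div>My text <br/> with a new line</div>
```

Returning the tag leaves the caller free to decide where it goes, for example with soup.insert(0, tag) as in the question.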