Keep \n in string content and write it on one line

Time: 2017-01-12 15:22:42

Tags: python beautifulsoup

I have the following code to parse some HTML. I need to save the output (the HTML result) as a single line that keeps the escape sequences, such as \n, but I either get a representation I can't use from repr() because of the single quotes, or the output is written across multiple lines (the escape sequences are interpreted):

<div class="col-md-12">
  <div class="box box-primary">
    <div class="box-header with-border">
      <h3>Advanced Search</h3>
    </div>
    <div class="box-body">
      <div class="row">
        <div class="col-md-2">
          {{#power-select placeholder="Room type" allowClear=true selected=roomType options=roomTypes onchange=(action "setRoomType") as |roomType| }} {{roomType.name}} {{/power-select}}
        </div>
        <div class="col-md-2">
          {{#power-select placeholder="Room No" allowClear=true selected=room options=rooms onchange=(action "setRoom") as |room| }} {{room.name}} {{/power-select}}
        </div>
      </div>
    </div>
    <!-- /.box-body -->
  </div>
  <!-- /.box -->
</div>

What I need (including the escape sequences):

\n

My code:

repr()

The above is my solution, but the object representation is no good; I need the string representation. How can I achieve this?

3 answers:

Answer 0: (score: 2)

import bs4

html = '''<section class="prog__container">
 <span class="prog__sub">Title</span>
 <p>PEP 336 - Make None Callable</p>
 <span class="prog__sub">Description</span>
 <p>
 <p>
 <code>
      None
     </code>
     should be a callable object that when called with any
 arguments has no side effect and returns
     <code>
      None
     </code>
     .
    </p>
 </p>
 </section>'''
soup = bs4.BeautifulSoup(html, 'lxml')
str(soup)

Out:

'<html><body><section class="prog__container">\n<span class="prog__sub">Title</span>\n<p>PEP 336 - Make None Callable</p>\n<span class="prog__sub">Description</span>\n<p>\n</p><p>\n<code>\n      None\n     </code>\n     should be a callable object that when called with any\n arguments has no side effect and returns\n     <code>\n      None\n     </code>\n     .\n    </p>\n</section></body></html>'

The Beautiful Soup documentation describes more sophisticated ways to output HTML code.
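A note on the output shown in this answer: the quotes and the literal \n appear because an interactive session echoes the repr() of the string. The string itself holds real line breaks. A minimal sketch of the difference, using a short stand-in string (an assumption, not the real str(soup) result):

```python
# Stand-in for str(soup); the real value would come from BeautifulSoup.
s = "<p>\nNone\n</p>"

print(s)        # printed with real line breaks, so it spans three lines
print(repr(s))  # printed on one line, quoted, with \n shown as an escape

line_count = len(s.splitlines())
```

So str(soup) alone does not produce a one-line result once it is printed or written to a file.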

Answer 1: (score: 1)

Why not use repr?

a = """this is the first line
this is the second line"""
print(repr(a))

Or even (if I understand correctly that you want the exact output without the literal quotes):

print(repr(a).strip("'"))

Output:

'this is the first line\nthis is the second line'
this is the first line\nthis is the second line
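One caveat with stripping the quotes from repr(): when the text contains a single quote and no double quote, repr() wraps it in double quotes instead, so strip("'") leaves them in place. As an alternative sketch, the line breaks can be escaped explicitly. The helper name escape_newlines is hypothetical and not part of any library:

```python
def escape_newlines(text):
    """Return text with backslashes and line breaks written as literal escapes."""
    return (
        text.replace("\\", "\\\\")  # escape existing backslashes first to keep the result unambiguous
            .replace("\r", "\\r")   # carriage return -> backslash + r
            .replace("\n", "\\n")   # line feed -> backslash + n
    )


a = """this is the first line
this is the second line"""

one_line = escape_newlines(a)
print(one_line)
```

This leaves quotes in the text untouched, so no stripping is needed afterwards.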

Answer 2: (score: 0)

from bs4 import BeautifulSoup
import urllib.request

r = urllib.request.urlopen('https://www.example.com')
soup = BeautifulSoup(r.read(), 'html.parser')
html = str(soup)

This gives you your HTML as a single string, with the lines separated by \n.
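That string still holds real line breaks, so writing it to a file as-is produces several lines. A minimal sketch of writing it as one line with the escape sequences kept, assuming a short stand-in for str(soup) and a hypothetical output path in a temporary directory:

```python
import os
import tempfile

# Stand-in for str(soup); the real HTML would come from BeautifulSoup.
html = "<div>\n  <p>Hello</p>\n</div>"

# Turn each real line break into the two characters backslash + n.
escaped = html.replace("\\", "\\\\").replace("\n", "\\n")

# Hypothetical output location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "output.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(escaped)

# Read the file back to check that it holds exactly one line.
with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```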