Question

I have the following code to parse some HTML. I need to save the output (the resulting HTML) as a single line that keeps the escape sequences, such as \n, but I either get a representation I can't use from repr() because of the single quotes, or the output gets written across multiple lines (the escape sequences are interpreted):

<div class="col-md-12">
  <div class="box box-primary">
    <div class="box-header with-border">
      <h3>Advanced Search</h3>
    </div>
    <div class="box-body">
      <div class="row">
        <div class="col-md-2">
          {{#power-select placeholder="Room type" allowClear=true selected=roomType options=roomTypes onchange=(action "setRoomType") as |roomType| }} {{roomType.name}} {{/power-select}}
        </div>
        <div class="col-md-2">
          {{#power-select placeholder="Room No" allowClear=true selected=room options=rooms onchange=(action "setRoom") as |room| }} {{room.name}} {{/power-select}}
        </div>
      </div>
    </div>
    <!-- /.box-body -->
  </div>
  <!-- /.box -->
</div>

What I need (including the escape sequences):

\n

My code

repr()

The above is my solution, but the object representation is no good; I need the string representation. How can I do this?

Answer 1

import bs4

html = '''<section class="prog__container">
 <span class="prog__sub">Title</span>
 <p>PEP 336 - Make None Callable</p>
 <span class="prog__sub">Description</span>
 <p>
 <p>
 <code>
      None
     </code>
     should be a callable object that when called with any
 arguments has no side effect and returns
     <code>
      None
     </code>
     .
    </p>
 </p>
 </section>'''
soup = bs4.BeautifulSoup(html, 'lxml')
str(soup)

Out:

'<html><body><section class="prog__container">\n<span class="prog__sub">Title</span>\n<p>PEP 336 - Make None Callable</p>\n<span class="prog__sub">Description</span>\n<p>\n</p><p>\n<code>\n      None\n     </code>\n     should be a callable object that when called with any\n arguments has no side effect and returns\n     <code>\n      None\n     </code>\n     .\n    </p>\n</section></body></html>'

The documentation describes more sophisticated ways to output the HTML code.
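Note that an interactive session shows \n in the output above only because it echoes the repr of the string; the string itself still contains real newline characters, so writing it to a file would produce several lines. A minimal sketch of one way to get the escaped, quote-free form follows. The helper name to_single_line and the sample html_text are made up for illustration, with html_text standing in for str(soup).

```python
def to_single_line(text: str) -> str:
    """Return text with newlines, tabs and backslashes written as escape sequences."""
    # The unicode_escape codec turns a real newline into the two characters
    # backslash + n. It also escapes non-ASCII characters (e.g. as \xe9).
    return text.encode("unicode_escape").decode("ascii")


# Stand-in for str(soup); assumed sample data, not taken from the question.
html_text = '<p>\n<code>\n  None\n</code>\n</p>'

one_line = to_single_line(html_text)
print(one_line)  # <p>\n<code>\n  None\n</code>\n</p>
```

Because the codec also escapes non-ASCII characters, this fits best when the markup is mostly ASCII or when the escaped form will be decoded again later.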

Answer 2

Why not use repr?

a = """this is the first line
this is the second line"""
print(repr(a))

Or even (if I understand correctly that your problem is needing the exact output without the literal quotes):

print(repr(a).strip("'"))

Output:

'this is the first line\nthis is the second line'
this is the first line\nthis is the second line
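One caveat with stripping single quotes: when the string itself contains a single quote and no double quote, repr wraps it in double quotes instead, so strip("'") leaves the outer quotes in place. A hedged sketch of an alternative that slices off whichever quote character repr chose; the helper name repr_without_quotes is hypothetical.

```python
def repr_without_quotes(text: str) -> str:
    """Return repr(text) minus its first and last character (the outer quotes)."""
    return repr(text)[1:-1]


a = "it's the first line\nthis is the second line"

# repr picks double quotes here, so stripping single quotes changes nothing.
print(repr(a).strip("'"))

# Slicing removes the outer quotes regardless of which kind repr used.
print(repr_without_quotes(a))
```

If the text contains both kinds of quote, repr escapes the inner single quotes as \' and the slice keeps those backslashes, which may or may not be what you want.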

Answer 3

from bs4 import BeautifulSoup
import urllib.request

r = urllib.request.urlopen('https://www.example.com')
soup = BeautifulSoup(r.read(), 'html.parser')
html = str(soup)

This gives you your HTML as a single string, with the lines separated by \n.
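The string returned by str(soup) holds real newline characters, so writing it to a file unchanged still yields several lines. As a rough sketch, assuming the goal is a one-line file with a literal backslash-n between the original lines, the newlines can be escaped by hand before writing. The sample html value, the helper escape_newlines and the file name page.txt are placeholders, not part of the answer.

```python
import os
import tempfile


def escape_newlines(text: str) -> str:
    """Replace line-break characters with their two-character escape sequences."""
    # Escape existing backslashes first so the result stays unambiguous.
    return (text.replace("\\", "\\\\")
                .replace("\r", "\\r")
                .replace("\n", "\\n"))


# Stand-in for str(soup); assumed sample data.
html = "<html>\n<body>\n<p>Hello</p>\n</body>\n</html>"

# Hypothetical output location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "page.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(escape_newlines(html))

# Read the file back to confirm that everything landed on one line.
with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(len(lines))  # 1
```

Unlike repr, this leaves quotes and non-ASCII characters untouched and only rewrites backslashes and line breaks.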

Keep \n in the string contents and write it on one line
