Gives strange spaces

Date: 2018-03-25 22:46:06

Tags: python python-3.x beautifulsoup

My website looks like this:

<div>
  <p>Par&#65279;agra&#65279;ph 1</p> 
</div>

But when I try to print it with Python:

for paragraph in div.find_all("p"):
  print(paragraph.text)

It comes out like this:

  

Par agra ph 1

How can I remove the &#65279; spaces without removing the intended spaces?

EDIT: Here is my code

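# Imports were omitted in the original post; the "bs" alias is assumed from its usage below.
import urllib.request
import bs4 as bs

srcu = urllib.request.urlopen("url").read()
src = bs.BeautifulSoup(srcu, "lxml")

for paragraph in src.find_all("p"):
    a = paragraph.text  # fixed typo: was "pragraph"
    print(a)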



exit()

1 Answer:

Answer 0: (score: 0)

The following approach works and prints "Paragraph 1":

from bs4 import BeautifulSoup

html = """<div>
  <p>Par&#65279;agra&#65279;ph 1</p> 
</div>"""

soup = BeautifulSoup(html, 'html.parser')

for p in soup.find_all('p'):
    print(p.text.replace('\uFEFF', ''))
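
The character behind &#65279; is U+FEFF (ZERO WIDTH NO-BREAK SPACE, also used as the byte order mark), which some terminals render as a visible gap. If a page might contain other invisible characters too, a translation table may be more convenient than chaining replace() calls. The following is only a minimal sketch: the helper name strip_invisible and the set of code points in INVISIBLE are assumptions for illustration, and html.unescape is used as a standard-library stand-in for the string that paragraph.text would return.

```python
import html
import unicodedata

# Code points to delete. This particular set is an assumption;
# adjust it to whatever invisible characters your pages contain.
INVISIBLE = {
    0xFEFF: None,  # ZERO WIDTH NO-BREAK SPACE / byte order mark
    0x200B: None,  # ZERO WIDTH SPACE
}


def strip_invisible(text):
    """Return text with every code point listed in INVISIBLE removed."""
    return text.translate(INVISIBLE)


# Stand-in for paragraph.text: decode the character references by hand.
raw = html.unescape("Par&#65279;agra&#65279;ph 1")

# ascii() escapes non-ASCII characters, so the hidden ones become visible.
print(ascii(raw))
print(unicodedata.name("\ufeff"))

cleaned = strip_invisible(raw)
print(cleaned)
```

Ordinary spaces (U+0020) are not in the table, so the intended space before "1" is left alone. With BeautifulSoup, the same helper could be applied as strip_invisible(p.text) inside the loop.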