BeautifulSoup: stripping whitespace

Date: 2018-11-22 05:02:50

Tags: python python-3.x beautifulsoup

I am building a basic horoscope parser that scrapes a website. Here is my code:

import requests
from bs4 import BeautifulSoup as bs

url = "https://www.astrospeak.com/horoscope/capricorn"

response = requests.request("GET", url)

soup = bs(response.text, 'html.parser')

locater = soup.select("#sunsignPredictionDiv > div.fullDIV > div.lineHght18 > div")

quote = locater[0].previousSibling

This leaves me with the following <class 'bs4.element.NavigableString'>:

"\n                      You are working towards yet another dream and as you pursue this vision there's no doubt in your mind that it will come to fruition. It's written in the stars! \n                      "

I am struggling to work out how to use BeautifulSoup's stripped_strings generator on a bs4.element.NavigableString. What I ultimately want is just the string You are working towards yet another dream and as you pursue this vision there's no doubt in your mind that it will come to fruition. It's written in the stars!

1 answer:

Answer 0 (score: 2)

I know the answer in the comments pretty much solves your problem, but I would like to give you some background:

import requests
from bs4 import BeautifulSoup as bs

url = "https://www.astrospeak.com/horoscope/capricorn"
response = requests.get(url)
soup = bs(response.text, 'html.parser')
locater = soup.select("#sunsignPredictionDiv > div.fullDIV > div.lineHght18 > div")

quote = locater[0].previousSibling.strip()

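So essentially I simplified the syntax by using requests.get, which is also described in the Requests documentation, and added .strip(). strip() removes leading and trailing whitespace, which includes the newline \n and tab \t characters that show up in the raw string; whitespace inside the string is left alone. strip() can also be given an argument to remove specific leading and trailing characters.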
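To make the behaviour of strip() concrete, here is a minimal sketch that needs no network access. The sample text is a shortened, hypothetical stand-in for the NavigableString from the question; since NavigableString is a subclass of str, the same method calls should apply to it.

```python
# Hypothetical stand-in for the scraped text (shortened from the question).
raw = "\n                      It's written in the stars! \n                      "

# With no argument, strip() removes leading and trailing whitespace only.
cleaned = raw.strip()
print(repr(cleaned))

# Whitespace between words is not touched.
inner = "  a \t b  ".strip()
print(repr(inner))

# With an argument, strip() removes any of the listed characters from both ends.
trimmed = "***quote***".strip("*")
print(repr(trimmed))
```

Note that the argument to strip() is treated as a set of characters, not as a prefix or suffix to match as a whole.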

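There are also lstrip() and rstrip(), which do the same thing but only on the left (leading) or right (trailing) side of the string, respectively. If you want to know more, you can refer here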
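The difference between the one-sided variants can be sketched as follows. The sample string is again a hypothetical stand-in rather than the live page content.

```python
# Hypothetical sample with whitespace on both ends.
raw = "\n\t  It's written in the stars! \n  "

# lstrip() removes leading whitespace and keeps the trailing part.
left = raw.lstrip()
print(repr(left))

# rstrip() removes trailing whitespace and keeps the leading part.
right = raw.rstrip()
print(repr(right))

# Applying both gives the same result as strip().
both = raw.lstrip().rstrip()
print(both == raw.strip())
```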