Question

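I'm working on a side project to see whether I can predict wins on a website, but this is my first time using BeautifulSoup and I'm not entirely sure how to trim a string down to the part I need.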

Here is the code. I basically want to grab the value showing where the game crashed.

from bs4 import BeautifulSoup
from urllib import urlopen

html = urlopen('https://www.csgocrash.com/game/1/1287324').read()
soup = BeautifulSoup(html)

for section in soup.findAll('div',{"class":"row panel radius"}):
    crashPoint = section.findChildren()[2]
    print crashPoint

After running it, I get this as output.

<p> <b>Crashed At: </b> 1.47x </p>

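I basically just want to grab the numeric value, which means trimming the string on both sides. I just don't know how to do that, or how to remove the HTML tags.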

Answer 1

Find the Crashed At label by its text and get its next sibling:

soup = BeautifulSoup(html, "html.parser")

for section in soup.findAll('div', {"class":"row panel radius"}):
    crashPoint = section.find("b", text="Crashed At: ").next_sibling.strip()
    print(crashPoint)  # prints 1.47x
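
For anyone who wants to try the label-then-sibling lookup without hitting the site, here is a minimal offline sketch. The SAMPLE_HTML markup is an assumption modelled on the output shown in the question, not the real page, and extract_crash_point is a hypothetical helper name. It matches the label with a regular expression rather than the exact string, so small whitespace differences in the label should not break the lookup.

```python
import re
from bs4 import BeautifulSoup

# Assumed sample markup, modelled on the output in the question (not fetched live).
SAMPLE_HTML = """
<div class="row panel radius">
  <p> <b>Crashed At: </b> 1.47x </p>
</div>
"""

def extract_crash_point(html):
    """Return the text that follows the 'Crashed At' label, or None if it is missing."""
    soup = BeautifulSoup(html, "html.parser")
    # Find the <b> tag whose text contains the label.
    label = soup.find("b", string=re.compile(r"Crashed At"))
    if label is None or label.next_sibling is None:
        return None
    # The value is the text node right after </b>; strip surrounding whitespace.
    return str(label.next_sibling).strip()

crash_point = extract_crash_point(SAMPLE_HTML)
print(crash_point)
```

Returning None when the label is absent avoids an AttributeError on pages that lack the Crashed At element.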

Also, since there is only a single Crashed At value, I'm not sure you need the loop:

from bs4 import BeautifulSoup
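try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2, as in the question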

html = urlopen('https://www.csgocrash.com/game/1/1287324').read()
soup = BeautifulSoup(html, "html.parser")

section = soup.find('div', {"class":"row panel radius"})
crashPoint = section.find("b", text="Crashed At: ").next_sibling.strip()
print(crashPoint)
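
The value extracted this way is still a string such as 1.47x. If a number is needed for the prediction work, one possible follow-up step is sketched below. The helper name parse_multiplier is hypothetical, and the sketch assumes the value is always a plain decimal number followed by an optional suffix.

```python
import re

def parse_multiplier(text):
    """Convert a string like '1.47x' to a float (assumes a plain decimal number)."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError("no number found in %r" % text)
    return float(match.group())

value = parse_multiplier("1.47x")
print(value)
```

Using a regular expression instead of slicing off the last character keeps the conversion working if the suffix or surrounding whitespace changes.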

Python, BeautifulSoup - extracting a string
