Using BeautifulSoup to find untagged text in a ContentPane

Time: 2018-09-11 21:46:25

Tags: python html beautifulsoup

My question is similar to these questions: "Get HTML Text that has no tag" and "Beautiful Soup - Print a containers text without printing the text of the child elements".

How do I get this text from the ContentPane: Updated September 11, 2018 (57) Cases + (1) traffic w/contributing heroin

HTML:

<!--Container Content-->
<div class="contentmain">
    <div id="dnn_ctr3799_ContentPane" class="contentpane">
        <!--Start_Module_3799-->
        Updated September 11, 2018 (57) Cases + (1) traffic w/contributing heroin

Attempt 1: soup.find

I can use soup.find to print the entire ContentPane, including the text above, but I don't want all of it:

name_box = soup.find(id= 'dnn_ctr3799_ContentPane')
name = name_box.text.strip()
print(name)

Attempt 2: nextSibling

I tried nextSibling, but it returned nothing.

texts = soup.findAll("div", {"id":"dnn_ctr3799_ContentPane"})
for text in texts:
    if text.string:
        if "dnn_ctr3799_ContentPane" in text.string:
            print(text.nextSibling.string.strip())

Link to the web page: 2018 Heroin/Fentanyl Overdose Deaths

1 answer:

Answer 0 (score: 0)

It turned out to be the container I was working with. The string I wanted is the sibling of the <!--Start_Module_3799--> comment under the parent div dnn_ctr3799_ContentPane.

Answer:

<!--Start_Module_3799-->
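
One way to read the text node that directly follows the <!--Start_Module_3799--> comment is sketched below. This is a minimal sketch, not the answerer's original code, and it assumes Python 3 with the bs4 package installed. The html string is an abbreviated copy of the snippet from the question. The closing tags and the extra <p> element are assumptions added only so the sample parses on its own and shows that child elements are skipped.

```python
from bs4 import BeautifulSoup, Comment

# Abbreviated copy of the markup from the question. The closing tags and the
# <p> element are assumptions added so the sample is self-contained.
html = """
<div class="contentmain">
    <div id="dnn_ctr3799_ContentPane" class="contentpane">
        <!--Start_Module_3799-->
        Updated September 11, 2018 (57) Cases + (1) traffic w/contributing heroin
        <p>Other content that should not be printed</p>
    </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
pane = soup.find(id="dnn_ctr3799_ContentPane")

# Find the comment node inside the pane by its type and content.
marker = pane.find(
    string=lambda s: isinstance(s, Comment) and "Start_Module_3799" in s
)

# The wanted text is the node immediately after the comment.
updated = marker.next_sibling.strip()
print(updated)
```

If the real page places other nodes between the comment and the text, next_sibling would return one of those instead, so it is worth checking the result before relying on it.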