Python BeautifulSoup: counting elements that have no content

Time: 2014-11-06 01:42:58

Tags: python beautifulsoup

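How can I count elements that have no content? By an element with no content, I mean something like <div class="myclass" id="myid"></div>.

Here is the code I wrote while trying to do this: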

from bs4 import BeautifulSoup

html_doc = """
<dl>
    <dt class="details-row-7">Overall</dt>
    <dd id="c0r11" class=" alternate details-row-7">
        <div class="mobile-headings">Overall</div>
        <div class="mobile-value">
            <div class="ca-rating-star" data-size="1"><i class="icon-star icon-1x" style="color: #FF9900"></i>
                <i class="icon-star icon-1x" style="color: #FF9900"></i>
                <i class="icon-star icon-1x" style="color: #FF9900"></i>
                <i class="icon-star icon-1x" style="color: #FF9900"></i>
                <i class="icon-star-empty icon-1x" style="color: #FF9900"></i>
            </div>
        </div>
    </dd>
</dl>
"""

soup = BeautifulSoup(html_doc)
ele = soup.find("dd", {"id": "c0r11"}, {"class": "alternate details-row-7"})
if ele.find(text=False):
    con_str = ele.find("div", {"class":"mobile-value"})
    if con_str.find(text=False):
        star_ele = con_str.find("div", {"class":"ca-rating-star"})
        if star_ele.find(text=False):
            star = star_ele.find_all("i", {"class":"icon-star icon-1x"})
            i = 0
            for s in star:
                if s.find(text=False):
                    i += 1
            print(i)

But the result is 0...

2 Answers:

Answer 0: (score: 1)

I answered your question here:

https://gist.github.com/greatghoul/c2fab58e798a91a736a4

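Answer 1: (score: 1)

The problem is that when you call s.find(text=False), you are searching the children of the <i> element, but the <i> tag has no children, so find() returns None and the counter is never incremented. What you want is to check whether the <i> tag itself contains empty text. So replace s.find(text=False) with s.get_text() == "":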
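To make the distinction concrete, here is a minimal sketch (not part of the original answer) that compares searching a tag's descendants with inspecting the tag's own text. The two-element HTML snippet and the variable names empty_tag and filled_tag are made up for illustration, and "html.parser" is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Tiny illustrative document: one empty <i> and one <i> containing text.
snippet = '<div><i class="icon-star"></i><i class="icon-star">x</i></div>'
soup = BeautifulSoup(snippet, "html.parser")
empty_tag, filled_tag = soup.find_all("i")

# find() searches descendants only; an empty tag has none, so nothing is found.
print(empty_tag.find(True))           # None
print(list(empty_tag.children))       # []

# get_text() looks at the tag's own text content instead.
print(empty_tag.get_text() == "")     # True
print(filled_tag.get_text() == "")    # False
```

Because find() on a childless tag always comes back as None, any "if s.find(...)" test inside the loop is falsy for every empty <i>, which is why the original counter stayed at 0.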

from bs4 import BeautifulSoup

html_doc = """
<dl>
    <dt class="details-row-7">Overall</dt>
    <dd id="c0r11" class=" alternate details-row-7">
        <div class="mobile-headings">Overall</div>
        <div class="mobile-value">
            <div class="ca-rating-star" data-size="1"><i class="icon-star icon-1x" style="color: #FF9900"></i>
                <i class="icon-star icon-1x" style="color: #FF9900"></i>
                <i class="icon-star icon-1x" style="color: #FF9900"></i>
                <i class="icon-star icon-1x" style="color: #FF9900"></i>
                <i class="icon-star-empty icon-1x" style="color: #FF9900"></i>
            </div>
        </div>
    </dd>
</dl>
"""

soup = BeautifulSoup(html_doc)
ele = soup.find("dd", {"id": "c0r11"}, {"class": "alternate details-row-7"})
if ele.find(text=False):
    con_str = ele.find("div", {"class":"mobile-value"})
    if con_str.find(text=False):
        star_ele = con_str.find("div", {"class":"ca-rating-star"})
        if star_ele.find(text=False):
            star = star_ele.find_all("i", {"class":"icon-star icon-1x"})
            i = 0
            for s in star:
                if s.get_text() == "":  # CHANGED: test the tag's own text instead of searching its children
                    i += 1
            print(i)
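
As a possible simplification, the same count can be written without the nested checks. This is only a sketch under a few assumptions: the helper name count_empty is hypothetical, the HTML is trimmed to the rating container, "html.parser" is chosen explicitly as the parser, and whitespace-only text is treated as empty via get_text(strip=True).

```python
from bs4 import BeautifulSoup

html_doc = """
<div class="ca-rating-star" data-size="1"><i class="icon-star icon-1x"></i>
    <i class="icon-star icon-1x"></i>
    <i class="icon-star icon-1x"></i>
    <i class="icon-star icon-1x"></i>
    <i class="icon-star-empty icon-1x"></i>
</div>
"""


def count_empty(tags):
    """Hypothetical helper: count tags whose text is empty after stripping whitespace."""
    return sum(1 for tag in tags if tag.get_text(strip=True) == "")


soup = BeautifulSoup(html_doc, "html.parser")
container = soup.find("div", class_="ca-rating-star")

# class_="icon-star" matches tags that carry icon-star as one of their classes,
# so the tag with class icon-star-empty is not selected.
stars = container.find_all("i", class_="icon-star")
star_count = count_empty(stars)
print(star_count)  # 4
```

Note that get_text() also returns an empty string for a tag that contains only empty child tags; if "no content" should mean "no children at all", checking "not tag.contents" may be the better test.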