Extracting the value between <div class=""> tags using BeautifulSoup

Time: 2017-11-07 12:54:47

Tags: python beautifulsoup

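In the HTML below, I need to extract the string "PMR/SMR Effectiveness" that appears inside the div with class "txtDetail". With my code, however, I get every text element inside the li instead. I have seen several similar examples on Stack Overflow, but none of them worked for me. Please let me know what is wrong with my code.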

soup = BeautifulSoup(driver.page_source,'html.parser')
for i in soup.find_all(class_ ='default yellowBorder Hclass ' ):
    if soup.find_all('div',attrs={"class" : "txtDetail"}):
        print(i.text)

Expected output: PMR/SMR Effectiveness

Actual output: HPPMR/SMR Effectiveness PMR/SMR Effectiveness PMR/SMR Effectiveness Computation Formula PMR/SMR is conducted/reviewed as per recommended frequency

HTML code:

<ul class="blockList" id="ulexceptionlist"><li class="default yellowBorder Hclass "><div class="blockCont"> <div class="yellowBlock dispBlock expPoints"><div class="iconException">H</div><div class="yellowSection">P</div></div><div class=" nextCont  textCont activeCont " onclick="loadMerticchart(this,&quot;drill1&quot;,&quot; AND m.MetricId=20&quot;,&quot;2&quot;,&quot;PMR/SMR Effectiveness&quot;,&quot;20&quot;,&quot;286&quot;,&quot;0&quot;,&quot;0&quot;)" descopereason=""><div class="txtDetail">PMR/SMR Effectiveness</div></div><div class="infoIcon " id="info1" data-hasqtip="4"></div><div class="col-xs-12 noPadding popupContainer"> <div class="col-md-12 dropdownContent pull-left"> <span class="clearfix"></span> <span class="pull-left content"><h6 class="tooltipTitle"> PMR/SMR Effectiveness </h6> <span class="tooltipCont"> PMR/SMR Effectiveness </span><br><br><h6 class="tooltipTitle"> Computation Formula </h6> <span class="tooltipCont"> PMR/SMR  is conducted/reviewed as per recommended frequency </span></span> </div></div></div></li>

1 Answer:

Answer 0: (score: 1)

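That happens because you call .text on the variable i, which is the li tag, so you get the text of everything nested inside it. I modified your code slightly: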

for i in soup.find_all(class_='default yellowBorder Hclass '):
    # Search inside the current li only, not the whole document
    divs = i.find_all('div', attrs={"class": "txtDetail"})
    for d in divs:
        print(d.text)
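
For reference, here is a minimal self-contained sketch of the same idea that runs without Selenium. The HTML string is a trimmed-down copy of the markup from the question (an assumption that the real page has the same structure), and the names `details` and `details_css` are only illustrative. It matches the li by a single class, Hclass, instead of the full class string, and searches for the txtDetail div inside each li rather than reading the text of the whole li. The second variant expresses the same lookup as one CSS selector.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the HTML from the question (assumed structure).
html = """
<ul class="blockList" id="ulexceptionlist">
  <li class="default yellowBorder Hclass ">
    <div class="blockCont">
      <div class="iconException">H</div>
      <div class="textCont"><div class="txtDetail">PMR/SMR Effectiveness</div></div>
      <h6 class="tooltipTitle"> PMR/SMR Effectiveness </h6>
    </div>
  </li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Option 1: find each matching <li>, then search only inside that <li>.
details = []
for li in soup.find_all("li", class_="Hclass"):
    for div in li.find_all("div", class_="txtDetail"):
        details.append(div.get_text(strip=True))

# Option 2: the same lookup written as a single CSS selector.
details_css = [d.get_text(strip=True) for d in soup.select("li.Hclass div.txtDetail")]

print(details)
print(details_css)
```

With the real page you would build the soup from driver.page_source as in the question. Either option should then return only the txtDetail text, because the tooltip headings and the icon divs are never selected.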