Python BeautifulSoup: cannot select an element even though it is there

Time: 2017-05-03 02:22:56

Tags: python beautifulsoup

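I am parsing information from this page. If you scroll down, you will find the Energy Score section on the left, showing 66. The HTML I see looks like this: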

<div class="energyContentContainer" data-reactid="9"><div class="RadialMeter fair inline-block energyDataRadial" data-reactid="10"><div class="viz" data-reactid="11"><div class="circle" data-reactid="12"><div class="mask full" style="-webkit-transform:rotate(118.8deg);transform:rotate(118.8deg);" data-reactid="13"><div class="fill" style="-webkit-transform:rotate(118.8deg);transform:rotate(118.8deg);" data-reactid="14"></div></div><div class="mask half" data-reactid="15"><div class="fill" style="-webkit-transform:rotate(118.8deg);transform:rotate(118.8deg);" data-reactid="16"></div></div><div class="shadow" data-reactid="17"></div></div><div class="inset" data-reactid="18"></div></div><div class="RadialMeterContent" data-reactid="19"><div class="percentage" data-rf-test-name="ws-percentage" data-reactid="20">66</div><div class="iconAndLabel" data-reactid="21"><div class="label" data-reactid="22"><div class="flyoutLabelWrapper" data-reactid="23"><span class="linklessLabel" data-reactid="24">Energy Score</span><noscript data-reactid="25"></noscript><span data-rf-test-name="flyout-transition-group" data-reactid="26"></span></div></div></div></div></div><div class="energyComparisons" data-reactid="27"><span class="how-score-compares-text" data-reactid="28">How your energy score compares</span><div class="lockedSectionBanner" data-reactid="29"><div class="icon" data-reactid="30"><svg class="SvgIcon rfSvg secure" style="fill:black;height:24px;width:24px;" data-reactid="31"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="#secure"></use></svg></div><div class="message" data-reactid="32"><span data-reactid="33"><p data-reactid="34">Claim this home to unlock comparisons</p><div class="claimHomeButton notSubscribed unclaimed" data-reactid="35"><a href="#" class="Button med v3 tertiary-alt v3" data-name="Subscribe" tabindex="0" data-reactid="36"><span data-reactid="37">I'm the Owner</span></a></div></span></div></div></div><div class="rec-wrapper" 
data-reactid="38"><h4 class="rec-expenses" data-reactid="39"><!-- react-text: 40 -->This home spends an estimated <!-- /react-text --><span class="rec-expense-amount" data-reactid="41"><!-- react-text: 42 -->$<!-- /react-text --><!-- react-text: 43 -->74<!-- /react-text --></span><!-- react-text: 44 -->/month on energy. Want to lower your bill?<!-- /react-text --></h4><span class="Solar" data-reactid="45"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path fill="#585858" d="M12 5.4c-3.6 0-6.5 2.9-6.5 6.5s2.9 6.5 6.5 6.5 6.5-2.9 6.5-6.5c0-3.5-2.9-6.5-6.5-6.5zm0 2c2.5 0 4.6 2 4.6 4.6s-2 4.6-4.6 4.6-4.6-2-4.6-4.6S9.5 7.4 12 7.4zM11.9 0c-.4 0-.7.3-.7.6V4c0 .2.1.4.4.5.2.1.5.1.8 0 .2-.1.4-.3.4-.5V.6c0-.2-.1-.3-.2-.4s-.5-.2-.7-.2zM11.9 19.5c-.4 0-.7.3-.7.6v3.4c0 .2.1.4.4.5.2.1.5.1.8 0 .2-.1.4-.3.4-.5v-3.4c0-.2-.1-.3-.2-.4-.3-.2-.5-.2-.7-.2zM3.5 3.6c-.3.3-.3.7-.1.8l2.4 2.4c.1.2.3.2.6.1.2-.1.4-.3.5-.5s.1-.5-.1-.6L4.4 3.4c-.1-.1-.2-.2-.4-.1-.2 0-.4.1-.5.3zM17.2 17.4c-.2.3-.2.6 0 .8l2.4 2.4c.1.1.4.2.6.1s.4-.3.5-.5c.1-.2.1-.5-.1-.6l-2.4-2.4c-.1-.1-.2-.2-.4-.2s-.4.2-.6.4zM0 12.1c0 .4.3.7.6.7H4c.2 0 .4-.1.5-.4.1-.2.1-.5 0-.8-.1-.2-.3-.4-.5-.4H.6c-.2 0-.3.1-.4.2s-.2.5-.2.7zM19.5 12.1c0 .4.3.7.6.7h3.4c.2 0 .4-.1.5-.4.1-.2.1-.5 0-.8-.1-.2-.3-.4-.5-.4h-3.4c-.2 0-.3.1-.4.2-.2.3-.2.5-.2.7zM3.6 20.5c.3.3.7.3.9.1l2.4-2.4c.1-.1.1-.3 0-.6-.1-.2-.3-.4-.5-.5-.3-.1-.5-.1-.6.1l-2.4 2.4c-.1.1-.2.3-.1.5 0 .1.1.3.3.4zM17.4 6.8c.3.2.6.2.8 0l2.4-2.4c.1-.1.2-.4.1-.6-.1-.2-.3-.4-.5-.5-.2-.1-.5-.1-.6.1l-2.4 2.4c-.1.1-.2.2-.2.4s.2.4.4.6z"></path></svg></span><div class="rec-title" data-reactid="46">Consider Solar Panels For Your Roof</div><div class="rec-savings" data-reactid="47"><!-- react-text: 48 -->$<!-- /react-text --><!-- react-text: 49 -->1026<!-- /react-text --><!-- react-text: 50 --> savings/year<!-- /react-text --></div><div class="rec-description" data-reactid="51"><!-- react-text: 52 -->Even with upfront costs, solar power will save the average American around $20,000 over a 
20 year period. Local and federal rebates can increase savings even more<!-- /react-text --><!-- react-text: 53 -->.<!-- /react-text --></div><a href="http://www.servicemagic.com/ext/32271520" class="Button  tertiary v3 rec-button" tabindex="0" target="_blank" rel="noopener noreferrer" data-reactid="54"><span data-reactid="55">Get a Quote</span></a></div></div>

When I run the selector below, it does not select the right element. I checked View Source and concluded that this is not a dynamic element.

energy_section = soup.select('.energyContentContainer .RadialMeterContent')

It returns []

What am I doing wrong?

1 answer:

Answer 0 (score: 0)

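I think you are calling the function correctly; you may simply have the wrong class name, or be testing against the wrong URL. The following example works on the sample data you posted: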

>>> import bs4
>>> htmldoc = """<div class="RadialMeterContent" data-reactid="19"><div class="percentage" data-rf-test-name="ws-percentage" data-reactid="20">66</div><div class="iconAndLabel" data-reactid="21"><div class="label" data-reactid="22"><div class="flyoutLabelWrapper" data-reactid="23"><span class="linklessLabel" data-reactid="24">Energy Score</span><noscript data-reactid="25"></noscript><span data-rf-test-name="flyout-transition-group" data-reactid="26"></span></div></div></div></div>"""
>>> soup = bs4.BeautifulSoup(htmldoc, "html.parser")
>>> soup.select('.RadialMeterContent')
[<div class="RadialMeterContent" data-reactid="19"><div class="percentage" data-reactid="20" data-rf-test-name="ws-percentage">66</div><div class="iconAndLabel" data-reactid="21"><div class="label" data-reactid="22"><div class="flyoutLabelWrapper" data-reactid="23"><span class="linklessLabel" data-reactid="24">Energy Score</span><noscript data-reactid="25"></noscript><span data-reactid="26" data-rf-test-name="flyout-transition-group"></span></div></div></div></div>]
>>> soup.select('.RadialMeterContent .percentage')
[<div class="percentage" data-reactid="20" data-rf-test-name="ws-percentage">66</div>]

I could not find any instance of the text "energyContentContainer" when I viewed the source of the URL you posted. I also could not see the Energy Score section.
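One way to run the same check from code is to test whether the class name occurs anywhere in the HTML your script actually received, and to count what the selector matches. This is only a minimal sketch: the `diagnose` helper and the `fetched` string are hypothetical stand-ins, not part of your code or of the page in question.

```python
import bs4


def diagnose(html, class_name, selector):
    """Report whether class_name occurs in the raw HTML text and
    how many elements the CSS selector matches in the parsed tree."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    return {
        "class_in_raw_html": class_name in html,
        "match_count": len(soup.select(selector)),
    }


# Hypothetical stand-in for a response that lacks the energy section.
fetched = '<html><body><div id="content"></div></body></html>'

report = diagnose(
    fetched,
    "energyContentContainer",
    ".energyContentContainer .RadialMeterContent",
)
print(report)
```

If `class_in_raw_html` is false for the HTML your script downloaded, the selector is probably not the problem; the markup you see in the browser is simply not in the document that was parsed.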

OK, here is an example using the updated HTML snippet:

>>> htmldoc = """<div class="energyContentContainer" data-reactid="9"><div class="RadialMeter fair inline-block energyDataRadial" data-reactid="10"><div class="viz" data-reactid="11"><div class="circle" data-reactid="12"><div class="mask full" style="-webkit-transform:rotate(118.8deg);transform:rotate(118.8deg);" data-reactid="13"><div class="fill" style="-webkit-transform:rotate(118.8deg);transform:rotate(118.8deg);" data-reactid="14"></div></div><div class="mask half" data-reactid="15"><div class="fill" style="-webkit-transform:rotate(118.8deg);transform:rotate(118.8deg);" data-reactid="16"></div></div><div class="shadow" data-reactid="17"></div></div><div class="inset" data-reactid="18"></div></div><div class="RadialMeterContent" data-reactid="19"><div class="percentage" data-rf-test-name="ws-percentage" data-reactid="20">66</div><div class="iconAndLabel" data-reactid="21"><div class="label" data-reactid="22"><div class="flyoutLabelWrapper" data-reactid="23"><span class="linklessLabel" data-reactid="24">Energy Score</span><noscript data-reactid="25"></noscript><span data-rf-test-name="flyout-transition-group" data-reactid="26"></span></div></div></div></div></div><div class="energyComparisons" data-reactid="27"><span class="how-score-compares-text" data-reactid="28">How your energy score compares</span><div class="lockedSectionBanner" data-reactid="29"><div class="icon" data-reactid="30"><svg class="SvgIcon rfSvg secure" style="fill:black;height:24px;width:24px;" data-reactid="31"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="#secure"></use></svg></div><div class="message" data-reactid="32"><span data-reactid="33"><p data-reactid="34">Claim this home to unlock comparisons</p><div class="claimHomeButton notSubscribed unclaimed" data-reactid="35"><a href="#" class="Button med v3 tertiary-alt v3" data-name="Subscribe" tabindex="0" data-reactid="36"><span data-reactid="37">I'm the Owner</span></a></div></span></div></div></div><div class="rec-wrapper" 
data-reactid="38"><h4 class="rec-expenses" data-reactid="39"><!-- react-text: 40 -->This home spends an estimated <!-- /react-text --><span class="rec-expense-amount" data-reactid="41"><!-- react-text: 42 -->$<!-- /react-text --><!-- react-text: 43 -->74<!-- /react-text --></span><!-- react-text: 44 -->/month on energy. Want to lower your bill?<!-- /react-text --></h4><span class="Solar" data-reactid="45"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path fill="#585858" d="M12 5.4c-3.6 0-6.5 2.9-6.5 6.5s2.9 6.5 6.5 6.5 6.5-2.9 6.5-6.5c0-3.5-2.9-6.5-6.5-6.5zm0 2c2.5 0 4.6 2 4.6 4.6s-2 4.6-4.6 4.6-4.6-2-4.6-4.6S9.5 7.4 12 7.4zM11.9 0c-.4 0-.7.3-.7.6V4c0 .2.1.4.4.5.2.1.5.1.8 0 .2-.1.4-.3.4-.5V.6c0-.2-.1-.3-.2-.4s-.5-.2-.7-.2zM11.9 19.5c-.4 0-.7.3-.7.6v3.4c0 .2.1.4.4.5.2.1.5.1.8 0 .2-.1.4-.3.4-.5v-3.4c0-.2-.1-.3-.2-.4-.3-.2-.5-.2-.7-.2zM3.5 3.6c-.3.3-.3.7-.1.8l2.4 2.4c.1.2.3.2.6.1.2-.1.4-.3.5-.5s.1-.5-.1-.6L4.4 3.4c-.1-.1-.2-.2-.4-.1-.2 0-.4.1-.5.3zM17.2 17.4c-.2.3-.2.6 0 .8l2.4 2.4c.1.1.4.2.6.1s.4-.3.5-.5c.1-.2.1-.5-.1-.6l-2.4-2.4c-.1-.1-.2-.2-.4-.2s-.4.2-.6.4zM0 12.1c0 .4.3.7.6.7H4c.2 0 .4-.1.5-.4.1-.2.1-.5 0-.8-.1-.2-.3-.4-.5-.4H.6c-.2 0-.3.1-.4.2s-.2.5-.2.7zM19.5 12.1c0 .4.3.7.6.7h3.4c.2 0 .4-.1.5-.4.1-.2.1-.5 0-.8-.1-.2-.3-.4-.5-.4h-3.4c-.2 0-.3.1-.4.2-.2.3-.2.5-.2.7zM3.6 20.5c.3.3.7.3.9.1l2.4-2.4c.1-.1.1-.3 0-.6-.1-.2-.3-.4-.5-.5-.3-.1-.5-.1-.6.1l-2.4 2.4c-.1.1-.2.3-.1.5 0 .1.1.3.3.4zM17.4 6.8c.3.2.6.2.8 0l2.4-2.4c.1-.1.2-.4.1-.6-.1-.2-.3-.4-.5-.5-.2-.1-.5-.1-.6.1l-2.4 2.4c-.1.1-.2.2-.2.4s.2.4.4.6z"></path></svg></span><div class="rec-title" data-reactid="46">Consider Solar Panels For Your Roof</div><div class="rec-savings" data-reactid="47"><!-- react-text: 48 -->$<!-- /react-text --><!-- react-text: 49 -->1026<!-- /react-text --><!-- react-text: 50 --> savings/year<!-- /react-text --></div><div class="rec-description" data-reactid="51"><!-- react-text: 52 -->Even with upfront costs, solar power will save the average American around $20,000 over a 
20 year period. Local and federal rebates can increase savings even more<!-- /react-text --><!-- react-text: 53 -->.<!-- /react-text --></div><a href="http://www.servicemagic.com/ext/32271520" class="Button  tertiary v3 rec-button" tabindex="0" target="_blank" rel="noopener noreferrer" data-reactid="54"><span data-reactid="55">Get a Quote</span></a></div></div>"""
>>> soup = bs4.BeautifulSoup(htmldoc, "html.parser")
>>> soup.select('.energyContentContainer .RadialMeterContent')
[<div class="RadialMeterContent" data-reactid="19"><div class="percentage" data-reactid="20" data-rf-test-name="ws-percentage">66</div><div class="iconAndLabel" data-reactid="21"><div class="label" data-reactid="22"><div class="flyoutLabelWrapper" data-reactid="23"><span class="linklessLabel" data-reactid="24">Energy Score</span><noscript data-reactid="25"></noscript><span data-reactid="26" data-rf-test-name="flyout-transition-group"></span></div></div></div></div>]

To get the value you want, try the following:

>>> soup.select_one('.energyContentContainer .RadialMeterContent .percentage').text
u'66'
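Because `select_one` returns `None` when nothing matches, calling `.text` on the result raises `AttributeError` on a page that lacks the section. A slightly more defensive variant might look like the sketch below; the `energy_score` function name and the trimmed `sample` markup are assumptions made for illustration.

```python
import bs4


def energy_score(html):
    """Return the energy score as an int, or None if the element is absent."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    node = soup.select_one(
        ".energyContentContainer .RadialMeterContent .percentage"
    )
    if node is None:
        # The selector matched nothing, e.g. the section is not in this HTML.
        return None
    return int(node.get_text(strip=True))


# Trimmed-down version of the posted markup, used only for illustration.
sample = (
    '<div class="energyContentContainer">'
    '<div class="RadialMeterContent">'
    '<div class="percentage">66</div>'
    "</div></div>"
)

print(energy_score(sample))
print(energy_score("<div></div>"))
```

The `u'66'` shown above is the Python 2 representation of the string; under Python 3 the same expression displays as `'66'`.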