Isolating an attribute in bs4 / BeautifulSoup

Date: 2017-07-18 14:09:47

Tags: python html parsing beautifulsoup bs4

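I am trying to use Beautiful Soup (bs4) to isolate a value that is stored as an attribute of a tag. My current output is shown below, but I am not sure how to get the contents of "value" as a string.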

import requests
from bs4 import BeautifulSoup as bs

html = """
<div class="buttons">
    <form method="POST" action="/1/token/approve">
        <a class="button primary" href="/login?returnUrl=%2F1%2Fauthorize%3FrequestKey%3Df079a57f7157bf084676c5a9c3d0443e">Log in</a>
        <input type="submit" class="deny" value="Deny">

        <input type="hidden" name="requestKey" value="f079a57f7157bf084676c5a9c3d0443e">

        <!-- Need to pull this value -->
        <input type="hidden" name="signature" value="1500374930141/76d6e6bf4e95732eece754cc00315a242db0ffcf2758052c1fd64f2e6024611b">

    </form>
</div>
"""

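# The sample above is already an HTML string, so parse it directly.
# requests.get() expects a URL; on the live site, fetch the page first
# and pass the response's .text to BeautifulSoup instead.
soup = bs(html, "lxml")
bsIn = soup.find('input', attrs={'name': 'signature'})

print(bsIn)  # on the live page this returned <input name="signature" type="hidden" value="1500387161323/9a240ffc8dfff875bc272f0defba27e58f4ffd8e7a29d00edc3528776bca3039"/>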

1 answer:

Answer 0 (score: 0)

In Beautiful Soup you can read an HTML/XML attribute by indexing the tag with the attribute name, for example:

print(bsIn['value'])

The expression bsIn['value'] evaluates to the string:

'1500387161323/9a240ffc8dfff875bc272f0defba27e58f4ffd8e7a29d00edc3528776bca3039'

which print() displays without the quotes:

1500387161323/9a240ffc8dfff875bc272f0defba27e58f4ffd8e7a29d00edc3528776bca3039
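
For reference, here is a minimal, self-contained sketch of the same lookup run against a trimmed copy of the sample HTML from the question rather than the live page. It assumes the built-in "html.parser" backend so that lxml is not required, and the variable names (signature, missing, same_tag) are only illustrative. It also shows two related options that are not part of the original answer: Tag.get(), which returns None instead of raising KeyError when an attribute is absent, and select_one() with a CSS attribute selector.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the form from the question (sample data, not a live page).
html = """
<form method="POST" action="/1/token/approve">
    <input type="hidden" name="requestKey" value="f079a57f7157bf084676c5a9c3d0443e">
    <input type="hidden" name="signature" value="1500374930141/76d6e6bf4e95732eece754cc00315a242db0ffcf2758052c1fd64f2e6024611b">
</form>
"""

# Assumption: the standard-library parser is good enough for this snippet.
soup = BeautifulSoup(html, "html.parser")

# "name" is also find()'s own first parameter (the tag name),
# so the HTML attribute called name has to be passed through attrs.
tag = soup.find("input", attrs={"name": "signature"})

# Dictionary-style indexing returns the attribute value as a str
# and raises KeyError if the attribute does not exist.
signature = tag["value"]

# .get() is the forgiving variant: it returns None for a missing attribute.
missing = tag.get("placeholder")

# The same element can be located with a CSS attribute selector.
same_tag = soup.select_one('input[name="signature"]')

print(signature)
```

If find() matches nothing it returns None, so on a real page it is worth checking the tag before indexing it.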