How do I get a list of items from a large chunk of HTML in Python?

Time: 2017-06-11 21:12:45

Tags: python beautifulsoup

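I'm trying to get a list of syllables from the data-syllable attribute of the span tags on dictionary.com (shown below). When I run my code, it gives me a list with nothing filled in ([None, None, None, None, None, None, None, None, None, None]) and I don't know how to fix it. Please help.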

These are the tags I want to get the data-syllable values from:
[<span class="dbox-bold">prob</span>,
 <span class="dbox-bold" data-syllable="im·prob·a·bly, ">improbably, </span>,
 <span class="dbox-bold" data-syllable="im·prob·a·ble·ness, ">improbableness, </span>,
 <span class="dbox-bold" data-syllable="su·per·im·prob·a·ble, ">superimprobable, </span>,
 <span class="dbox-bold" data-syllable="su·per·im·prob·a·ble·ness, ">superimprobableness, </span>,
 <span class="dbox-bold" data-syllable="su·per·im·prob·a·bly, ">superimprobably, </span>,
 <span class="dbox-bold">improbable</span>,
 <span class="dbox-bold" data-syllable="imˌprobaˈbility, ">improbability, </span>,
 <span class="dbox-bold" data-syllable="imˈprobableness, ">improbableness, </span>,
 <span class="dbox-bold" data-syllable="imˈprobably, ">improbably, </span>]


Here is my code:

    a = [item for item in soup.find_all('span','dbox-bold')]
    find = [item.find(name='data-syllable') for item in a]
    return find

print(count_syllables('improbable'))

1 answer:

Answer 0: (score: 0)

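item.find(name='data-syllable') searches inside each span for a child tag whose name is data-syllable. No such tag exists, so every call returns None. data-syllable is an attribute of the span itself, so you need to look it up in each item's attrs dictionary.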

find = [item.attrs.get('data-syllable', None) for item in a]

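You can now see the values in find. Spans that have no data-syllable attribute (such as the first one, prob) still produce None.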
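If the None entries are unwanted, one option is to select only the spans that actually carry the attribute. The following is a minimal sketch, not the asker's original function: the html string is a shortened copy of the markup from the question (nothing is fetched from dictionary.com), and the names spans, syllables and counts are chosen here for illustration. Stripping the trailing ", " and counting pieces separated by "·" are assumptions about what count_syllables is meant to do.

```python
from bs4 import BeautifulSoup

# Sample markup shortened from the question; assumed, not fetched live.
html = """
<span class="dbox-bold">prob</span>
<span class="dbox-bold" data-syllable="im·prob·a·bly, ">improbably, </span>
<span class="dbox-bold" data-syllable="su·per·im·prob·a·ble, ">superimprobable, </span>
<span class="dbox-bold">improbable</span>
"""

soup = BeautifulSoup(html, "html.parser")

# A value of True matches any tag that has the attribute, whatever its value,
# so spans without data-syllable are skipped instead of yielding None.
spans = soup.find_all(
    "span", attrs={"class": "dbox-bold", "data-syllable": True}
)

# Remove the trailing comma and space left inside the attribute value.
syllables = [span["data-syllable"].strip(", ") for span in spans]

# Assumption: one syllable per piece between "·" separators.
counts = {s: len(s.split("·")) for s in syllables}

print(syllables)
print(counts)
```

With the sample markup above, syllables holds two entries, one for improbably and one for superimprobable, and the two spans without the attribute are left out.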