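Using a simple request, I am trying to get some information stored in the "alt" attribute from this HTML page. The problem is that in each instance the information is split across several lines that each start with "img", and when I try to access it I can only read the first "img" instance, not the rest. I am not sure how to handle this. Here is the HTML: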
<div class="archetype-tile-description-wrapper">
<div class="archetype-tile-description">
<h2>
<span class="deck-price-online">
<a href="/archetype/standard-golgari-midrange-60634#online">Golgari Midrange</a>
</span>
<span class="deck-price-paper">
<a href="/archetype/standard-golgari-midrange-60634#paper">Golgari Midrange</a>
</span>
</h2>
<div class="manacost-container">
<span class="manacost">
<img alt="b" class="common-manaCost-manaSymbol sprite-mana_symbols_b" src="//assets1.mtggoldfish.com/assets/s-d69cbc552cfe8de4931deb191dd349a881ff4448ed3251571e0bacd0257519b1.gif" />
<img alt="g" class="common-manaCost-manaSymbol sprite-mana_symbols_g" src="//assets1.mtggoldfish.com/assets/s-d69cbc552cfe8de4931deb191dd349a881ff4448ed3251571e0bacd0257519b1.gif" />
</span>
</div>
<ul>
<li>Jadelight Ranger</li>
<li>Merfolk Branchwalker</li>
<li>Vraska's Contempt</li>
</ul>
</div>
</div>
That said, what I want to get out of this is "b" and "g", stored together in a variable.
Answer 0 (score: 0)
You can probably grab the <img> elements with the class "common-manaCost-manaSymbol" like this:
imgs = soup.find_all("img",{"class":"common-manaCost-manaSymbol"})
Then you can iterate over each <img> and read its alt attribute:
alts = []
for i in imgs:
    alts.append(i['alt'])
Or with a list comprehension:
alts = [i['alt'] for i in imgs]
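For completeness, here is a minimal self-contained sketch of the same idea. It assumes Beautiful Soup 4 is installed and uses a trimmed copy of the HTML from the question with a shortened placeholder src, so no network request is needed. It also shows a likely reason only the first symbol was being read: find() returns a single element, while find_all() returns every match. The names first_alt and mana are illustrative only.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question; the src value is a
# shortened placeholder, not the real asset URL.
html = """
<div class="manacost-container">
<span class="manacost">
<img alt="b" class="common-manaCost-manaSymbol sprite-mana_symbols_b" src="s.gif" />
<img alt="g" class="common-manaCost-manaSymbol sprite-mana_symbols_g" src="s.gif" />
</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() stops at the first matching tag, which is probably why only
# one symbol showed up in the original attempt.
first_alt = soup.find("img")["alt"]

# find_all() returns every matching tag; class_ matches any one of the
# space-separated classes on the element.
imgs = soup.find_all("img", class_="common-manaCost-manaSymbol")
alts = [img["alt"] for img in imgs]

# Join the symbols if a single string is more convenient than a list.
mana = "".join(alts)

print(first_alt, alts, mana)
```

If some <img> tags might lack an alt attribute, img.get("alt") returns None instead of raising KeyError.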