I get an error when I try to extract coupon codes with BeautifulSoup.
This is part of the page:
<ul class="nc-nav__promo-modal--global-links"><div class="nc-nav__promo-modal--global-divider"></div>
<li><a href="/pk/c/womens_sale_events/35OffWearNowStyles?intcmp=seealloffersnav_1_spanstylefontweight500top035offourfavoritewearnowstylesonlineonlyusecodebstylefontweight700top0hisummerbspan"><div><span style="font-weight: 500; top: 0">35% off our favorite wear-now styles.* Online only. Use code <b style="font-weight: 700; top: 0">HISUMMER.</b></span></div></a><button type="button" class="nc-nav__promo-modal--global-details-button" aria-describedby="dialogDetailsBtn-0">Details</button></li>
<li><a href="/pk/c/womens_sale_events/35OffWearNowStyles?intcmp=seealloffersnav_1_spanstylefontweight500top035offourfavoritewearnowstylesonlineonlyusecodebstylefontweight700top0hisummerbspan"><div><span style="font-weight: 500; top: 0">35% off our favorite wear-now styles.* Online only. Use code <b style="font-weight: 700; top: 0">MAY20.</b></span></div></a><button type="button" class="nc-nav__promo-modal--global-details-button" aria-describedby="dialogDetailsBtn-0">Details</button></li>
</ul>
This is my code, which raises the error:
def parse(self, response):
    self.mongo.GetAllDocuments()
    soup = BeautifulSoup(response.text, 'html.parser')
    url, off, coupon, itemtype = "", "", "", ""
    containersC = soup.select(".nc-nav__promo-modal--global-links > li")
    for itemC in containersC:
        coupon = itemC.a.div.span.b.text
Answer 0 (score: 1)
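Your code assumes that the HTML structure is identical for every item. If the b tag (or any other element in the chain) is missing, you will get that error. One approach is to test for the b tag before trying to print it, for example: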
from bs4 import BeautifulSoup
html = """ <ul class="nc-nav__promo-modal--global-links"><div class="nc-nav__promo-modal--global-divider"></div>
<li><a href="/pk/c/womens_sale_events/35OffWearNowStyles?intcmp=seealloffersnav_1_spanstylefontweight500top035offourfavoritewearnowstylesonlineonlyusecodebstylefontweight700top0hisummerbspan"><div><span style="font-weight: 500; top: 0">35% off our favorite wear-now styles.* Online only. Use code <b style="font-weight: 700; top: 0">HISUMMER.</b></span></div></a><button type="button" class="nc-nav__promo-modal--global-details-button" aria-describedby="dialogDetailsBtn-0">Details</button></li>
<li><a href="/pk/c/womens_sale_events/35OffWearNowStyles?intcmp=seealloffersnav_1_spanstylefontweight500top035offourfavoritewearnowstylesonlineonlyusecodebstylefontweight700top0hisummerbspan"><div><span style="font-weight: 500; top: 0">35% off our favorite wear-now styles.* Online only. Use code <b style="font-weight: 700; top: 0">MAY20.</b></span></div></a><button type="button" class="nc-nav__promo-modal--global-details-button" aria-describedby="dialogDetailsBtn-0">Details</button></li>
</ul>"""
soup = BeautifulSoup(html, 'html.parser')
for li_tag in soup.select(".nc-nav__promo-modal--global-links > li"):
    b_tag = li_tag.find('b')
    # find() returns None when the item has no <b> tag, so check before using it
    if b_tag:
        print(b_tag.text)
For your HTML, this gives:
HISUMMER.
MAY20.
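To see the guard actually matter, here is a minimal sketch that runs the same check against a shortened, made-up version of the list in which the middle item has no b tag. The function name extract_coupons, the sample markup, and the decision to strip the trailing period from each code are assumptions for illustration, not part of the original page or spider.

```python
from bs4 import BeautifulSoup


def extract_coupons(html):
    """Return the coupon codes found in each <li>, skipping items without a <b> tag."""
    soup = BeautifulSoup(html, "html.parser")
    coupons = []
    for li_tag in soup.select(".nc-nav__promo-modal--global-links > li"):
        b_tag = li_tag.find("b")
        if b_tag is None:
            # With chained access (li_tag.a.div.span.b.text) this item would
            # raise AttributeError, because li_tag.a.div.span.b is None here.
            continue
        # Assumption: the trailing "." is punctuation, not part of the code.
        coupons.append(b_tag.get_text(strip=True).rstrip("."))
    return coupons


# Hypothetical, shortened sample: the second item has no <b> tag.
sample = """<ul class="nc-nav__promo-modal--global-links">
<li><a href="#"><div><span>Use code <b>HISUMMER.</b></span></div></a></li>
<li><a href="#"><div><span>Free shipping, no code needed.</span></div></a></li>
<li><a href="#"><div><span>Use code <b>MAY20.</b></span></div></a></li>
</ul>"""

print(extract_coupons(sample))
```

In a spider like the one in the question, the same check would go inside the for loop of parse, so that an offer without a code is skipped instead of stopping the whole run.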