Here is part of my code. How can I select the '3 sold' text inside the a tag at the bottom, using Beautiful Soup?
<body>
<div>
<div class="u-flL qtyCntVal vi-bboxrev-posabs vi-bboxrev-dsplinline">
<div class="errorIcon" id="w1-11-_errIcon" style="display: none;"><!--
err_qty_icon -->
<img alt="Error icon" class="errorimg"
src="http://ir.ebaystatic.com/pictures/aw/pics/s.gif"></div><input
class="qtyInput" id="qtyTextBox" name="quantity" size="4" type="text"
value="1"> <span class="qtyTxt vi-bboxrev-dsplblk feedbackON" style=""><span
id="qtySubTxt"><span class="">9 available</span></span> <span class="vi-qty-
vert-algn vi-qty-slash">/</span> <span class="vi-qtyS vi-bboxrev-dsplblk vi-
qty-vert-algn vi-qty-pur-lnk"><a
href="http://offer.ebay.co.uk/ws/eBayISAPI.dll?
ViewBidsLogin&item=322646576920&rt=nc&_trksid=p2047675.l2564">3
sold</a></span></span>
</div>
Answer 0 (score: 2)
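There are many ways to reach the desired element. Strictly speaking, we would need to know the context you are working in: the complete HTML of the page, and how unique the element's attributes and surrounding structure are.
That said, here is one way to get the desired text, using a CSS selector based on the classes of the span elements that contain the a element: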
soup.select_one(".qtyTxt .vi-qtyS > a").get_text()
If the link itself always points to ebay, you can additionally check for that in the selector:
soup.select_one(".qtyTxt .vi-qtyS > a[href*=ebay]").get_text()
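For reference, here is a minimal, self-contained sketch of the selector approach. It assumes the `bs4` package is installed and uses a trimmed copy of the markup from the question; the `html` string and the `link` and `sold_text` names are illustrative, not part of the original code. Because the link text spans a line break in the source, the whitespace is collapsed before printing.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the question's markup (assumption: the real page
# keeps the same span/a nesting and class names).
html = """
<div class="u-flL qtyCntVal">
  <span class="qtyTxt vi-bboxrev-dsplblk feedbackON">
    <span id="qtySubTxt"><span class="">9 available</span></span>
    <span class="vi-qty-vert-algn vi-qty-slash">/</span>
    <span class="vi-qtyS vi-bboxrev-dsplblk vi-qty-vert-algn vi-qty-pur-lnk"><a
      href="http://offer.ebay.co.uk/ws/eBayISAPI.dll?ViewBidsLogin&item=322646576920">3
sold</a></span>
  </span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
link = soup.select_one(".qtyTxt .vi-qtyS > a[href*=ebay]")

# select_one returns None when nothing matches, so guard before reading the text.
# split() + join() collapses the newline inside the link text to a single space.
sold_text = " ".join(link.get_text().split()) if link is not None else None
print(sold_text)
```

The `None` check matters on real pages, where the "sold" link may be absent for items with no sales.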
Answer 1 (score: 0)
Get the contents of the 'a' tag, where text is your full HTML text:
>>> soup = BeautifulSoup(text, 'html.parser')
>>> span = soup.findAll('a')[0].next
>>> span
u'3 \nsold'
>>> soup.findAll('a')[0]
<a href="http://offer.ebay.co.uk/ws/eBayISAPI.dll?\nViewBidsLogin&item=322646576920&rt=nc&_trksid=p2047675.l2564">3 \nsold</a>
>>>
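The same idea can be sketched with the current method names, assuming the `bs4` package is installed. The `text` value below is a shortened stand-in for the page HTML, and `label` and `sold_count` are illustrative names. `find_all` is the current spelling of `findAll`, and `get_text()` reads the tag's text directly instead of stepping to the next node.

```python
import re
from bs4 import BeautifulSoup

# Stand-in for the `text` variable used above (assumption: it holds the page HTML).
text = (
    '<span class="vi-qtyS"><a href="http://offer.ebay.co.uk/ws/eBayISAPI.dll'
    '?ViewBidsLogin">3 \nsold</a></span>'
)

soup = BeautifulSoup(text, "html.parser")
first_link = soup.find_all("a")[0]  # first a tag in the document

# Collapse the embedded newline so the text reads as a single phrase.
label = " ".join(first_link.get_text().split())

# Pull out the leading number, if the label has the expected "<n> sold" shape.
match = re.match(r"(\d+)\s+sold", label)
sold_count = int(match.group(1)) if match else None
print(label, sold_count)
```

Indexing with `[0]` only works when the wanted link is the first a tag in the parsed text; on a full page, a class-based selector is usually more robust.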