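I'm just trying to extract price data using the $ sign. The file contains several prices, and I only want the ones that appear after class="price price-label">, not the ones after class="strike">.
I've pasted my full code. I'm extracting the information from file.txt, and the output I want is the name and the price side by side. I haven't used Beautiful Soup before.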
data-default-alt="Ford Truck"> </h3> </a> </div> <div class="tileInfo"> <div class="swatchesBox--empty"></div> <div class="promo-msg-text"> <span class="calloutMsg-promo-msg-text"></span> </div> <div class="pricecontainer" data-pricetype="Stand Alone"> <p id="price_206019013" class="price price-label "> $1,000.00 </p>
My code
with open("targetbubbles.txt") as f:  # "f" avoids shadowing the built-in str
    st = f.read()
#print st
import re
#brand=re.search('data-default-title=\"(.*?)" ',st)
#cost=re.search('\$(\d+,?\d*\.\d+)</p>',st)
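A note on the commented-out cost pattern: it expects </p> directly after the digits, but the markup has spaces around the amount, and it is not tied to the price-label class. A minimal regex sketch, assuming the class attribute is always written exactly as in the fragment above (the sample string and the strike line are illustrative, not taken from the real file):

```python
import re

# Illustrative sample: one wanted price and one struck-through price.
st = '''<p id="price_206019013" class="price price-label "> $1,000.00 </p>
<p class="regularprice-label"> <span class="strike"> $2.99 </span> </p>'''

# Anchor on the class attribute and allow whitespace around the amount.
pattern = re.compile(r'class="price price-label\s*">\s*\$([\d,]+\.\d{2})\s*</p>')

cost = pattern.findall(st)
print(cost)
```

This only picks up amounts inside the price-label paragraph; it will break if the attribute order or quoting changes, which is the usual weakness of regexes on HTML.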
Answer 0 (score: 1)
BeautifulSoup is a useful module for this kind of mess.
>>> import bs4
>>> s = ''' <p id="price_206019013" class="price price-label "> $2.84 </p> <p class="regularprice-label"> Reg. <span class="screen-reader-only"> price</span> <span class="strike"> $2.99 </span> </p> <div class="eyeBrow sale-msg"> <span '''
>>> soup = bs4.BeautifulSoup(s, 'lxml')
>>> soup.find_all('p', class_='price price-label ')
[<p class="price price-label " id="price_206019013"> $2.84 </p>]
>>> [result] = soup.find_all('p', class_='price price-label ')
>>> result.text.strip(' $')
u'2.84'
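The lookup above matches the exact class string, trailing space included, which may be sensitive to how the installed bs4 version normalizes whitespace. A possibly more robust sketch, assuming a bs4 version with select() support, uses a CSS selector that requires both classes and so skips the strike spans. The sample markup is shortened from the snippets in this thread, and the built-in 'html.parser' is used here only to avoid the lxml dependency:

```python
from decimal import Decimal

import bs4

# Shortened sample: two wanted prices and one struck-through price.
html = '''
<p class="price price-label "> $2.84 </p>
<p class="regularprice-label"> Reg. <span class="strike"> $2.99 </span> </p>
<p class="price price-label "> $1,000.00 </p>
'''

soup = bs4.BeautifulSoup(html, 'html.parser')

# 'p.price.price-label' selects <p> tags that carry both classes,
# regardless of extra whitespace in the attribute value.
prices = [
    Decimal(p.get_text().strip().lstrip('$').replace(',', ''))
    for p in soup.select('p.price.price-label')
]
print(prices)
```

Converting to Decimal after stripping the dollar sign and thousands separator keeps amounts such as $1,000.00 usable as numbers; keep the raw text instead if only display is needed.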