How can I extract the following items from the HTML code pasted below?
<div class="col-1">
<!-- previous close -->
<div class="section-quote-detail group">
<span class="detail-label">Previous Close</span>
<span class="detail-value">7.50</span>
</div>
<!-- open -->
<div class="section-quote-detail group">
<span class="detail-label">Open</span>
<span class="detail-value">7.50</span>
</div>
<!-- Volume (daily) -->
<div class="section-quote-detail group">
<span class="detail-label">Volume</span>
<span class="detail-value">11,393,304</span>
</div>
<!-- 3Month (90 day avg volume) -->
<div class="section-quote-detail group">
<span class="detail-label">3m Avg Volume</span>
<span class="detail-value">13,978,777</span>
</div>
<!-- Today's High -->
<div class="section-quote-detail group">
<span class="detail-label">Today’s High</span>
<span class="detail-value">7.80</span>
</div>
<!-- Today's Low -->
<div class="section-quote-detail group">
<span class="detail-label">Today’s Low</span>
<span class="detail-value">7.15</span>
</div>
</div>
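Answer 0: (Score: 1)
What I understand from your question is this: given some text (for example, Open), you want to find the number associated with it (for example, 7.50).
My solution is to first find the span tag that contains the text, and then find its next sibling tag.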
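import re
from bs4 import BeautifulSoup

# Assumes `html` holds the markup from the question.
soup = BeautifulSoup(html, "html.parser")

def getNumberGivenText(text):
    # Escape the label so it is matched literally, not as a regex
    pattern = re.compile(re.escape(text))
    # Find the first span tag whose text contains the label
    span_tag = soup.find_all("span", string=pattern)[0]
    # Find its next sibling tag
    num_tag = span_tag.find_next_sibling()
    # Get the value as a string
    number = num_tag.string
    return number

print(getNumberGivenText("Open"))  # 7.50
print(getNumberGivenText("Today’s Low"))  # 7.15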
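The sibling lookup described in this answer can also be written as a self-contained script. This is only a minimal sketch, assuming the `bs4` package is installed; the helper name `get_value_for_label` and the shortened HTML sample are illustrative and not part of the original answer. It compares the label text exactly instead of using a regex, so "Volume" cannot accidentally match "3m Avg Volume", and it returns None when the label is absent.

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's markup (three of the six rows).
HTML = """
<div class="col-1">
  <div class="section-quote-detail group">
    <span class="detail-label">Open</span>
    <span class="detail-value">7.50</span>
  </div>
  <div class="section-quote-detail group">
    <span class="detail-label">Volume</span>
    <span class="detail-value">11,393,304</span>
  </div>
  <div class="section-quote-detail group">
    <span class="detail-label">3m Avg Volume</span>
    <span class="detail-value">13,978,777</span>
  </div>
</div>
"""


def get_value_for_label(soup, label):
    """Return the text of the span following the span whose text equals `label`."""
    for span in soup.find_all("span", class_="detail-label"):
        if span.get_text(strip=True) == label:
            # Passing a tag name skips the whitespace between the two spans.
            value_tag = span.find_next_sibling("span")
            return value_tag.get_text(strip=True) if value_tag else None
    return None  # label not present in the document


soup = BeautifulSoup(HTML, "html.parser")
open_value = get_value_for_label(soup, "Open")
volume_value = get_value_for_label(soup, "Volume")
missing_value = get_value_for_label(soup, "Market Cap")
print(open_value, volume_value, missing_value)
```

Returning None for an unknown label is a design choice for this sketch; the original function would raise an IndexError in that case.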
Answer 1: (Score: 0)
Try this. It basically iterates over the tags and keeps the ones whose label is in get_list.
html = '''
<div class="col-1">
<!-- previous close -->
<div class="section-quote-detail group">
<span class="detail-label">Previous Close</span>
<span class="detail-value">7.50</span>
</div>
<!-- open -->
<div class="section-quote-detail group">
<span class="detail-label">Open</span>
<span class="detail-value">7.50</span>
</div>
<!-- Volume (daily) -->
<div class="section-quote-detail group">
<span class="detail-label">Volume</span>
<span class="detail-value">11,393,304</span>
</div>
<!-- 3Month (90 day avg volume) -->
<div class="section-quote-detail group">
<span class="detail-label">3m Avg Volume</span>
<span class="detail-value">13,978,777</span>
</div>
<!-- Today's High -->
<div class="section-quote-detail group">
<span class="detail-label">Today’s High</span>
<span class="detail-value">7.80</span>
</div>
<!-- Today's Low -->
<div class="section-quote-detail group">
<span class="detail-label">Today’s Low</span>
<span class="detail-value">7.15</span>
</div>
</div>'''
import bs4
import pandas as pd

soup = bs4.BeautifulSoup(html, 'html.parser')
data = soup.find_all('div', {'class': 'section-quote-detail group'})
get_list = ['Open', 'Volume', "Today’s High", "Today’s Low"]
rows = []
for element in data:
    label = element.select('span.detail-label')[0].text
    # Keep only the labels requested in get_list
    if label in get_list:
        value = element.select('span.detail-value')[0].text
        rows.append([label, value])
# DataFrame.append was removed in pandas 2.0, so build the frame once from the collected rows
results = pd.DataFrame(rows, columns=['label', 'value'])
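Note that the values scraped this way are still strings such as "11,393,304". If numbers are needed, something like the following might work. It is a rough sketch that assumes every value uses commas only as thousands separators, as in the sample markup; `to_number` and the `scraped` dictionary are hypothetical names introduced here for illustration.

```python
def to_number(text):
    """Convert a scraped value such as '11,393,304' or '7.50' to int or float."""
    cleaned = text.strip().replace(",", "")
    # Values with a decimal point become floats; everything else becomes an int.
    return float(cleaned) if "." in cleaned else int(cleaned)


# Label/value pairs as they appear in the question's HTML.
scraped = {
    "Open": "7.50",
    "Volume": "11,393,304",
    "Today’s High": "7.80",
    "Today’s Low": "7.15",
}
numbers = {label: to_number(value) for label, value in scraped.items()}
print(numbers)
```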