Hello, I am using the Beautiful Soup library to parse the content of an HTML page.
I use the following script to get the part of the page I want:
review_list = soup.find(class_="review_list_score_breakdown_right")
<span class=" review_list_score_breakdown_right">
<ul class="review_score_breakdown_list list_tighten clearfix" data-et-view="bLTQHcXJVNRCSPOMcAQJO:1 bLTQHcXJVNRCSPOMcAQJO:3 " id="review_list_score_breakdown">
<li class="clearfix one_col" data-question="hotel_clean">
<p class="review_score_name">
Cleanliness
</p>
<div class="score_bar">
<div class="score_bar_value" data-score="100" style="width: 100%;">
</div>
</div>
<p class="review_score_value">
10
</p>
</li>
<li class="clearfix one_col" data-question="hotel_comfort">
<p class="review_score_name">
Comfort
</p>
<div class="score_bar">
<div class="score_bar_value" data-score="100" style="width: 100%;">
</div>
</div>
<p class="review_score_value">
10
</p>
</li>
<li class="clearfix one_col" data-question="hotel_services">
<p class="review_score_name">
Facilities
</p>
<div class="score_bar">
<div class="score_bar_value" data-score="100" style="width: 100%;">
</div>
</div>
<p class="review_score_value">
10
</p>
</li>
<li class="clearfix one_col" data-question="hotel_staff">
<p class="review_score_name">
Staff
</p>
<div class="score_bar">
<div class="score_bar_value" data-score="100" style="width: 100%;">
</div>
</div>
<p class="review_score_value">
10
</p>
</li>
<li class="clearfix one_col" data-question="hotel_value">
<p class="review_score_name">
Value for money
</p>
<div class="score_bar">
<div class="score_bar_value" data-score="100" style="width: 100%;">
</div>
</div>
<p class="review_score_value">
10
</p>
</li>
<li class="clearfix one_col" data-question="hotel_wifi">
<p class="review_score_name">
Free WiFi
</p>
<div class="score_bar">
<div class="score_bar_value" data-score="100" style="width: 100%;">
</div>
</div>
<p class="review_score_value">
10
</p>
</li>
<li class="clearfix one_col" data-question="hotel_location">
<p class="review_score_name">
Location
</p>
<div class="score_bar">
<div class="score_bar_value" data-score="100" style="width: 100%;">
</div>
</div>
<p class="review_score_value">
10
</p>
</li>
</ul>
</span>
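I need to extract a score based on the data-question attribute. For example, if I want the hotel's comfort score, I need to access data-question="hotel_confort".
I have tried using the find() function, but it does not work.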
Answer 0 (score: 0)
I think what you need is a find query that uses attrs.
Your question is related to "Extracting an attribute value with beautifulsoup".
I will adapt it a little to your case.
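review = soup.find(class_="review_list_score_breakdown_right")
# The attribute value uses an underscore (hotel_comfort), not a hyphen.
item = review.find(attrs={"data-question": "hotel_comfort"})
# The <li> has no "value" attribute; the score is the text of its nested <p class="review_score_value">.
output = item.find(class_="review_score_value").get_text(strip=True)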
It has been a while since I last used bs4, so please debug the code.
Edit: here is some working code based on your sample string:
review = soup.find('span', {'class': "review_list_score_breakdown_right"})
items = review.find_all(attrs={"data-question": "hotel_comfort"})
print(items)  # prints the matching HTML elements, which you can drill into further
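To illustrate what "drilling into" the result could look like, here is a minimal sketch. It assumes a trimmed copy of the HTML from the question (only the comfort item is kept) and uses the standard library parser "html.parser"; the names matches, item, score_text and bar_score are made up for this example. Note that find_all returns a list-like result, so one element has to be picked before reading its children.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the question's HTML (assumption: only one <li> kept for brevity).
html = """
<span class="review_list_score_breakdown_right">
<ul id="review_list_score_breakdown">
<li class="clearfix one_col" data-question="hotel_comfort">
<p class="review_score_name">
Comfort
</p>
<div class="score_bar">
<div class="score_bar_value" data-score="100" style="width: 100%;">
</div>
</div>
<p class="review_score_value">
10
</p>
</li>
</ul>
</span>
"""

soup = BeautifulSoup(html, "html.parser")
review = soup.find("span", {"class": "review_list_score_breakdown_right"})

# find_all returns a list-like ResultSet, so pick the first match explicitly.
matches = review.find_all(attrs={"data-question": "hotel_comfort"})
item = matches[0]

# The visible score is the text of the nested <p class="review_score_value">.
score_text = item.find("p", class_="review_score_value").get_text(strip=True)

# The bar's raw value is stored as an attribute and is read like a dict key.
bar_score = item.find("div", class_="score_bar_value")["data-score"]

print(score_text, bar_score)
```

Both values come back as strings; convert them with int() or float() if you need numbers.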
Answer 1 (score: 0)
There is no hotel_confort in your HTML; the attribute value is spelled hotel_comfort.
review = soup.find(class_="review_list_score_breakdown_right")
hotel = review.find(attrs={"data-question" : "hotel_comfort"})
This code returns:
<li class="clearfix one_col" data-question="hotel_comfort"> ..... </li>
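If several categories are needed, one possible approach is to collect every score into a dictionary keyed by the data-question value, which also makes a misspelled key easy to spot. This is only a sketch: it assumes a shortened copy of the HTML from the question (two items instead of seven), assumes every score parses as a number, and the name scores is made up for this example.

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's HTML (assumption: two of the seven items).
html = """
<span class="review_list_score_breakdown_right">
<ul id="review_list_score_breakdown">
<li class="clearfix one_col" data-question="hotel_clean">
<p class="review_score_name">Cleanliness</p>
<p class="review_score_value">
10
</p>
</li>
<li class="clearfix one_col" data-question="hotel_comfort">
<p class="review_score_name">Comfort</p>
<p class="review_score_value">
10
</p>
</li>
</ul>
</span>
"""

soup = BeautifulSoup(html, "html.parser")
review = soup.find(class_="review_list_score_breakdown_right")

scores = {}
# Passing True as the attribute value matches any <li> that has data-question at all.
for li in review.find_all("li", attrs={"data-question": True}):
    value = li.find(class_="review_score_value")
    if value is not None:
        # Assumption: the score text is numeric, as in the sample.
        scores[li["data-question"]] = float(value.get_text(strip=True))

print(scores["hotel_comfort"])
# A misspelled key is simply absent, so .get() returns None instead of raising.
print(scores.get("hotel_confort"))
```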