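I want to extract the rating percentage shown in the review-rating popup, which is stored in the link's title attribute. Here is the HTML: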
<a class="a-link-normal" href="http://www.amazon.in/product-reviews/B01FM7GGFI/ref=cm_cr_dp_hist_one/261-4285111-5015802?ie=UTF8&filterByStar=one_star&reviewerType=all_reviews&showViewpoints=0" title="11% of reviews have 1 stars">1 star</a>
My BeautifulSoup Python script:
from bs4 import BeautifulSoup
import requests
url = "http://www.amazon.in/Samsung-G-550FY-On5-Pro-Gold/dp/B01FM7GGFI/ref=lp_4363159031_1_1/261-4285111-5015802?s=electronics&ie=UTF8&qid=1503582445&sr=1-1"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36'}
r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.content, "lxml")
for link in soup.find_all("div", attrs={"class": "a-fixed-left-grid-col a-col-left"}):
    for link1 in link.find_all("a", attrs={"class": "a-link-normal"}):
        print(link1)
Answer 0 (score: 0)
from bs4 import BeautifulSoup

html = '<a class="a-link-normal" href="http://www.amazon.in/product-reviews/B01FM7GGFI/ref=cm_cr_dp_hist_one/261-4285111-5015802?ie=UTF8&filterByStar=one_star&reviewerType=all_reviews&showViewpoints=0" title="11% of reviews have 1 stars">1 star</a>'
soup = BeautifulSoup(html, 'lxml')
a_tags = soup.find_all('a', class_='a-link-normal')
for a in a_tags:
    # Not every matching link carries a title attribute, so check before reading it.
    if 'title' in a.attrs:
        print(a['title'])
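The title text still has to be reduced to a number if only the percentage is wanted. A minimal sketch of one way to do that with a regular expression follows. The helper name `parse_rating_title` and the pattern are my own assumptions, based only on the single title string shown in the question ("11% of reviews have 1 stars"); other pages may word the title differently.

```python
import re

# Hypothetical helper, not part of the original answer. The pattern assumes
# titles shaped like "11% of reviews have 1 stars".
TITLE_PATTERN = re.compile(r"(\d+)%\s+of\s+reviews\s+have\s+(\d+)\s+stars?")


def parse_rating_title(title):
    """Return (percent, stars) as ints, or None if the title does not match."""
    match = TITLE_PATTERN.search(title)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The title attribute value taken from the HTML in the question.
title = "11% of reviews have 1 stars"
print(parse_rating_title(title))
```

Inside the loop above, `parse_rating_title(a['title'])` could replace the plain `print(a['title'])`. Returning None for titles that do not match lets the caller skip unrelated links.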