I'm trying to scrape the names and ratings of different travel agencies from http://www.indiacom.com/yellow-pages/travel-agencies-and-services/. Here is my code:
from bs4 import BeautifulSoup
import requests

url = "http://www.indiacom.com/yellow-pages/travel-agencies-and-services/"
r = requests.get(url)
soup = BeautifulSoup(r.content)
links = soup.find_all("a")
# for link in links:
#     if "http" in link.get("href"):
#         print("<a href='%s'>%s</a>" % (link.get("href"), link.text))
L = []
g_data = soup.find_all("div", {"class": "Info_listing"})
for item in g_data:
    L.append(item.contents[3].text)
    # L.append(item.text)
for index in L:
    print(index)
# print(L[2])
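I'm saving the names and ratings in a list. Now I want to sort the list by rating. How can I do that? An agency that has been rated shows its rating, but one that has not been rated shows "Be The First To Rate" instead, so how do I sort by rating in that case?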
Answer 0 (score: 3)
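Iterate over the listings and build a list of tuples, each holding a listing's name and rating. Sort by the rating value with sorted(), treating Be The First To Rate as a rating of 0: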
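from operator import itemgetter

# `soup` is the BeautifulSoup object built in the question's code
listings = []
for item in soup.select("div.Details_listing"):
    name = item.a.text
    rating = item.find('div', id='total_ratings_details').text
    # Unrated listings count as 0; otherwise take the leading number
    rating = 0 if rating.startswith('Be The First To Rate') else float(rating.split(' ')[0])
    listings.append((name, rating))

# Sort ascending by the rating (the second element of each tuple)
print(sorted(listings, key=itemgetter(1)))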
Prints:
[
(u'Jasvinder Tours And Travels', 0),
...
(u'The Royal Tours & Travels', 2.9),
(u'Preeti Travels & Transport', 4.4)
]
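
If you would rather see the best-rated agencies first and keep unrated ones apart from a genuine rating of 0, here is a minimal sketch of one way to do it. It runs without any network access: the sample rating strings (such as "2.9 Ratings") are made up for illustration and may not match the site's actual text, and the names parse_rating and sort_by_rating are hypothetical helpers, not part of BeautifulSoup.

```python
UNRATED_PREFIX = "Be The First To Rate"


def parse_rating(text):
    """Return the leading number in text as a float, or None if unrated."""
    text = text.strip()
    if not text or text.startswith(UNRATED_PREFIX):
        return None
    try:
        return float(text.split()[0])
    except ValueError:
        # Text that does not start with a number is treated as unrated
        return None


def sort_by_rating(pairs):
    """Sort (name, rating_text) pairs best-rated first, unrated last."""
    parsed = [(name, parse_rating(text)) for name, text in pairs]
    # Key: rated entries (False) come before unrated ones (True),
    # then higher ratings come first because the rating is negated.
    return sorted(parsed, key=lambda p: (p[1] is None, -(p[1] or 0.0)))


# Hypothetical rating strings, for illustration only
sample = [
    ("Jasvinder Tours And Travels", "Be The First To Rate"),
    ("The Royal Tours & Travels", "2.9 Ratings"),
    ("Preeti Travels & Transport", "4.4 Ratings"),
]

result = sort_by_rating(sample)
for name, rating in result:
    print(name, rating)
```

With the scraped page you would pass in the (name, rating text) pairs collected in the loop above instead of the sample list. Using None for unrated listings, rather than 0, means an agency that really is rated 0 would still sort ahead of one nobody has rated.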