The data looks like this:
Idx  Score  Group
5    0.85   Europe
8    0.77   Australia
12   0.70   S.America
13   0.71   Australia
42   0.82   Europe
45   0.90   Asia
65   0.91   Asia
73   0.72   S.America
77   0.84   Asia
It needs to look like this:
Idx  Score  Group
65   0.91   Asia
77   0.84   Asia
45   0.73   Asia
12   0.87   S.America
73   0.72   S.America
5    0.85   Europe
42   0.82   Europe
8    0.83   Australia
13   0.71   Australia
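See how Asia has the highest score, so it shows me all of Asia's scores first, followed by the group with the second-highest score, and so on? I need to do this in Python. It is very different from sorting by one element and then by another. Please help. Sorry if this question is redundant; I barely know how to ask it, let alone search for it.
I have the data as a dictionary, so dict = {5:[0.85,Europe],8:[0.77,Australia] ......} and I wrote a function that tries to parse the data: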
from operator import itemgetter

def sortResults(dict):
    newDict = {}
    for k, v in dict.items():
        if v[-1] in newDict:
            newDict[v[-1]].append((k, float(v[0]), v[1]))
        else:
            newDict[v[-1]] = [(k, float(v[0]), v[1])]
    for k in newDict.keys():
        for resList in newDict[k]:
            resList = sorted(resList, key=itemgetter(1), reverse=True)
    return newDict
It says that float is unsubscriptable... I'm just confused.
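The error most likely comes from the inner loop of sortResults: each resList is a single (key, score, group) tuple, so sorted() applies itemgetter(1) to the individual fields, and a float cannot be indexed. The sketch below reproduces this under the assumption that the dictionary keys are strings (for example, read from a file); the sample rows are made up for illustration.

```python
from operator import itemgetter

# One row as built by sortResults; the key is assumed to be a string here.
row = ("12", 0.85, "Europe")

try:
    # Sorting a single row makes itemgetter(1) index each field in turn:
    # "12"[1] works, but 0.85[1] raises TypeError.
    sorted(row, key=itemgetter(1), reverse=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Sorting the list of rows uses the score of each row as the key instead.
rows = [("12", 0.70, "S.America"), ("73", 0.72, "S.America")]
rows_sorted = sorted(rows, key=itemgetter(1), reverse=True)
print(rows_sorted)
```

Note also that assigning to resList inside the loop only rebinds the loop variable, so the lists stored in newDict would stay unsorted even without the error.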
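Answer 0 (score: 2)
I would simply fill a dictionary with the maximum score of each group, then sort by the group maximum and then by the individual score. Like this: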
data = [
    (5 , 0.85, "Europe"),
    (8 , 0.77, "Australia"),
    (12, 0.70, "S.America"),
    (13, 0.71, "Australia"),
    (42, 0.82, "Europe"),
    (45, 0.90, "Asia"),
    (65, 0.91, "Asia"),
    (73, 0.72, "S.America"),
    (77, 0.84, "Asia")
]

# Record the highest score seen in each group.
maximums_by_group = dict()
for indx, score, group in data:
    if group not in maximums_by_group or maximums_by_group[group] < score:
        maximums_by_group[group] = score

# Sort by the group's maximum first, then by the individual score, both descending.
data.sort(key=lambda e: (maximums_by_group[e[2]], e[1]), reverse=True)

for indx, score, group in data:
    print(indx, score, group)
For the input data above, this prints:
65 0.91 Asia
45 0.9 Asia
77 0.84 Asia
5 0.85 Europe
42 0.82 Europe
8 0.77 Australia
13 0.71 Australia
73 0.72 S.America
12 0.7 S.America
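The same idea can be wrapped in a small function that leaves the input untouched. This is a minimal sketch, assuming Python 3 and rows shaped as (idx, score, group) tuples; the name sort_by_group_max is made up for illustration. It also puts the group name into the sort key, so that two groups that happen to share the same maximum are not interleaved.

```python
def sort_by_group_max(rows):
    """Return rows ordered by each group's best score, then by score (both descending)."""
    best = {}
    for _idx, score, group in rows:
        best[group] = max(score, best.get(group, score))
    # Key: group maximum (descending), group name (keeps tied groups apart),
    # individual score (descending).
    return sorted(rows, key=lambda r: (-best[r[2]], r[2], -r[1]))


rows = [
    (5, 0.85, "Europe"),
    (8, 0.77, "Australia"),
    (12, 0.70, "S.America"),
    (13, 0.71, "Australia"),
    (42, 0.82, "Europe"),
    (45, 0.90, "Asia"),
    (65, 0.91, "Asia"),
    (73, 0.72, "S.America"),
    (77, 0.84, "Asia"),
]

result = sort_by_group_max(rows)
for idx, score, group in result:
    print(idx, score, group)
```

Using sorted() rather than list.sort() returns a new list, which may be convenient when the original order is still needed.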
Answer 1 (score: 0)
I think there is a better way to iterate than what I have here, but this works:
from operator import itemgetter

dataset = [
    { 'idx': 5, 'score': 0.85, 'group': 'Europe' },
    { 'idx': 8, 'score': 0.77, 'group': 'Australia' },
    { 'idx': 12, 'score': 0.70, 'group': 'S.America' },
    { 'idx': 13, 'score': 0.71, 'group': 'Australia' },
    { 'idx': 42, 'score': 0.82, 'group': 'Europe' },
    { 'idx': 45, 'score': 0.90, 'group': 'Asia' },
    { 'idx': 65, 'score': 0.91, 'group': 'Asia' },
    { 'idx': 73, 'score': 0.72, 'group': 'S.America' }
]

# Highest scores first, so each group is first met at its best score.
score_sorted = sorted(dataset, key=itemgetter('score'), reverse=True)
group_score_sorted = []
groups_completed = []
for score in score_sorted:
    group_name = score['group']
    if group_name not in groups_completed:
        groups_completed.append(group_name)
        # Append every row of this group, already in descending score order.
        for group in score_sorted:
            if group['group'] == group_name:
                group_score_sorted.append(group)
# group_score_sorted now contains the sorted list
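One possible way to avoid the nested loop is to walk the score-sorted rows once and bucket them by group. This is only a sketch, and it assumes Python 3.7 or later, where dictionaries keep insertion order: because the rows arrive in descending score order, each group is inserted when its best score is seen, which gives the group order for free. The name buckets is made up for illustration.

```python
from operator import itemgetter

dataset = [
    {'idx': 5, 'score': 0.85, 'group': 'Europe'},
    {'idx': 8, 'score': 0.77, 'group': 'Australia'},
    {'idx': 12, 'score': 0.70, 'group': 'S.America'},
    {'idx': 13, 'score': 0.71, 'group': 'Australia'},
    {'idx': 42, 'score': 0.82, 'group': 'Europe'},
    {'idx': 45, 'score': 0.90, 'group': 'Asia'},
    {'idx': 65, 'score': 0.91, 'group': 'Asia'},
    {'idx': 73, 'score': 0.72, 'group': 'S.America'},
]

# Bucket rows by group while iterating in descending score order.
buckets = {}
for row in sorted(dataset, key=itemgetter('score'), reverse=True):
    buckets.setdefault(row['group'], []).append(row)

# Flatten the buckets; dict order is the order in which groups were first seen.
group_score_sorted = [row for rows in buckets.values() for row in rows]

for row in group_score_sorted:
    print(row['idx'], row['score'], row['group'])
```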
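Answer 2 (score: 0)
I think the simplest approach is to separate the rows by group first and then sort in two steps: first order the groups by their maximum score, then sort the rows within each group.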
data = [[ 5, 0.85, "Europe"],
        [ 8, 0.77, "Australia"],
        [12, 0.70, "S.America"],
        [13, 0.71, "Australia"],
        [42, 0.82, "Europe"],
        [45, 0.90, "Asia"],
        [65, 0.91, "Asia"],
        [73, 0.72, "S.America"],
        [77, 0.84, "Asia"]]

# Collect the rows of each group.
groups = {}
for idx, score, group in data:
    try:
        groups[group].append((idx, score, group))
    except KeyError:
        groups[group] = [(idx, score, group)]

# Visit groups by descending maximum score, then rows by descending score.
for group in sorted(groups, key=lambda g: -max(x[1] for x in groups[g])):
    for idx, score, name in sorted(groups[group], key=lambda row: -row[1]):
        print(idx, score, name)
The final result is
65 0.91 Asia
45 0.9 Asia
77 0.84 Asia
5 0.85 Europe
42 0.82 Europe
8 0.77 Australia
13 0.71 Australia
73 0.72 S.America
12 0.7 S.America
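This differs from what you posted, but I think the expected result in your question contains a typo, because a score of 0.87 for S.America does not exist in the input data.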
Answer 3 (score: 0)
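from itertools import groupby
from operator import itemgetter

def sort_by_max(a_list):
    # Accessors for the three fields of each row.
    index, score, group = map(itemgetter, range(3))
    # groupby only merges adjacent rows, so sort by group name first.
    a_list.sort(key=group)
    # Highest score in each group.
    max_score = dict(
        (each, max(map(score, entries)))
        for each, entries in groupby(a_list, group)
    )
    # Groups with the highest maximum first, then scores descending.
    a_list.sort(key=lambda x: (-max_score[group(x)], -score(x)))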
Use it like this:
the_list = [
    [5, 0.85, 'Europe'],
    [8, 0.77, 'Australia'],
    [12, 0.87, 'S.America'],
    [13, 0.71, 'Australia'],
    [42, 0.82, 'Europe'],
    [45, 0.90, 'Asia'],
    [65, 0.91, 'Asia'],
    [73, 0.72, 'S.America'],
    [77, 0.84, 'Asia']
]
sort_by_max(the_list)
for each in the_list:
    print('{0:2} : {1:<4} : {2}'.format(*each))
which gives:
65 : 0.91 : Asia
45 : 0.9  : Asia
77 : 0.84 : Asia
12 : 0.87 : S.America
73 : 0.72 : S.America
 5 : 0.85 : Europe
42 : 0.82 : Europe
 8 : 0.77 : Australia
13 : 0.71 : Australia
[Edit]
On reflection, I also like defaultdict and max:
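from collections import defaultdict

def sort_by_max(a_list):
    # Highest score per group; scores are assumed non-negative, so 0.0 is a safe default.
    max_score = defaultdict(float)
    for index, score, group in a_list:
        max_score[group] = max(score, max_score[group])
    # Groups with the highest maximum first, then scores descending.
    a_list.sort(key=lambda row: (-max_score[row[2]], -row[1]))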
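Answer 4 (score: 0)
The simplest approach is to dump the data into a list, because Python dictionaries are not sorted. Then use Python's native Timsort algorithm, which is stable: it keeps existing runs (groupings) intact while sorting.
So your code would look like this: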
data = [[ 5, 0.85, "Europe"],
        [ 8, 0.77, "Australia"],
        [12, 0.70, "S.America"],
        [13, 0.71, "Australia"],
        [42, 0.82, "Europe"],
        [45, 0.90, "Asia"],
        [65, 0.91, "Asia"],
        [73, 0.72, "S.America"],
        [77, 0.84, "Asia"]]
data.sort(key=lambda x: x[1], reverse=True)
data.sort(key=lambda x: x[2].upper())
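This will produce the following, with the groups in alphabetical order and the scores descending within each group: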
[65, 0.91, 'Asia']
[45, 0.90, 'Asia']
[77, 0.84, 'Asia']
[8, 0.77, 'Australia']
[13, 0.71, 'Australia']
[5, 0.85, 'Europe']
[42, 0.82, 'Europe']
[73, 0.72, 'S.America']
[12, 0.70, 'S.America']
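The same two-pass, stable-sort technique can also order the groups by their highest score instead of alphabetically. The following is a sketch under that assumption, written for Python 3; the helper dictionary best and the variable ordered are names introduced here for illustration. The second pass relies on sort stability to keep the score order produced by the first pass.

```python
data = [[ 5, 0.85, "Europe"],
        [ 8, 0.77, "Australia"],
        [12, 0.70, "S.America"],
        [13, 0.71, "Australia"],
        [42, 0.82, "Europe"],
        [45, 0.90, "Asia"],
        [65, 0.91, "Asia"],
        [73, 0.72, "S.America"],
        [77, 0.84, "Asia"]]

# Highest score of each group.
best = {}
for _idx, score, group in data:
    best[group] = max(score, best.get(group, score))

# Pass 1: individual scores, highest first.
ordered = sorted(data, key=lambda row: row[1], reverse=True)
# Pass 2: group maximum, highest first. The group name breaks ties between
# groups; the sort is stable, so rows of one group keep their pass-1 order.
ordered = sorted(ordered, key=lambda row: (-best[row[2]], row[2]))

for row in ordered:
    print(row)
```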