I have 3 lists, as shown below:
List 1   List 2   List 3
A        A        D
D        D        M
GE       M        A
G        G        S
M        S        G
S        GE       GE
Now I need to rank the elements by averaging each element's rank across the lists, as described below:
Elements  Rank-List1  Rank-List2  Rank-List3  Average  Ranking
A         1           1           3           1.67     1
D         2           2           1           1.67     2
GE        3           6           6           5        5
G         4           4           5           4.33     4
M         5           3           2           3.33     3
S         6           5           4           5        6
If two averages are equal, the element that appears first gets the higher rank.
So the final output list will be:
Output list
A
D
M
G
GE
S
The average is computed as Average = Sum of Rank (over all lists) / 3, for example:
( 1+1+3) / 3 = 1.67 # for A
Can this be done in Python?
Answer 0 (score: 3)
Use the key parameter of the sorted function:
list1 = ['A', 'D', 'GE', 'G', 'M', 'S']
list2 = ['A', 'D', 'M', 'G', 'S', 'GE']
list3 = ['D', 'M', 'A', 'S', 'G', 'GE']
sorted(list1, key=lambda elem: sum([list1.index(elem), list2.index(elem), list3.index(elem)]) / 3)
Or, for a list of lists:
lists = [['A', 'D', 'GE', 'G', 'M', 'S'],
         ['A', 'D', 'M', 'G', 'S', 'GE'],
         ['D', 'M', 'A', 'S', 'G', 'GE']]
sorted(lists[0], key=lambda elem: sum(sublist.index(elem) for sublist in lists) / len(lists))
Output in both cases:
['A', 'D', 'M', 'G', 'GE', 'S']
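Because sorted is stable, elements with equal averages (A and D, GE and S) keep their order from the first list, which matches the tie rule in the question. As a minimal sketch, assuming every list contains the same elements, the ranks can also be looked up from precomputed dictionaries so that .index() is not called repeatedly. The names rank_maps and average_rank are illustrative and not part of the answer above.

```python
lists = [['A', 'D', 'GE', 'G', 'M', 'S'],
         ['A', 'D', 'M', 'G', 'S', 'GE'],
         ['D', 'M', 'A', 'S', 'G', 'GE']]

# One {element: 1-based rank} dictionary per list (hypothetical helper data).
rank_maps = [{elem: rank for rank, elem in enumerate(sub, start=1)}
             for sub in lists]


def average_rank(elem):
    """Mean of the element's 1-based ranks over all lists."""
    return sum(ranks[elem] for ranks in rank_maps) / len(rank_maps)


# sorted() is stable, so tied elements keep the order they have in lists[0].
ordered = sorted(lists[0], key=average_rank)
print(ordered)
```

Using 1-based ranks instead of 0-based indexes shifts every average by one, so the resulting order is the same while the key values match the table in the question.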
Answer 1 (score: 1)
You can try it like this.
>>> import numpy as np
>>> import pandas as pd
>>>
>>> elements = ["A", "D", "GE", "G", "M", "S"]
>>> rank_list1 = [1, 2, 3, 4, 5, 6]
>>> rank_list2 = [1, 2, 6, 4, 3, 5]
>>> rank_list3 = [3, 1, 6, 5, 2, 4]
>>>
>>> df = pd.DataFrame({
...     "Elements": elements,
...     "Rank-List1": rank_list1,
...     "Rank-List2": rank_list2,
...     "Rank-List3": rank_list3,
... })
>>>
>>> df
   Elements  Rank-List1  Rank-List2  Rank-List3
0         A           1           1           3
1         D           2           2           1
2        GE           3           6           6
3         G           4           4           5
4         M           5           3           2
5         S           6           5           4
>>>
>>> df["Average"] = df.apply(lambda s: s[1:].mean(), axis=1)
>>> df
   Elements  Rank-List1  Rank-List2  Rank-List3   Average
0         A           1           1           3  1.666667
1         D           2           2           1  1.666667
2        GE           3           6           6  5.000000
3         G           4           4           5  4.333333
4         M           5           3           2  3.333333
5         S           6           5           4  5.000000
>>>
>>> df["Average"] = df.apply(lambda s: s[1:].mean().round(2), axis=1)
>>> df
   Elements  Rank-List1  Rank-List2  Rank-List3  Average
0         A           1           1           3     1.67
1         D           2           2           1     1.67
2        GE           3           6           6     5.00
3         G           4           4           5     4.33
4         M           5           3           2     3.33
5         S           6           5           4     5.00
>>>
>>> out = df.sort_values(by="Average")
>>> out
   Elements  Rank-List1  Rank-List2  Rank-List3  Average
0         A           1           1           3     1.67
1         D           2           2           1     1.67
4         M           5           3           2     3.33
3         G           4           4           5     4.33
2        GE           3           6           6     5.00
5         S           6           5           4     5.00
>>>
>>> out.Elements
0     A
1     D
4     M
3     G
2    GE
5     S
Name: Elements, dtype: object
>>>
>>> out.Elements.tolist()
['A', 'D', 'M', 'G', 'GE', 'S']
>>>
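The rank columns in the session above are typed in by hand. As a rough sketch, assuming each list contains every element exactly once, they could instead be derived from the original lists with plain Python before the DataFrame is built. The helper name ranks_for is hypothetical.

```python
elements = ["A", "D", "GE", "G", "M", "S"]
list1 = ["A", "D", "GE", "G", "M", "S"]
list2 = ["A", "D", "M", "G", "S", "GE"]
list3 = ["D", "M", "A", "S", "G", "GE"]


def ranks_for(ordering, elements):
    """Return the 1-based rank of each element within one ordering."""
    position = {elem: rank for rank, elem in enumerate(ordering, start=1)}
    return [position[elem] for elem in elements]


rank_list1 = ranks_for(list1, elements)
rank_list2 = ranks_for(list2, elements)
rank_list3 = ranks_for(list3, elements)
print(rank_list1, rank_list2, rank_list3)
```

One caveat on ties: DataFrame.sort_values defaults to kind="quicksort", which does not guarantee the relative order of equal values. Passing kind="mergesort" requests a stable sort, so rows with the same average keep their original order.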
Answer 2 (score: 1)
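An optimized version of Tomothy's solution. The expression
sorted(list1, key=lambda elem: sum([list1.index(elem), list2.index(elem), list3.index(elem)]) / 3)
calls .index() three times for each element of list1. Each call iterates over the respective list until it finds the element, so in total you get roughly three times sum([1,2,3,4,5,6]) steps, i.e. 63 (instead of 18, see below).
The complexity of my solution is dominated by O(n), where n = sum(len(item) for item in data) => 18. The cost of sorting is negligible, because it only works on the set() of items across all lists, which is much smaller. Timsort needs (worst case) O(m*log(m)), where m = len(set(i for sub in data for i in sub)) => 6.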
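The 63 versus 18 figures can be checked with a rough count. This sketch assumes that list.index(x) inspects position + 1 items before it finds x; the variable names are illustrative only.

```python
data = [['A', 'D', 'GE', 'G', 'M', 'S'],
        ['A', 'D', 'M', 'G', 'S', 'GE'],
        ['D', 'M', 'A', 'S', 'G', 'GE']]

# Items inspected by the .index()-based key: for every element of the first
# list, each sublist is scanned from its start up to the element's position.
index_steps = sum(sub.index(elem) + 1 for elem in data[0] for sub in data)

# Items touched by a single pass over all lists.
single_pass_steps = sum(len(sub) for sub in data)

# Number of distinct items that actually get sorted.
distinct_items = len({item for sub in data for item in sub})

print(index_steps, single_pass_steps, distinct_items)  # 63 18 6
```

With six elements the difference is small, but the .index() approach grows quadratically with the list length while the single pass grows linearly.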
Code:
from collections import defaultdict
data = [['A', 'D', 'GE', 'G', 'M', 'S'],
        ['A', 'D', 'M', 'G', 'S', 'GE'],
        ['D', 'M', 'A', 'S', 'G', 'GE']]
d = defaultdict(list)  # or defaultdict(int): sum the indexes and divide by 3.0 later
# this loop touches each element once: O(n), n = total length of all lists
for sublist in data:
    for idx, value in enumerate(sublist):
        d[value].append(idx)
# Timsort: O(m) to O(m*log(m)) for the much shorter set() of elements of all lists
# sort by score (the mean index):
result = sorted(d.items(), key=lambda x: sum(x[1]) / float(len(x[1])))
print(*result, sep="\n")  # use '(r[0] for r in result)' to print just the names
Output:
('A', [0, 0, 2])
('D', [1, 1, 0])
('M', [4, 2, 1])
('G', [3, 3, 4])
('GE', [2, 5, 5])
('S', [5, 4, 3])
If you can guarantee that every sublist contains the same elements, just in a different order, you can simplify further:
d = defaultdict(int)
# this loop touches each element once: O(n)
for sublist in data:
    for idx, value in enumerate(sublist):
        d[value] += idx
# dividing every sum by 3 would not change the order, so the sums are compared directly
# sort by score (the summed index, not the name):
result = sorted(d.items(), key=lambda x: x[1])
print(*result, sep="\n")
Output:
('A', 2)
('D', 2)
('M', 7)
('G', 10)
('GE', 12)
('S', 12)
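defaultdict is faster than a plain dict. If you prefer to avoid the import, you can swap it for the slower plain-dict equivalents:
d = {}
d.setdefault(key, []).append(value)  # replaces defaultdict(list)
d[key] = d.get(key, 0) + value  # replaces defaultdict(int); "d.setdefault(key, 0) += value" is a syntax error
setdefault(key, default) is slower because it always constructs the default object, which takes time, even when the key already exists. defaultdict(...) is optimized to avoid that and is therefore (slightly) faster.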
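To illustrate the trade-off, here is a small sketch that builds the same index totals three ways: with defaultdict(int), with dict.get, and with setdefault on lists. It reuses the data list from this answer and only shows that the variants agree; it makes no timing claim, and the variable names are illustrative.

```python
from collections import defaultdict

data = [['A', 'D', 'GE', 'G', 'M', 'S'],
        ['A', 'D', 'M', 'G', 'S', 'GE'],
        ['D', 'M', 'A', 'S', 'G', 'GE']]

# Variant 1: defaultdict(int) creates missing entries on demand.
totals_dd = defaultdict(int)
for sub in data:
    for idx, value in enumerate(sub):
        totals_dd[value] += idx

# Variant 2: plain dict with get(); no import needed.
totals_get = {}
for sub in data:
    for idx, value in enumerate(sub):
        totals_get[value] = totals_get.get(value, 0) + idx

# Variant 3: setdefault() with lists; a new [] is built on every call,
# even when the key already exists.
positions = {}
for sub in data:
    for idx, value in enumerate(sub):
        positions.setdefault(value, []).append(idx)
totals_sd = {key: sum(idxs) for key, idxs in positions.items()}

print(dict(totals_dd) == totals_get == totals_sd)  # True
```

To compare speeds on your own data, each loop could be wrapped in a function and measured with the standard timeit module.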