Retrieve the rank of elements in several lists to calculate the weighted average of their rank scores in Python

Asked: 2015-04-02 12:55:13

Tags: python list position ranking sorted

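I have two sorted dictionaries, i.e. they are now represented as lists. I would like to retrieve the rank position of each element in each list and store it in a variable, so that I can eventually calculate the weighted average of each element's rank scores across both lists. Here is an example.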

dict1 = {'class1': 15.17, 'class2': 15.95, 'class3': 15.95}

sorted_dict1 = [('class1', 15.17), ('class2', 15.95), ('class3', 15.95)]

sorted_dict2 = [('class2', 9.10), ('class3', 9.22), ('class1', 10.60)]

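So far I can retrieve the rank position of each element in a list and print the ranks, but when I try to calculate the weighted average of the rank scores, i.e. (w1 * a + w2 * b) / (w1 + w2), where "a" is the rank position in sorted_dict1 and "b" is the rank position in sorted_dict2, the numbers I get are not the correct weighted averages.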

I have tried various things; here is one of them:

for idx, val in list(enumerate(sorted_dict1, 1)):
    for idx1, val1 in list(enumerate(sorted_dict2, 1)):
         position_dict1 = idx
         position_dict2 = idx1
    weighted_average = float((0.50*position_dict1 + 0.25*position_dict2))/0.75     
    print weighted_average

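I have also not considered what happens if two classes are ranked the same in a list. I would be grateful for any hints/help on that as well.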

I thought I might need to create a function to solve this, but I did not get anywhere with that either.

Any help, along with comments explaining the code, would be great.

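So I would like to calculate the weighted average of the rank positions of the elements in the lists. For example, the weighted averages would be: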

class1: weighted_average = ((0.50 * 1) + (0.25 * 3)) / 0.75 = 1.6666..7

class2: weighted_average = ((0.50 * 2) + (0.25 * 1)) / 0.75 = 1.6666..7

Thanks!

1 answer:

Answer 0 (score: 1)

I've taken the easy route and given classes of equal score the next integer rank, so class3 and class2 both have rank 2 in sorted_dict1.

#!/usr/bin/env python

#Get the ranks for a list of (class, score) tuples sorted by score
#and return them in a dict
def get_ranks(sd):
    #The first class in the list has rank 1
    k, val = sd[0]
    r = 1
    rank = {k: r}

    for k, v in sd[1:]:
        #Only update the rank number if this value is 
        #greater than the previous
        if v > val:
            val = v
            r += 1
        rank[k] = r
    return rank

def weighted_mean(a, b):
    return (0.50*a + 0.25*b) / 0.75

sorted_dict1 = [('class1', 15.17), ('class2', 15.95), ('class3', 15.95)]
sorted_dict2 = [('class2', 9.10), ('class3', 9.22), ('class1', 10.60)]

print sorted_dict1
print sorted_dict2

ranks1 = get_ranks(sorted_dict1)
ranks2 = get_ranks(sorted_dict2)

print ranks1
print ranks2

keys = sorted(k for k,v in sorted_dict1)

print [(k, weighted_mean(ranks1[k], ranks2[k])) for k in keys]

Output

[('class1', 15.17), ('class2', 15.949999999999999), ('class3', 15.949999999999999)]
[('class2', 9.0999999999999996), ('class3', 9.2200000000000006), ('class1', 10.6)]
{'class2': 2, 'class3': 2, 'class1': 1}
{'class2': 1, 'class3': 2, 'class1': 3}
[('class1', 1.6666666666666667), ('class2', 1.6666666666666667), ('class3', 2.0)]
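
As for why the loop in the question gives the wrong numbers: the inner loop runs to completion before the average is computed, so position_dict2 always ends up as the last position in sorted_dict2 (3 here), whichever class the outer loop is on. A minimal sketch of the fix, written for Python 3 unlike the Python 2 listings here, is to look each class up by name instead of nesting the loops. It uses plain list positions, so ties are not handled, and the names pos1, pos2 and averages are only illustrative.

```python
sorted_dict1 = [('class1', 15.17), ('class2', 15.95), ('class3', 15.95)]
sorted_dict2 = [('class2', 9.10), ('class3', 9.22), ('class1', 10.60)]

# Map each class name to its 1-based position in each list
pos1 = {name: i for i, (name, score) in enumerate(sorted_dict1, 1)}
pos2 = {name: i for i, (name, score) in enumerate(sorted_dict2, 1)}

# Combine the two positions that belong to the *same* class
averages = {
    name: (0.50 * pos1[name] + 0.25 * pos2[name]) / 0.75
    for name in pos1
}

for name in sorted(averages):
    print(name, averages[name])
```

Because class2 and class3 are tied in sorted_dict1 but get positions 2 and 3 here, class3 comes out at about 2.67 rather than 2.0; that difference is what get_ranks() takes care of.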

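In the comments I mentioned that there is a neat way to create a weighted_mean() function with custom weights. Of course, we could just pass the weights as additional arguments to weighted_mean(), but that makes each call to weighted_mean() more cluttered than it needs to be, which makes the program harder to read.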

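The trick is to use a function that takes the custom weights as arguments and returns the desired function. Technically, the function that gets returned is called a closure, because it remembers the weights from the call that created it.

Here is a short demo of how to do that.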

#!/usr/bin/env python

#Create a weighted mean function with weights w1 & w2
def make_weighted_mean(w1, w2):
    wt = float(w1 + w2)
    def wm(a, b):
        return (w1 * a + w2 * b) / wt
    return wm

#Make the weighted mean function
weighted_mean = make_weighted_mean(1, 2)

#Test
print weighted_mean(6, 3)
print weighted_mean(3, 9)

Output

4.0
7.0
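
For comparison, a similar result can be sketched with functools.partial from the standard library instead of a hand-written closure. This is an alternative technique, not what the demo above does: the weights are bound as the first two arguments of an ordinary four-argument function. The name wm_1_2 is only illustrative, and the sketch is written for Python 3.

```python
from functools import partial

def weighted_mean(w1, w2, a, b):
    # Weighted mean of a and b with weights w1 and w2
    return (w1 * a + w2 * b) / float(w1 + w2)

# Bind the weights up front; the resulting callable takes only a and b
wm_1_2 = partial(weighted_mean, 1, 2)

print(wm_1_2(6, 3))  # 4.0
print(wm_1_2(3, 9))  # 7.0
```

One difference in design: the closure computes the sum of the weights once, when it is created, whereas this version recomputes it on every call.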

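Here is an updated version of the first program above that handles any number of sorted_dict lists. It uses the original get_ranks() function, but it uses a slightly more complex closure than the example above to compute the weighted mean of a list (or tuple) of data.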

#!/usr/bin/env python

''' Weighted means of ranks

    From https://stackoverflow.com/q/29413531/4014959

    Written by PM 2Ring 2015.04.03
'''

from pprint import pprint

#Create a weighted mean function with weights from list/tuple weights
def make_weighted_mean(weights):
    wt = float(sum(weights))
    #A function that calculates the weighted mean of values in seq 
    #weighted by the weights passed to make_weighted_mean()
    def wm(seq):
        return sum(w * v for w, v in zip(weights, seq)) / wt
    return wm


#Get the ranks for a list of (class, score) tuples sorted by score
#and return them in a dict
def get_ranks(sd):
    #The first class in the list has rank 1
    k, val = sd[0]
    r = 1
    rank = {k: r}

    for k, v in sd[1:]:
        #Only update the rank number if this value is 
        #greater than the previous
        if v > val:
            val = v
            r += 1
        rank[k] = r
    return rank


#Make the weighted mean function
weights = [0.50, 0.25]
weighted_mean = make_weighted_mean(weights)

#Some test data
sorted_dicts = [
    [('class1', 15.17), ('class2', 15.95), ('class3', 15.95), ('class4', 16.0)],
    [('class2', 9.10), ('class3', 9.22), ('class1', 10.60), ('class4', 11.0)]
]
print 'Sorted dicts:'
pprint(sorted_dicts, indent=4)

all_ranks = [get_ranks(sd) for sd in sorted_dicts]
print '\nAll ranks:'
pprint(all_ranks, indent=4)

#Get a sorted list of the keys
keys = sorted(k for k,v in sorted_dicts[0])
#print '\nKeys:', keys

means = [(k, weighted_mean([ranks[k] for ranks in all_ranks])) for k in keys]
print '\nWeighted means:'
pprint(means, indent=4)

Output

Sorted dicts:
[   [   ('class1', 15.17),
        ('class2', 15.949999999999999),
        ('class3', 15.949999999999999),
        ('class4', 16.0)],
    [   ('class2', 9.0999999999999996),
        ('class3', 9.2200000000000006),
        ('class1', 10.6),
        ('class4', 11.0)]]

All ranks:
[   {   'class1': 1, 'class2': 2, 'class3': 2, 'class4': 3},
    {   'class1': 3, 'class2': 1, 'class3': 2, 'class4': 4}]

Weighted means:
[   ('class1', 1.6666666666666667),
    ('class2', 1.6666666666666667),
    ('class3', 2.0),
    ('class4', 3.3333333333333335)]

Here is an alternative version of get_ranks() that skips ranks if two or more classes are ranked the same in a list.

def get_ranks(sd):
    #The first class in the list has rank 1
    k, val = sd[0]
    r = 1
    rank = {k: r}
    #The step size from one rank to the next. Normally 
    #delta is 1, but it's increased if there are ties.
    delta = 1

    for k, v in sd[1:]:
        #Update the rank number if this value is 
        #greater than the previous. 
        if v > val:
            val = v
            r += delta
            delta = 1
        #Otherwise, update delta
        else:
            delta += 1
        rank[k] = r
    return rank

Here is the output of the program using the alternative version of get_ranks():

Sorted dicts:
[   [   ('class1', 15.17),
        ('class2', 15.949999999999999),
        ('class3', 15.949999999999999),
        ('class4', 16.0)],
    [   ('class2', 9.0999999999999996),
        ('class3', 9.2200000000000006),
        ('class1', 10.6),
        ('class4', 11.0)]]

All ranks:
[   {   'class1': 1, 'class2': 2, 'class3': 2, 'class4': 4},
    {   'class1': 3, 'class2': 1, 'class3': 2, 'class4': 4}]

Weighted means:
[   ('class1', 1.6666666666666667),
    ('class2', 1.6666666666666667),
    ('class3', 2.0),
    ('class4', 4.0)]
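
The two tie-handling schemes shown above can also be folded into one function. The following is a sketch only, written for Python 3 and using itertools.groupby to bundle equal scores; the function name get_ranks_grouped and its skip parameter are hypothetical and not part of the programs above. Like get_ranks(), it assumes the list is already sorted by score.

```python
from itertools import groupby

def get_ranks_grouped(sd, skip=False):
    """Rank (class, score) tuples that are already sorted by score.

    skip=False: tied classes share a rank and the next rank follows
                on directly (1, 2, 2, 3).
    skip=True:  ranks are skipped after a tie (1, 2, 2, 4).
    """
    rank = {}
    r = 1
    # groupby() bundles consecutive items that have the same score
    for score, group in groupby(sd, key=lambda item: item[1]):
        names = [name for name, _ in group]
        for name in names:
            rank[name] = r
        # Step by the size of the tie group, or by 1
        r += len(names) if skip else 1
    return rank

sd = [('class1', 15.17), ('class2', 15.95), ('class3', 15.95), ('class4', 16.0)]
print(get_ranks_grouped(sd))
print(get_ranks_grouped(sd, skip=True))
```

With skip=False class4 gets rank 3, and with skip=True it gets rank 4, matching the two program outputs above. Unlike get_ranks(), this sketch returns an empty dict for an empty list instead of raising an IndexError.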