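Majority and its average in a list of tuples

Time: 2017-07-14 11:18:21

Tags: python

Suppose the following list of tuples represents sentiment estimates from 3 different methods: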

[('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

I would like to know the most efficient way to find the majority sentiment and compute its average, i.e.:

result=('pos', 0.3)

Thanks

5 answers:

Answer 0 (score: 2)

import itertools

l = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

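You can first group the tuples by sentiment (note that they need to be sorted first, because groupby only merges adjacent items):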

sentiments = [list(j[1]) for j in itertools.groupby(sorted(l), lambda i: i[0])]
# sentiments = [[('neu', 0.1)], [('pos', 0.2), ('pos', 0.4)]]

Then find which sentiment is most common (that is, which has the longest group):

majority = max(sentiments, key=len)
# majority = [('pos', 0.2), ('pos', 0.4)]

Finally, compute the average:

values = [i[1] for i in majority]
average = (majority[0][0], sum(values)/len(values))
# average = ('pos', 0.30000000000000004)
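
The three steps above can be folded into one function. This is only a sketch: the name majority_average is an assumption, not part of the answer, and it sorts on the label alone so that the scores are never compared with each other.

```python
from itertools import groupby
from operator import itemgetter


def majority_average(pairs):
    """Return (label, mean score) for the most frequent label in pairs.

    Hypothetical helper name, used here for illustration only.
    """
    by_label = itemgetter(0)
    # groupby only merges adjacent items, so sort by label first.
    ordered = sorted(pairs, key=by_label)
    groups = [list(group) for _, group in groupby(ordered, key=by_label)]
    # The longest group belongs to the most frequent label.
    majority = max(groups, key=len)
    scores = [score for _, score in majority]
    return majority[0][0], sum(scores) / len(scores)


print(majority_average([('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]))
```

The sort makes this O(n log n), so it is convenient rather than optimal for large inputs.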

Answer 1 (score: 1)

You can do this with the collections and statistics modules:

from collections import Counter
from statistics import mean

lst = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]
count = Counter(item[0] for item in lst)  # Counter({'pos': 2, 'neu': 1})
maj = count.most_common(1)[0][0]          # pos
mn = mean(item[1] for item in lst if item[0] == maj)
result = (maj, mn)

print(result)  # ('pos', 0.30000000000000004)

Although, given that you are looking for efficiency, I prefer CoryKramer's answer.
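
One detail the Counter-based snippet in this answer leaves implicit is what happens when two sentiments are tied. The sketch below assumes Python 3.7 or later, where Counter.most_common lists elements with equal counts in the order they were first encountered. The tied list is made-up illustration data, not from the question.

```python
from collections import Counter

# Hypothetical input in which 'neu' and 'pos' both appear twice.
tied = [('neu', 0.1), ('pos', 0.2), ('pos', 0.4), ('neu', 0.3)]

count = Counter(label for label, _ in tied)
# Assumes Python 3.7+: with equal counts, the label seen first is listed first.
winner = count.most_common(1)[0][0]
print(winner)
```

If a tie should be resolved differently (for example by the higher average), that rule has to be coded explicitly.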

Answer 2 (score: 0)

import collections

reports = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

oracle = collections.defaultdict(list)
for mood, score in reports:
    oracle[mood].append(score)

counts = {mood: len(scores) for mood, scores in oracle.items()}

mood = max(counts, key=counts.get)  # gives 'pos'; a bare max(counts) would compare the labels alphabetically

sum(oracle[mood]) / len(oracle[mood])  # gives 0.30000000000000004
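
If the individual scores are not needed afterwards, the per-mood lists can be replaced by running totals, so everything is gathered in a single pass. A sketch under that assumption; the name totals is illustrative and not part of the answer.

```python
from collections import defaultdict

reports = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

# Maps each mood to [count, running sum of scores].
totals = defaultdict(lambda: [0, 0.0])
for mood, score in reports:
    totals[mood][0] += 1
    totals[mood][1] += score

# Pick the mood with the highest count, then divide its sum by its count.
mood = max(totals, key=lambda m: totals[m][0])
n, total = totals[mood]
result = (mood, total / n)
print(result)
```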

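Answer 3 (score: 0)

It is best to use a dictionary. Define a nested dictionary whose keys are the sentiment names and whose values are dictionaries with two entries: 'numbers', the list of scores for that sentiment, and 'count', the number of times the sentiment appears. For example: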

sentiment = {'pos': {'numbers': [0.2, 0.4], 'count': 2}, 'neu': {'numbers': [0.1], 'count': 1}}
sentiment['pos']['numbers']  # [0.2, 0.4]
sentiment['pos']['count']    # 2
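
The answer describes the structure but not how to fill it or read the result from it. One possible way, as a sketch: the loop and the variable names reports, entry and majority are assumptions added here for illustration.

```python
reports = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

sentiment = {}
for label, score in reports:
    # Create the inner dictionary the first time a label is seen.
    entry = sentiment.setdefault(label, {'numbers': [], 'count': 0})
    entry['numbers'].append(score)
    entry['count'] += 1

# The majority sentiment is the key with the largest 'count'.
majority = max(sentiment, key=lambda label: sentiment[label]['count'])
numbers = sentiment[majority]['numbers']
result = (majority, sum(numbers) / len(numbers))

print(sentiment)
print(result)
```

Keeping 'count' next to 'numbers' is redundant with len(numbers); it is kept here only to match the layout the answer proposes.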

Answer 4 (score: 0)

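from collections import Counter

my_tuple_list = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

# The majority sentiment is the most frequent label. Sorting by score and
# taking the first item would pick the highest-scoring label instead.
majority_sentiment = Counter(label for label, _ in my_tuple_list).most_common(1)[0][0]
majority_sentiment_score = 0
num_items = 0

# Sum the scores that belong to the majority sentiment.
for sentiment_tup in my_tuple_list:
    if sentiment_tup[0] == majority_sentiment:
        majority_sentiment_score += sentiment_tup[1]
        num_items += 1

avg_sentiment_score = majority_sentiment_score / num_items

result = (majority_sentiment, avg_sentiment_score)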

That should do it.