I have a dictionary like the following:
dict = {idx1: {tokenA: 0.1,
               tokenB: 1.3,
               tokenD: 2.3},
        idx2: {tokenC: 0.9,
               tokenE: 3.4},
        ...
        idxn: {tokenA: 0.3,
               tokenF: 0.4,
               ...
               tokenZ: 7.4}
       }
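Each index may hold a different set of tokens and values. I want the average value of each token across all indices, i.e. simply:
{tokenA: average_value, tokenB: average_value, ... tokenZ: average_value}
Is there an efficient way to do this? Thanks in advance!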
Answer 0 (score: 1)
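from collections import defaultdict

# Collect every value seen for each token across all indices.
my_lists = defaultdict(list)
for key, val in my_dict.items():
    for key2, val2 in val.items():
        my_lists[key2].append(val2)

def average(key_val):
    # key_val is a (token, list_of_values) pair.
    key, val = key_val
    return (key, sum(val) * 1.0 / len(val))

# Iterate over items(), not the dict itself, so average() receives pairs.
print(dict(map(average, my_lists.items())))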
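The same group-then-average idea can be written a bit more compactly on Python 3. This is only a sketch: the function name `average_per_token` and the `sample` data (copied from the concrete tokens in the question, with the elided entries left out) are my own choices, and `statistics.mean` stands in for the hand-written `average` helper.

```python
from collections import defaultdict
from statistics import mean


def average_per_token(nested):
    """Return {token: mean of its values} for a dict of dicts."""
    grouped = defaultdict(list)
    # Group the values by token, ignoring which index they came from.
    for inner in nested.values():
        for token, value in inner.items():
            grouped[token].append(value)
    # Average each group; every list has at least one element by construction.
    return {token: mean(values) for token, values in grouped.items()}


# Hypothetical sample data, modelled on the question.
sample = {
    "idx1": {"tokenA": 0.1, "tokenB": 1.3, "tokenD": 2.3},
    "idx2": {"tokenC": 0.9, "tokenE": 3.4},
    "idxn": {"tokenA": 0.3, "tokenF": 0.4, "tokenZ": 7.4},
}
averages = average_per_token(sample)
```

Here `tokenA` appears under two indices, so its average is taken over two values; every other token appears once and keeps its single value.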
Answer 1 (score: 1)
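Using pandas:
import pandas

d = {'a': {'t1': 0.1,
           't2': 0.2},
     'b': {'t1': 0.1,
           't3': 0.2}}
# Outer keys become columns and tokens become rows; missing tokens are NaN.
data = pandas.DataFrame(d)
# Transpose so each token is a column; mean() skips NaN by default.
data.T.mean()
=>
t1    0.1
t2    0.2
t3    0.2
dtype: float64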
Answer 2 (score: 1)
from collections import Counter

d = {'idx1': {'tokenA': 0.1,
              'tokenB': 1.3,
              'tokenD': 2.3},
     'idx2': {'tokenC': 0.9,
              'tokenE': 3.4},
     'idxn': {'tokenA': 0.3,
              'tokenF': 0.4,
              'tokenZ': 7.4}
     }

# Counter addition keeps only positive totals, so this assumes all values are > 0.
token_sums = sum((Counter(v) for v in d.values()), Counter())
token_counts = sum((Counter(v.keys()) for v in d.values()), Counter())
token_mean = {k: token_sums[k] / token_counts[k] for k in token_sums}
print(token_mean)
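One caveat with summing `Counter` objects: `Counter.__add__` discards any key whose total is zero or negative, so a token with negative values would silently vanish from the result. If that matters for your data, a possible variant is to accumulate with `Counter.update`, which adds counts without dropping anything. The function name `token_means` and the small `data` dict below are made up for illustration.

```python
from collections import Counter


def token_means(d):
    """Per-token mean for a dict of dicts, keeping non-positive totals."""
    sums, counts = Counter(), Counter()
    for inner in d.values():
        sums.update(inner)            # adds each value to the token's running sum
        counts.update(inner.keys())   # counts one occurrence per token
    return {k: sums[k] / counts[k] for k in counts}


# Hypothetical data with negative values, which Counter addition would drop.
data = {"idx1": {"tokenA": -0.5, "tokenB": 1.0},
        "idx2": {"tokenA": -1.5}}
means = token_means(data)

# For comparison: the "+"-based sum loses tokenA entirely.
summed = sum((Counter(v) for v in data.values()), Counter())
```

With this input, `summed` contains only `tokenB`, while `means` still has an entry for `tokenA`.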
Answer 3 (score: 0)
import collections

d = {'idx1': {'tokenA': 0.1,
              'tokenB': 1.3,
              'tokenD': 2.3},
     'idx2': {'tokenC': 0.9,
              'tokenE': 3.4},
     'idxn': {'tokenA': 0.3,
              'tokenF': 0.4,
              'tokenZ': 7.4}
     }

avg = collections.defaultdict(float)
count = collections.Counter()
# Accumulate the running sum and the number of occurrences of each token.
for dat in d.values():
    for k, v in dat.items():
        avg[k] += v
        count[k] += 1
# Divide each sum by its count to turn it into a mean.
for k, n in count.items():
    avg[k] /= n
print(avg)