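I have the following data. I want to find the unique values in 'a' and sum the data at the corresponding indices of 'b' and 'c'. Any ideas on the best way to do this? I don't know where to start.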
a = ['x', 'y', 'z', 'z', 'x', 'w']
b = [ 1, 4, 5, 7, 9, 5]
c = [ 3, 6, 7, 8, 9, 7]
After processing:
a = ['x', 'y', 'z', 'w']
b = [ 10, 4, 12, 5 ]
c = [ 12, 6, 15, 7 ]
Answer 0 (score: 4)
You can use an OrderedDict for this, since you need to keep the original order:
from collections import OrderedDict
a = ['x', 'y', 'z', 'z', 'x', 'w']
b = [ 1, 4, 5, 7, 9, 5]
c = [ 3, 6, 7, 8, 9, 7]
b_data = OrderedDict()
c_data = OrderedDict()
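for letter, b_value, c_value in zip(a, b, c):
    if letter in b_data:
        # Key already seen: add to the running totals.
        b_data[letter] += b_value
        c_data[letter] += c_value
    else:
        # First occurrence: start the totals.
        b_data[letter] = b_value
        c_data[letter] = c_value
# Wrap in list() so that Python 3 returns lists rather than dict views.
a = list(b_data.keys())
b = list(b_data.values())
c = list(c_data.values())
print(a)
print(b)
print(c)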
Output:
['x', 'y', 'z', 'w']
[10, 4, 12, 5]
[12, 6, 15, 7]
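As a side note, on Python 3.7 and later a plain dict also preserves insertion order, so the same idea can be sketched without OrderedDict. This is a minimal variant under that version assumption, not a replacement for the answer above; the names b_sums, c_sums, unique_a, summed_b and summed_c are made up for illustration.

```python
a = ['x', 'y', 'z', 'z', 'x', 'w']
b = [1, 4, 5, 7, 9, 5]
c = [3, 6, 7, 8, 9, 7]

# Assumption: Python 3.7+, where a plain dict keeps insertion order.
b_sums = {}
c_sums = {}
for key, b_value, c_value in zip(a, b, c):
    # dict.get supplies 0 the first time a key is seen.
    b_sums[key] = b_sums.get(key, 0) + b_value
    c_sums[key] = c_sums.get(key, 0) + c_value

unique_a = list(b_sums)
summed_b = list(b_sums.values())
summed_c = list(c_sums.values())
print(unique_a)  # ['x', 'y', 'z', 'w']
print(summed_b)  # [10, 4, 12, 5]
print(summed_c)  # [12, 6, 15, 7]
```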
Answer 1 (score: 2)
Use collections.defaultdict:
from collections import defaultdict
a = ['x', 'y', 'z', 'z', 'x', 'w']
b = [ 1, 4, 5, 7, 9, 5]
c = [ 3, 6, 7, 8, 9, 7]
# defaultdict is imported by name above, so call it directly.
b_unique = defaultdict(int)
c_unique = defaultdict(int)
for k, bv, cv in zip(a, b, c):
    b_unique[k] += bv
    c_unique[k] += cv
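The snippet above stops at the two dictionaries. One possible way to get back the three lists from the question is sketched here; it assumes Python 3.7+, where defaultdict (a dict subclass) keeps keys in first-insertion order. The names keys, b_totals and c_totals are illustrative only.

```python
from collections import defaultdict

a = ['x', 'y', 'z', 'z', 'x', 'w']
b = [1, 4, 5, 7, 9, 5]
c = [3, 6, 7, 8, 9, 7]

b_unique = defaultdict(int)
c_unique = defaultdict(int)
for k, bv, cv in zip(a, b, c):
    b_unique[k] += bv
    c_unique[k] += cv

# Assumption: Python 3.7+, so iteration follows first-insertion order.
keys = list(b_unique)
b_totals = [b_unique[k] for k in keys]
c_totals = [c_unique[k] for k in keys]
print(keys)      # ['x', 'y', 'z', 'w']
print(b_totals)  # [10, 4, 12, 5]
print(c_totals)  # [12, 6, 15, 7]
```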
Answer 2 (score: 2)
Using pandas:
import pandas as pd
# To keep original order of values.
a_ordered = [val for idx, val in enumerate(a) if val not in a[:idx]]
# >>> a_ordered
# OUT: ['x', 'y', 'z', 'w']
df = pd.DataFrame({'a': a, 'b': b, 'c': c}).groupby('a').sum().T[a_ordered]
a = df.columns.tolist()
b, c = df.values.tolist()
>>> a
['x', 'y', 'z', 'w']
>>> b
[10, 4, 12, 5]
>>> c
[12, 6, 15, 7]
Answer 3 (score: 0)
Felt like playing a little code golf:
>>> from collections import OrderedDict
>>> od = OrderedDict()
>>> for t in zip(a,zip(b,c)):
...     od[t[0]] = [i + x for i,x in zip(od.get(t[0],[0,0]), t[1])]
...
>>> od
OrderedDict([('x', [10, 12]), ('y', [4, 6]), ('z', [12, 15]), ('w', [5, 7])])
>>> a = list(od.keys())
>>> b,c = map(list,zip(*od.values()))
>>> a
['x', 'y', 'z', 'w']
>>> b
[10, 4, 12, 5]
>>> c
[12, 6, 15, 7]
>>>
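The single-dictionary idea above extends naturally to any number of value lists. The following is a rough sketch of that generalization, not part of the original answer; the helper name sum_by_key is hypothetical, and it assumes Python 3.7+ so that a plain dict keeps first-appearance order.

```python
def sum_by_key(keys, *columns):
    """Sum each column per unique key, keeping first-appearance order.

    Hypothetical helper; assumes Python 3.7+ dict ordering.
    """
    totals = {}
    for key, *values in zip(keys, *columns):
        if key in totals:
            # Add this row's values to the running totals element-wise.
            totals[key] = [t + v for t, v in zip(totals[key], values)]
        else:
            totals[key] = list(values)
    unique_keys = list(totals)
    # Transpose the per-key totals back into one list per column.
    summed_columns = [list(col) for col in zip(*totals.values())]
    return unique_keys, summed_columns


a = ['x', 'y', 'z', 'z', 'x', 'w']
b = [1, 4, 5, 7, 9, 5]
c = [3, 6, 7, 8, 9, 7]
unique_a, (summed_b, summed_c) = sum_by_key(a, b, c)
print(unique_a)  # ['x', 'y', 'z', 'w']
print(summed_b)  # [10, 4, 12, 5]
print(summed_c)  # [12, 6, 15, 7]
```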
Answer 4 (score: 0)
Here is a basic approach:
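from collections import defaultdict
d1 = defaultdict(int)
d2 = defaultdict(int)
# Accumulate the totals of b and c for each key in a.
for x, y, z in zip(a, b, c):
    d1[x] += y
    d2[x] += z
an = []
bn = []
cn = []
# Order the unique keys by their first position in a.
for k in sorted(d1.keys(), key=lambda x: a.index(x)):
    an.append(k)
    bn.append(d1[k])
    cn.append(d2[k])
print([an, bn, cn])
[['x', 'y', 'z', 'w'], [10, 4, 12, 5], [12, 6, 15, 7]]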