How do I sum the values of list elements when the elements' values come from a pandas DataFrame?

Time: 2018-07-23 13:02:07

Tags: python pandas

I have two pandas DataFrames called df1 and df2. Each row of df2 holds a list of names, and I want to sum the values that those names map to in df1.

For example, df1:

df1 = pd.DataFrame([['a',11],['b',13],['c',45],['d',88]],columns=['name1','data1'])
df1

    name1   data1
0      a       11
1      b       13
2      c       45
3      d       88

and df2:

df2 = pd.DataFrame([['a',['b','c','d']],['b',['a','c']]],columns=['name2','data2'])
df2

    name2         data2
0      a      [b, c, d]
1      b         [a, c]

Finally, I want this:

    name2   data2
0      a      146
1      b       56

How can I do this? Thanks a lot.

4 answers:

Answer 0 (score: 3)

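First create a dictionary from df1, then use a list comprehension that maps each name through the dict with get and passes the results to sum. Names with no match contribute 0: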

d = df1.set_index('name1')['data1'].to_dict()
df2['data2'] = [sum(d.get(y, 0) for y in x) for x in df2['data2']]
print (df2)

  name2  data2
0     a    146
1     b     56
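
For reference, the dictionary approach above can be assembled into a self-contained script. This is only a minimal sketch built on the sample frames from the question; the helper name `sum_mapped`, the variable `lookup`, and the `total` column are hypothetical names chosen for illustration and are not part of pandas.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame([['a', 11], ['b', 13], ['c', 45], ['d', 88]],
                   columns=['name1', 'data1'])
df2 = pd.DataFrame([['a', ['b', 'c', 'd']], ['b', ['a', 'c']]],
                   columns=['name2', 'data2'])


def sum_mapped(names, lookup):
    """Sum lookup[name] for each name, counting unknown names as 0."""
    return sum(lookup.get(name, 0) for name in names)


# name1 -> data1 mapping as a plain dict.
lookup = df1.set_index('name1')['data1'].to_dict()

# Keep the original lists and write the sums to a new column.
result = df2.assign(total=[sum_mapped(names, lookup) for names in df2['data2']])
print(result)
```

Writing to a new column instead of overwriting data2 keeps the original lists available for checking the result.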

If the values may contain NaN and you want to skip them, use filter with a condition:

import numpy as np
import pandas as pd

df1 = pd.DataFrame([['a',11],['b',13],['c',45],['d',np.nan]],columns=['name1','data1'])
print (df1)
  name1  data1
0     a   11.0
1     b   13.0
2     c   45.0
3     d    NaN

df2 = pd.DataFrame([['a',['b','c','d']],['b',['a','c']]],columns=['name2','data2'])

d = df1.set_index('name1')['data1'].to_dict()
df2['data2'] = [sum(filter(lambda v: v==v, (d.get(y, 0) for y in x))) for x in df2['data2']]
print (df2)

  name2  data2
0     a   58.0
1     b   56.0
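
The `v == v` condition in the filter works because NaN is the only float value that compares unequal to itself. As a small sketch of the same idea, the check can also be written with `math.isnan`, which some readers may find more explicit. The helper name `nan_safe_sum` is an assumption made for this example.

```python
import math


def nan_safe_sum(values):
    """Sum numeric values, skipping NaN entries."""
    return sum(v for v in values if not math.isnan(v))


nan = float('nan')

# NaN never equals itself, which is what the lambda v: v == v filter relies on.
print(nan == nan)

# Values for names b, c, d when d maps to NaN.
print(nan_safe_sum([13.0, 45.0, nan]))
```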

Answer 1 (score: 2)

This also works:

d = dict(df1.values)
df2['s'] = df2.data2.transform(lambda v: pd.Series(v).map(d)).sum(1) 

0    146.0
1     56.0
dtype: float64

df2.data2.transform(lambda l: sum(d[i] for i in l))

0    146.0
1     56.0
dtype: float64
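
The two variants above behave differently when a list contains a name that is missing from the dict. The sketch below illustrates this with a hand-written dict and a made-up name `'x'`; both are assumptions for the example and not data from the question.

```python
import pandas as pd

d = {'a': 11, 'b': 13, 'c': 45, 'd': 88}
names = ['a', 'c', 'x']  # 'x' is a made-up name with no entry in d

# Series.map yields NaN for unmapped names, and Series.sum skips NaN by default.
mapped_total = pd.Series(names).map(d).sum()
print(mapped_total)

# Plain indexing with d[i] raises KeyError for the unknown name.
try:
    direct_total = sum(d[i] for i in names)
except KeyError as exc:
    direct_total = None
    print('missing key:', exc)
```

If every name is guaranteed to exist in df1, the direct indexing form is fine; otherwise the map-based form or `d.get(i, 0)` avoids the exception.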

Answer 2 (score: 1)

You can use pivot on df1 to turn the names into columns, then index the result with each list from df2:

pivoted = df1.pivot(columns="name1").data1.sum()
df2.data2 = df2.data2.apply(lambda x: pivoted[x].sum())

  name2  data2
0     a  146.0
1     b   56.0
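
The pivot step produces a Series of values indexed by name. As an alternative sketch (a swapped-in technique, not the answer's own code), the same kind of lookup Series can be built with `set_index` and indexed by label with `.loc`. The variable names `lookup` and `totals` are hypothetical.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame([['a', 11], ['b', 13], ['c', 45], ['d', 88]],
                   columns=['name1', 'data1'])
df2 = pd.DataFrame([['a', ['b', 'c', 'd']], ['b', ['a', 'c']]],
                   columns=['name2', 'data2'])

# Series of data1 values indexed by name1.
lookup = df1.set_index('name1')['data1']

# Select the rows named in each list by label, then sum them.
totals = df2['data2'].apply(lambda names: lookup.loc[names].sum())
print(totals.tolist())
```

Note that label-based selection raises KeyError when a list contains a name that is absent from df1, so this form assumes every name is present.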

Answer 3 (score: 1)

You can use dict.__getitem__ together with collections.defaultdict. For larger DataFrames, this can be more efficient than a generator expression:

from collections import defaultdict

d = defaultdict(int, df1.set_index('name1')['data1'].to_dict())

df2['sum'] = [sum(map(d.__getitem__, x)) for x in df2['data2']]

print(df2)

  name2      data2  sum
0     a  [b, c, d]  146
1     b  [a, c, e]   56
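
One detail worth knowing about this approach: looking up a missing key through a defaultdict returns the default and also stores that key in the dict. The sketch below shows this with a hand-written mapping and the unmatched name `'e'` from the output above; the literal dict is an assumption standing in for the one built from df1.

```python
from collections import defaultdict

# Stand-in for the mapping built from df1; missing names default to int() == 0.
d = defaultdict(int, {'a': 11, 'b': 13, 'c': 45, 'd': 88})

print('e' in d)  # 'e' is not a key before the lookup

# 'e' has no entry, so it contributes 0 to the sum.
total = sum(map(d.__getitem__, ['a', 'c', 'e']))
print(total)

# The lookup inserted 'e' with the default value as a side effect.
print('e' in d, d['e'])
```

If the growing dict is a concern, for example with many distinct unmatched names, `d.get` on a plain dict gives the same sums without inserting keys.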