How to map a pandas groupby DataFrame with summed values to another DataFrame using a non-unique column

Time: 2019-02-11 14:52:10

Tags: python pandas

I have two pandas DataFrames, df1 and df2. I need to fill in df1['seq'] by grouping df2 on c_id and summing the column df2['sum_column']. Sample data and my current solution are below.

df1

id    code amount  seq
234     3    9.8    ?
213     3    18
241     3    6.4
543     3    2
524     2    1.8
142     2    14
987     2    11
658     3    17

df2

c_id  name role    sum_column
1     Aus  leader    6
1     Aus  client    1
1     Aus  chair     7
2     Ned  chair     8
2     Ned  leader    3
3     Mar  client    5
3     Mar  chair     2
3     Mar  leader    4

grouped = df2.groupby('c_id')['sum_column'].sum()
df3 = grouped.reset_index()

df3

c_id  sum_column
 1      14
 2      11
 3      11

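The next step, where I'm stuck, is mapping df3 onto df1 and checking whether df1['amount'] is greater than df3['sum_column'].

df1['seq'] = np.where(df1['amount'] > df1['code'].map(df3.set_index('c_id')[sum_column]), 1, 0)

When I print df1['code'].map(df3.set_index('c_id')['sum_column']), I get only NaN values.

Does anyone know what is going on here?

Expected result: df1 with the seq column filled in.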

2 answers:

Answer 0 (score: 3)

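The solution can be simplified: drop the .reset_index() used to build df3 and pass the Series directly to map:

s = df2.groupby('c_id')['sum_column'].sum()
df1['seq'] = np.where(df1['amount'] > df1['code'].map(s), 1, 0)

An alternative that converts the boolean mask of True, False to the integers 1, 0: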

df1['seq'] = (df1['amount'] > df1['code'].map(s)).astype(int)
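To see the Series-based mapping end to end, here is a minimal sketch that rebuilds the sample frames from the question and applies it. The intermediate name `threshold` is mine, and the sketch assumes `code` and `c_id` are both stored as integers, which the question does not state.

```python
import pandas as pd

# Sample data copied from the question (the seq column is what we compute).
df1 = pd.DataFrame({
    "id": [234, 213, 241, 543, 524, 142, 987, 658],
    "code": [3, 3, 3, 3, 2, 2, 2, 3],
    "amount": [9.8, 18, 6.4, 2, 1.8, 14, 11, 17],
})
df2 = pd.DataFrame({
    "c_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "name": ["Aus", "Aus", "Aus", "Ned", "Ned", "Mar", "Mar", "Mar"],
    "role": ["leader", "client", "chair", "chair",
             "leader", "client", "chair", "leader"],
    "sum_column": [6, 1, 7, 8, 3, 5, 2, 4],
})

# Series indexed by c_id, holding the per-group sum of sum_column.
s = df2.groupby("c_id")["sum_column"].sum()

# Look up each row's code in that Series; codes may repeat in df1.
threshold = df1["code"].map(s)

# 1 where amount is strictly greater than the mapped sum, else 0.
df1["seq"] = (df1["amount"] > threshold).astype(int)

print(df1)
```

Because `map` looks values up by the Series index, no `reset_index()` or `set_index()` round trip is needed. If `code` and `c_id` had different dtypes (for example, strings on one side), the lookup would find no matches and return NaN.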

Answer 1 (score: 1)

You forgot to put quotes around sum_column: df3.set_index('c_id')[sum_column] should be df3.set_index('c_id')['sum_column'].
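As a small illustration of the missing-quotes point, the sketch below assumes no variable named `sum_column` is defined anywhere in scope. In that case Python raises a `NameError` on the bare name before pandas is ever asked for the column. The names `indexed`, `raised`, and `lookup` are hypothetical and used only for this example.

```python
import pandas as pd

# df3 as shown in the question, after groupby + reset_index.
df3 = pd.DataFrame({"c_id": [1, 2, 3], "sum_column": [14, 11, 11]})
indexed = df3.set_index("c_id")

# Bare name: Python tries to resolve a variable called sum_column.
try:
    indexed[sum_column]
    raised = False
except NameError:
    raised = True

# Quoted name: selects the column as a Series indexed by c_id.
lookup = indexed["sum_column"]

print(raised)
print(lookup)
```

With the quotes in place, `lookup` is the same kind of Series that `map` expects.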