Joining multiple grouped DataFrames by key in Pandas

Time: 2015-02-13 13:29:45

Tags: python pandas

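I ran the operations I need one at a time, and each one gave me the result I wanted. To finish, I want to join the grouped DataFrames on the same key I used for grouping. Right now I get "*** KeyError: 'key'", because the grouped DataFrames have no "key" column that would let me perform the merge. I am taking my first steps with pandas. Is there a way to keep the key column in the groupby so that I can do the merge, or a simpler way to get the result shown in the desired output?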

import pandas as pd

raw_data = {
    'key': ['a', 'a', 'b', 'c', 'c', 'c', 'd', 'a', 'b', 'c'],
    'type_op': ['OP1', 'OP2', 'OP3', 'OP2', 'OP1', 'OP3', 'OP2', None, 'OP1', 'OP3'],
    'type_xp': ['XP2', 'XP2', None, 'XP3', 'XP1', None, 'XP1', 'XP3', None, 'XP3'],
}

df = pd.DataFrame.from_dict(raw_data)

total_op1 = df[df['type_op']=='OP1'][['type_op', 'key']].groupby('key').agg(['count'])
total_op2 = df[df['type_op']=='OP2'][['type_op', 'key']].groupby('key').agg(['count'])
total_op3 = df[df['type_op']=='OP3'][['type_op', 'key']].groupby('key').agg(['count'])
total_opn = df[df['type_op'].isnull()][['type_op', 'key']].groupby('key').agg(['count'])

total_xp1 = df[df['type_xp']=='XP1'][['type_xp', 'key']].groupby('key').agg(['count'])
total_xp2 = df[df['type_xp']=='XP2'][['type_xp', 'key']].groupby('key').agg(['count'])
total_xp3 = df[df['type_xp']=='XP3'][['type_xp', 'key']].groupby('key').agg(['count'])
total_xpn = df[df['type_xp'].isnull()][['type_xp', 'key']].groupby('key').agg(['count'])

dn = df.drop_duplicates(subset='key')['key']

pd.merge(dn, total_op1, on='key', how='outer')  # this is the call that fails with KeyError: 'key'
# ...

# the output I want...

# key   TOP1    TOP2    TOP3    TOPN    TXP1    TXP2    TXP3    TXPN    TOTAL
# a     1       1       NaN     1       NaN     2       1       NaN     6
# b     1       1       1       NaN     1       NaN     NaN     2       6
# c     1       NaN     2       NaN     1       NaN     2       1       7
# d     NaN     1       NaN     NaN     NaN     NaN     NaN     NaN     1
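
For reference, here is a minimal sketch of two ways this might be approached. It is not a confirmed answer. The first part turns the group key back into a regular column with reset_index(), so that a merge on 'key' has a column to work with. The second part skips the per-value groupby calls and builds the whole table with pd.crosstab. The helper name count_by_key and the placeholder labels 'OPN' and 'XPN' are assumptions made for illustration. The counts come directly from raw_data, so they will not necessarily match the hand-written table above, and crosstab reports 0 where that table shows NaN.

```python
import pandas as pd

raw_data = {
    'key': ['a', 'a', 'b', 'c', 'c', 'c', 'd', 'a', 'b', 'c'],
    'type_op': ['OP1', 'OP2', 'OP3', 'OP2', 'OP1', 'OP3', 'OP2', None, 'OP1', 'OP3'],
    'type_xp': ['XP2', 'XP2', None, 'XP3', 'XP1', None, 'XP1', 'XP3', None, 'XP3'],
}
df = pd.DataFrame(raw_data)

# Part 1: keep the key as a column so that merge(on='key') can find it.
# groupby moves 'key' into the index; reset_index() moves it back.
total_op1 = (
    df[df['type_op'] == 'OP1']
    .groupby('key')['type_op']
    .count()
    .reset_index(name='TOP1')
)
keys = df[['key']].drop_duplicates()
merged = keys.merge(total_op1, on='key', how='left')


# Part 2: build all the counts at once with a cross-tabulation.
def count_by_key(frame, column, missing_label):
    """Count each value of `column` per key (hypothetical helper).

    Missing values are relabelled as `missing_label` so they get
    their own column instead of being dropped.
    """
    values = frame[column].fillna(missing_label)
    return pd.crosstab(frame['key'], values).add_prefix('T')


op_counts = count_by_key(df, 'type_op', 'OPN')
xp_counts = count_by_key(df, 'type_xp', 'XPN')

# Both tables are indexed by key, so an index join lines them up.
result = op_counts.join(xp_counts)
result['TOTAL'] = result.sum(axis=1)
result = result.reset_index()

print(merged)
print(result)
```

If NaN is preferred over 0 for combinations that never occur, the count columns could be masked before the total is added, for example with result.where(result > 0) applied while 'key' is still the index.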

0 answers:

No answers