Evening,
My data:
display(dfRFQ_Breakdown_By_Done_Traded_Away_Grp.sort_values('security_type1',ascending=True))
state security_type1 count
0 Done CORP 239
4 Tied Done CORP 9
6 Tied Traded Away CORP 7
9 Traded Away CORP 1075
1 Done GOVT 40
5 Tied Done GOVT 2
7 Tied Traded Away GOVT 16
10 Traded Away GOVT 150
2 Done MTGE 4
8 Tied Traded Away MTGE 3
11 Traded Away MTGE 7
3 Done SUPRA 31
12 Traded Away SUPRA 88
I want to group all rows as either 'Done' or 'Traded Away' for each security_type1:
state security_type1 count
Done CORP 248
Traded Away CORP 1082
Done GOVT 42
Traded Away GOVT 166
Done MTGE 4
Traded Away MTGE 10
Done SUPRA 31
Traded Away SUPRA 88
My code:
# Updating any Tied Done to Done and Tied Traded Away to Traded Away
mask = (dfRFQ_Breakdown_By_Done_Traded_Away_Grp['state'].str.contains('Tied Done'))
dfRFQ_Breakdown_By_Done_Traded_Away_Grp.loc[mask, 'state'] = 'Done'
mask = (dfRFQ_Breakdown_By_Done_Traded_Away_Grp['state'].str.contains('Tied Traded Away'))
dfRFQ_Breakdown_By_Done_Traded_Away_Grp.loc[mask, 'state'] = 'Traded Away'
display(dfRFQ_Breakdown_By_Done_Traded_Away_Grp.sort_values('security_type1',ascending=True))
It looks like pandas still keeps the updated strings as separate rows:
state security_type1 count
Done CORP 239
Done CORP 9
Traded Away CORP 7
Traded Away CORP 1075
Done GOVT 40
Done GOVT 2
Traded Away GOVT 16
Traded Away GOVT 150
Done MTGE 4
Traded Away MTGE 3
Traded Away MTGE 7
Done SUPRA 31
Traded Away SUPRA 88
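Why does pandas not combine the Done and Traded Away rows? Do I need to create another copy of the DataFrame? It almost looks as if pandas keeps a link to the old values from before the update.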
Answer 0: (score: 1)
This seems to be achievable with query, groupby, and sort_values:
res = df.query('(state == "Done") | (state == "TradedAway")')\
.groupby(['state', 'security_type1'], as_index=False)['count'].sum()\
.sort_values(['security_type1', 'state'])
print(res)
state security_type1 count
0 Done CORP 239
4 TradedAway CORP 1075
1 Done GOVT 40
5 TradedAway GOVT 150
2 Done MTGE 4
6 TradedAway MTGE 7
3 Done SUPRA 31
7 TradedAway SUPRA 88
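On the "why" part of the question: assigning new labels with .loc only changes the strings in the state column. It does not re-run any aggregation, so rows that now share a label stay separate until they are grouped and summed again; no extra copy of the DataFrame should be needed. Below is a minimal sketch of that idea, assuming the state labels contain a space ('Traded Away') as in the question. The frame df is rebuilt here from the counts shown in the question, and the regex-based relabeling is a swapped-in shortcut for the two masks, not the original poster's code.

```python
import pandas as pd

# Sample frame rebuilt from the counts shown in the question (illustrative only).
df = pd.DataFrame({
    "state": ["Done", "Done", "Done", "Done",
              "Tied Done", "Tied Done",
              "Tied Traded Away", "Tied Traded Away", "Tied Traded Away",
              "Traded Away", "Traded Away", "Traded Away", "Traded Away"],
    "security_type1": ["CORP", "GOVT", "MTGE", "SUPRA",
                       "CORP", "GOVT",
                       "CORP", "GOVT", "MTGE",
                       "CORP", "GOVT", "MTGE", "SUPRA"],
    "count": [239, 40, 4, 31, 9, 2, 7, 16, 3, 1075, 150, 7, 88],
})

# Strip a leading "Tied " so tied rows share a label with the untied ones.
df["state"] = df["state"].str.replace(r"^Tied ", "", regex=True)

# Relabeling alone does not merge rows; aggregate again to sum the counts.
res = (
    df.groupby(["security_type1", "state"], as_index=False)["count"]
      .sum()[["state", "security_type1", "count"]]
)
print(res)
```

With the question's numbers this should give 248 and 1082 for CORP, 42 and 166 for GOVT, 4 and 10 for MTGE, and 31 and 88 for SUPRA, which matches the desired table in the question.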