I have a DataFrame like this:
ip_src ip_dst ip_proto frame_time_delta payload_size
192.168.1.101 31.13.94.53 17.0 0.000000 172.0
31.13.94.53 192.168.1.101 17.0 0.006656 176.0
192.168.1.101 31.13.94.53 17.0 0.012948 172.0
Then I apply a groupby with an aggregation over some of the columns:
aggregation = {
'payload_size': {
'mean_payload_size': 'mean',
'std_payload_size': 'std',
'var_payload_size': 'var',
'max_payload_size': 'max',
'min_payload_size': 'min',
'quantity': 'count'
},
'frame_time_delta': {
'mean_frame_time_delta': 'mean',
'sd_frame_time_delta': 'std',
'var_frame_time_delta': 'var',
}
}
df = df.groupby(by=['ip_src', 'ip_dst'],as_index=False,).agg(aggregation)
But the resulting column names are not what I want. This is what I get:
ip_src,ip_dst,payload_size,payload_size,payload_size,payload_size,payload_size,payload_size,frame_time_delta,frame_time_delta,frame_time_delta,.....
This happens even though I specified the names in the aggregation dictionary.
How can I fix this?
Thanks!
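Answer 0 (score: 1)
Since renaming through a dictionary in agg is deprecated, we can let agg create a MultiIndex on the columns and then flatten it to a single level.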
aggregation = {
'payload_size': [
'mean',
'std',
'var',
'max',
'min',
'count'
],
'frame_time_delta': [
'mean',
'std',
'var',
]
}
df_out = df.groupby(by=['ip_src', 'ip_dst']).agg(aggregation)
df_out.columns = df_out.columns.map('{0[1]}_{0[0]}'.format)
print(df_out.reset_index())
Output:
ip_src ip_dst mean_payload_size std_payload_size var_payload_size max_payload_size min_payload_size count_payload_size mean_frame_time_delta std_frame_time_delta var_frame_time_delta
0 192.168.1.101 31.13.94.53 172.0 0.0 0.0 172.0 172.0 2 0.006474 0.009156 0.000084
1 31.13.94.53 192.168.1.101 176.0 NaN NaN 176.0 176.0 1 0.006656 NaN NaN
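To see what the flattening step does in isolation, here is a minimal sketch. It assumes a hand-built MultiIndex (hypothetical, not taken from the data above) shaped like the columns that agg returns for a dict of lists. Each column label is a tuple of (column, function), and the format string puts the function name first.

```python
import pandas as pd

# Hypothetical two-level columns, shaped like the result of .agg() with a dict of lists
columns = pd.MultiIndex.from_tuples([
    ('payload_size', 'mean'),
    ('payload_size', 'std'),
    ('frame_time_delta', 'mean'),
])

# Each label is passed to str.format as argument 0, so {0[1]} is the
# function name and {0[0]} is the original column name
flat = columns.map('{0[1]}_{0[0]}'.format)
flat_list = list(flat)
print(flat_list)
# ['mean_payload_size', 'std_payload_size', 'mean_frame_time_delta']
```

Swapping the indices in the format string ('{0[0]}_{0[1]}') would give names such as payload_size_mean instead.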
If you want to shorten the column names further, you can use replace:
aggregation = {
'payload_size': [
'mean',
'std',
'var',
'max',
'min',
'count'
],
'frame_time_delta': [
'mean',
'std',
'var',
]
}
df_out = df.groupby(by=['ip_src', 'ip_dst']).agg(aggregation)
df_out.columns = df_out.columns.map('{0[1]}_{0[0]}'.format)
df_out = df_out.rename(columns=lambda x: x.replace('payload_size','PLS').replace('frame_time_delta','FTD'))
print(df_out.reset_index())
Output:
ip_src ip_dst mean_PLS std_PLS var_PLS max_PLS min_PLS count_PLS mean_FTD std_FTD var_FTD
0 192.168.1.101 31.13.94.53 172.0 0.0 0.0 172.0 172.0 2 0.006474 0.009156 0.000084
1 31.13.94.53 192.168.1.101 176.0 NaN NaN 176.0 176.0 1 0.006656 NaN NaN
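As an alternative, and assuming pandas 0.25 or later, named aggregation lets you choose the output names directly, so no flattening is needed. The sketch below rebuilds the three sample rows from the question by hand; the output names are the ones from the question's original dictionary, and df_named is just an illustrative variable name.

```python
import pandas as pd

# The three sample rows from the question, rebuilt by hand
df = pd.DataFrame({
    'ip_src': ['192.168.1.101', '31.13.94.53', '192.168.1.101'],
    'ip_dst': ['31.13.94.53', '192.168.1.101', '31.13.94.53'],
    'ip_proto': [17.0, 17.0, 17.0],
    'frame_time_delta': [0.000000, 0.006656, 0.012948],
    'payload_size': [172.0, 176.0, 172.0],
})

# Named aggregation: output_name=(input_column, function)
df_named = df.groupby(['ip_src', 'ip_dst']).agg(
    mean_payload_size=('payload_size', 'mean'),
    std_payload_size=('payload_size', 'std'),
    var_payload_size=('payload_size', 'var'),
    max_payload_size=('payload_size', 'max'),
    min_payload_size=('payload_size', 'min'),
    quantity=('payload_size', 'count'),
    mean_frame_time_delta=('frame_time_delta', 'mean'),
    sd_frame_time_delta=('frame_time_delta', 'std'),
    var_frame_time_delta=('frame_time_delta', 'var'),
).reset_index()

print(df_named)
```

The result has a single level of column names, in the order the keyword arguments were given, after the two grouping keys.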