I have a stream of IP traffic that I need to split up and process by IP address. I have started with the following function:
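    def address_streams(packet_stream):
        # Collect the distinct addresses seen in the stream
        addresses = packet_stream.Addr.unique()
        for address in addresses:
            print(address)
            # Rows for this address; named `subset` to avoid shadowing the built-in `filter`
            subset = packet_stream[packet_stream.Addr == address]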
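How can I return these sub-DataFrames so that each one can be processed separately?
Answer 0: (score: 1)
I think you need to process each group in GroupBy.apply: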
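    def func(subdf):
        # Called once per address with that address's rows
        print(subdf)
        # Add a new column; work on a copy so the input is not modified in place
        subdf = subdf.copy()
        subdf['new'] = 1
        return subdf

    # group_keys=False keeps the original index instead of adding 'Addr' as an index level
    packet_stream = packet_stream.groupby('Addr', group_keys=False).apply(func)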
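As a rough illustration of per-group processing with apply, the sketch below computes each packet's share of the bytes sent by its address. The sample data, the `Bytes` and `Share` columns, and the `add_share` helper are all hypothetical; only the `Addr` column name comes from the question. Recent pandas versions warn when apply operates on the grouping column itself, so this sketch selects the data column explicitly and joins the result back on the index.

```python
import pandas as pd

# Hypothetical sample data; only the 'Addr' column name comes from the question.
packet_stream = pd.DataFrame({
    "Addr": ["10.0.0.1", "10.0.0.2", "10.0.0.1"],
    "Bytes": [100, 250, 300],
})

def add_share(subdf):
    # Work on a copy so the group passed in by pandas is not modified in place
    subdf = subdf.copy()
    # Fraction of this address's total bytes carried by each packet
    subdf["Share"] = subdf["Bytes"] / subdf["Bytes"].sum()
    return subdf

# Select the data column explicitly so apply does not touch the grouping column
per_group = packet_stream.groupby("Addr", group_keys=False)[["Bytes"]].apply(add_share)

# Join the new column back onto the original rows by index
result = packet_stream.join(per_group[["Share"]])
print(result)
```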
EDIT: To loop over each group, use:
    for name, subdf in packet_stream.groupby('Addr'):
        print(name, subdf)
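One way to hand the subsets back to the caller, which is what the question asks for, is to turn the function into a generator built on this loop. This is only a sketch: the sample data and the `Bytes` column are made up for illustration, and `address_streams` is reused from the question with a changed body.

```python
import pandas as pd

# Hypothetical sample data; only the 'Addr' column name comes from the question.
packet_stream = pd.DataFrame({
    "Addr": ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3"],
    "Bytes": [100, 250, 40, 75],
})

def address_streams(packet_stream):
    """Yield (address, sub-DataFrame) pairs, one per distinct address."""
    for address, subdf in packet_stream.groupby("Addr"):
        yield address, subdf

# The caller can now process each subset on its own, here by summing bytes per address
totals = {
    address: int(subdf["Bytes"].sum())
    for address, subdf in address_streams(packet_stream)
}
print(totals)
```

A generator avoids holding every subset in memory at once, which may matter if the stream is large.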
EDIT1: To convert the groups to a dictionary, use:
    # Keys are the addresses, values are the matching sub-DataFrames
    d = dict(tuple(packet_stream.groupby('Addr')))
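A short sketch of how such a dictionary might be used afterwards, assuming made-up sample data with a hypothetical `Port` column: each key is an address and each value is the sub-DataFrame holding that address's rows.

```python
import pandas as pd

# Hypothetical sample data; only the 'Addr' column name comes from the question.
packet_stream = pd.DataFrame({
    "Addr": ["10.0.0.1", "10.0.0.2", "10.0.0.1"],
    "Port": [80, 443, 22],
})

# Iterating a GroupBy yields (key, sub-DataFrame) pairs, which dict() accepts directly
groups = dict(tuple(packet_stream.groupby("Addr")))

# Look up the rows for a single address and process them separately
one_stream = groups["10.0.0.1"]
ports = one_stream["Port"].tolist()
print(sorted(groups))
print(ports)
```

If only one address is needed, `packet_stream.groupby("Addr").get_group("10.0.0.1")` returns the same subset without building the whole dictionary.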