Going from a pandas crosstab back to a flat series

Time: 2015-12-22 23:00:46

Tags: python pandas

I have a time series in a pandas DataFrame (a list of IP connections, each with two IP addresses):

import pandas as pd

data = [["192.168.1.199", "192.168.1.22"], ["192.168.1.199", "192.168.1.22"], ["192.168.1.1", "192.168.1.2"], ["192.168.1.1", "192.168.1.2"], ["192.168.1.1", "192.168.1.2"], ["192.168.1.1", "192.168.1.3"], ["192.168.1.1", "192.168.1.99"]]
df = pd.DataFrame(data)
flows = pd.crosstab(df[0], df[1])

flows is:

1              192.168.1.2  192.168.1.22  192.168.1.3  192.168.1.99
0                                                                  
192.168.1.1              3             0            1             1
192.168.1.199            0             2            0             0 

But I would like to get:

        count         dstip          srcip
     0      3   192.168.1.2    192.168.1.1
     1      1   192.168.1.3    192.168.1.1
     2      1  192.168.1.99    192.168.1.1
     3      2  192.168.1.22  192.168.1.199

Thanks for your help. The goal is to feed the result into a plotting library.

1 answer:

Answer 0 (score: 2)

This works:

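# Count rows per (destination, source) pair; the index levels are named 1 and 0.
s = df.groupby([1, 0])[1].count()
# droplevel matches level names before positions: droplevel(0) removes the
# level named 0 (source), leaving the destination, and vice versa.
flows = pd.DataFrame({'count': s.values, 'dstip': s.index.droplevel(0),
                      'srcip': s.index.droplevel(1)})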

>> flows

   count         dstip          srcip
0      3   192.168.1.2    192.168.1.1
1      2  192.168.1.22  192.168.1.199
2      1   192.168.1.3    192.168.1.1
3      1  192.168.1.99    192.168.1.1
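
Since the title asks about going from the crosstab itself back to a flat table, here is a minimal sketch of two other routes, not part of the original answer. It assumes the columns are named srcip and dstip (names taken from the desired output in the question); the variable names flat_from_crosstab and flat_from_groupby are purely illustrative.

```python
import pandas as pd

data = [
    ["192.168.1.199", "192.168.1.22"],
    ["192.168.1.199", "192.168.1.22"],
    ["192.168.1.1", "192.168.1.2"],
    ["192.168.1.1", "192.168.1.2"],
    ["192.168.1.1", "192.168.1.2"],
    ["192.168.1.1", "192.168.1.3"],
    ["192.168.1.1", "192.168.1.99"],
]
# Assumption: name the columns up front so the flat result gets readable labels.
df = pd.DataFrame(data, columns=["srcip", "dstip"])

# Route 1: flatten an existing crosstab.
crosstab = pd.crosstab(df["srcip"], df["dstip"])
stacked = crosstab.stack()          # Series indexed by (srcip, dstip)
stacked = stacked[stacked > 0]      # drop pairs that never occur
flat_from_crosstab = stacked.reset_index(name="count")

# Route 2: skip the crosstab and count the pairs directly.
flat_from_groupby = (
    df.groupby(["srcip", "dstip"]).size().reset_index(name="count")
)

print(flat_from_crosstab)
print(flat_from_groupby)
```

Both routes should yield one row per observed (srcip, dstip) pair with its count. Route 2 avoids building the wide table at all, which may matter when there are many distinct addresses.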