Stacking rows with a common column value in pandas

Time: 2017-03-10 11:06:20

Tags: python pandas

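How can I put rows that share the same Time_req next to each other, instead of having the output grouped by ErrorCode? Currently I get all of the ErrorCode 0 rows first, followed by all of the ErrorCode 1 rows, as shown below: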

>>> data.groupby([data['ErrorCode'], pd.Grouper(freq='15T')])['latency'].describe().unstack().reset_index()
    ErrorCode            Time_req  count           mean             std  \
0           0 2017-03-08 04:30:00      1  603034.000000             NaN   
1           0 2017-03-08 04:45:00      2  174720.000000    38101.741797   
2           0 2017-03-08 05:00:00      2  674942.500000   786118.185810   
3           0 2017-03-08 07:45:00     10  266653.200000   165867.496817   
4           0 2017-03-08 08:00:00     23  208949.304348   124902.942685   
5           0 2017-03-08 08:15:00     31  247282.064516   181780.519320   
6           0 2017-03-08 08:30:00     35  249332.857143   340084.918015   
7           0 2017-03-08 08:45:00      7  250066.000000   195051.871617   
8           1 2017-03-08 04:45:00      4  227747.500000   148185.181566   
9           1 2017-03-08 05:00:00      2  126633.000000     1337.846030   
10          1 2017-03-08 07:45:00     10  421781.900000   464249.118555   
11          1 2017-03-08 08:00:00     22  188122.272727    82110.336132   
12          1 2017-03-08 08:15:00     32  294896.968750   229498.560222   
13          1 2017-03-08 08:30:00     35  501679.628571  1353873.878385   
14          1 2017-03-08 08:45:00      6  531606.000000   582290.903396

But I need output like the following instead, with a NaN row wherever an ErrorCode has no data for a time bin:

ErrorCode Time_req count
0 2017-03-08 04:30:00      1
1 NaN        NaN           NaN
0 2017-03-08 04:45:00      2
1 2017-03-08 04:45:00      4
AND SO ON

1 answer:

Answer 0 (score: 1)

I think you need stack and unstack to add rows for the missing ErrorCode/Time_req combinations:

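# Describe latency per ErrorCode and 15-minute bin of the DatetimeIndex (Time_req)
df = data.groupby([data['ErrorCode'], pd.Grouper(freq='15T')])['latency'].describe()
# Move ErrorCode to the columns, stack it back while keeping NaN rows, then spread the statistics into columns
df = df.unstack(0).stack(dropna=False).unstack(1).reset_index()
print(df)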
               Time_req  ErrorCode  count           mean           std
0   2017-03-08 04:30:00          0    1.0  603034.000000           NaN
1   2017-03-08 04:30:00          1    NaN            NaN           NaN
2   2017-03-08 04:45:00          0    2.0  174720.000000  3.810174e+04
3   2017-03-08 04:45:00          1    4.0  227747.500000  1.481852e+05
4   2017-03-08 05:00:00          0    2.0  674942.500000  7.861182e+05
5   2017-03-08 05:00:00          1    2.0  126633.000000  1.337846e+03
6   2017-03-08 07:45:00          0   10.0  266653.200000  1.658675e+05
7   2017-03-08 07:45:00          1   10.0  421781.900000  4.642491e+05
8   2017-03-08 08:00:00          0   23.0  208949.304348  1.249029e+05
9   2017-03-08 08:00:00          1   22.0  188122.272727  8.211034e+04
10  2017-03-08 08:15:00          0   31.0  247282.064516  1.817805e+05
11  2017-03-08 08:15:00          1   32.0  294896.968750  2.294986e+05
12  2017-03-08 08:30:00          0   35.0  249332.857143  3.400849e+05
13  2017-03-08 08:30:00          1   35.0  501679.628571  1.353874e+06
14  2017-03-08 08:45:00          0    7.0  250066.000000  1.950519e+05
15  2017-03-08 08:45:00          1    6.0  531606.000000  5.822909e+05
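
The unstack/stack chain above relies on describe() returning a long Series, which was the behaviour when this answer was written. On more recent pandas versions describe() on a grouped Series returns a DataFrame, and the dropna argument of stack is being phased out, so the chain may need adjusting. As an alternative sketch (not the method from the answer), the same "fill in the missing combinations" idea can be expressed by reindexing against the full product of time bins and error codes. The sample data here is made up for illustration; only the column names ErrorCode, latency and Time_req come from the question.

```python
import pandas as pd

# Hypothetical sample data: a DatetimeIndex named Time_req plus the
# ErrorCode and latency columns from the question.
idx = pd.to_datetime([
    "2017-03-08 04:31:00",
    "2017-03-08 04:46:00",
    "2017-03-08 04:50:00",
    "2017-03-08 04:52:00",
])
data = pd.DataFrame(
    {"ErrorCode": [0, 0, 1, 1], "latency": [100.0, 200.0, 300.0, 500.0]},
    index=idx,
)
data.index.name = "Time_req"

# Group by 15-minute bin first, then by ErrorCode, so rows for the same
# bin end up next to each other. Only observed combinations appear here.
stats = (
    data.groupby([pd.Grouper(freq="15min"), "ErrorCode"])["latency"]
    .agg(["count", "mean", "std"])
)

# Build every (Time_req, ErrorCode) pair from the observed values and
# reindex, which inserts NaN rows for the combinations that had no data.
full_index = pd.MultiIndex.from_product(
    [
        stats.index.get_level_values("Time_req").unique(),
        stats.index.get_level_values("ErrorCode").unique(),
    ],
    names=["Time_req", "ErrorCode"],
)
result = stats.reindex(full_index).reset_index()
print(result)
```

In this sketch the 04:30 bin has no ErrorCode 1 rows, so the reindex adds one NaN row for it. If a count of 0 is preferred over NaN, something like `result["count"].fillna(0)` could be applied afterwards.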