Count categorical values and add the result as a column to an existing DataFrame

Time: 2015-10-26 12:50:39

Tags: python pandas count group-by dataframe

I am trying to count, for each session, how often each period occurs in an existing DataFrame:

session       time        date      period
   1         05:51:53   2015-05-22  night
   1         05:52:59   2015-05-22  night
   1         06:08:24   2015-05-22  night
   1         06:09:06   2015-05-22  night
   1         08:25:31   2015-05-22  morning
   2         08:25:35   2015-05-22  morning
   2         08:26:37   2015-05-22  morning
   2         08:27:11   2015-05-22  morning
   2         12:33:17   2015-05-22  noon
   3         12:33:45   2015-05-22  noon

to get something like this:

session       time        date      period    frequency
   1         05:51:53   2015-05-22  night        4
   1         05:52:59   2015-05-22  night
   1         06:08:24   2015-05-22  night
   1         06:09:06   2015-05-22  night
   1         08:25:31   2015-05-22  morning      1
   2         08:25:35   2015-05-22  morning      3
   2         08:26:37   2015-05-22  morning
   2         08:27:11   2015-05-22  morning
   2         12:33:17   2015-05-22  noon         1
   3         12:33:45   2015-05-22  noon         1

I am using this approach:

 df['frequency'] = df.groupby('session', as_index=False)['period'].apply(lambda x: x.value_counts())

but I get this error: TypeError: incompatible index of inserted column with frame index

If I apply .value_counts directly to the groupby:

 df['frequency'] = df.groupby('session', as_index=False)['period'].value_counts()

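I get an error saying that the groupby object has no attribute value_counts.

Could you tell me how to count these categorical values and add the result as a column to the existing DataFrame at the same time? (I thought as_index=False would take care of this, but apparently it does not.)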

1 Answer:

Answer 0: (score: 0)

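You can group by 'session' and 'period' and take the size of each group. transform returns a result aligned with the original index, so it can be assigned directly as a new column: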

In [19]: df['freq'] = df.groupby(['session', 'period'])['date'].transform(len)

In [20]: df
Out[20]:
   session      time        date   period freq
0        1  05:51:53  2015-05-22    night    4
1        1  05:52:59  2015-05-22    night    4
2        1  06:08:24  2015-05-22    night    4
3        1  06:09:06  2015-05-22    night    4
4        1  08:25:31  2015-05-22  morning    1
5        2  08:25:35  2015-05-22  morning    3
6        2  08:26:37  2015-05-22  morning    3
7        2  08:27:11  2015-05-22  morning    3
8        2  12:33:17  2015-05-22     noon    1
9        3  12:33:45  2015-05-22     noon    1
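
For reference, here is a self-contained sketch of the same idea that can be run outside an interactive session. It rebuilds the sample data from the question, uses the string alias "size" in place of len (assuming a reasonably recent pandas version), and then optionally blanks out the repeated counts so that only the first row of each group carries a value, as in the desired output. The frequency column and the first_in_group name are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "session": [1, 1, 1, 1, 1, 2, 2, 2, 2, 3],
    "time": ["05:51:53", "05:52:59", "06:08:24", "06:09:06", "08:25:31",
             "08:25:35", "08:26:37", "08:27:11", "12:33:17", "12:33:45"],
    "date": ["2015-05-22"] * 10,
    "period": ["night"] * 4 + ["morning"] * 4 + ["noon"] * 2,
})

# Size of each (session, period) group, broadcast back onto every row.
df["freq"] = df.groupby(["session", "period"])["period"].transform("size")

# Optional: keep the count only on the first row of each group.
# Rows that repeat an earlier (session, period) pair become NaN,
# which also turns the column into floats.
first_in_group = ~df.duplicated(["session", "period"])
df["frequency"] = df["freq"].where(first_in_group)

print(df)
```

Keeping the count on every row (the freq column) is usually easier to filter and aggregate on later; the masked frequency column is mainly useful for display.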