Python: Add a column with the size of each group to a groupby result

Date: 2018-04-26 09:43:16

Tags: python pandas pandas-groupby

I grouped a DataFrame and got this result:

+------+----+-------+
| Type | Nr | Class |
+------+----+-------+
| One  | 01 | A1    |
| One  | 01 | A2    |
| One  | 01 | B1    |
| One  | 02 | A1    |
| One  | 02 | B1    |
| Two  | 01 | A1    |
| Two  | 01 | B1    |
| Two  | 01 | B2    |
| Two  | 02 | A1    |
+------+----+-------+

To count the unique Nr values for each Type, I did the following:

DFGroup = df.groupby('Type')['Nr'].nunique().reset_index()

This works fine:

+------+----+
| Type | Nr |
+------+----+
| One  |  2 |
| Two  |  2 |
+------+----+

Now I want to add another column to DFGroup that contains the size of each group, like this:

+------+----+-------+
| Type | Nr | Count |
+------+----+-------+
| One  |  2 |     5 |
| Two  |  2 |     4 |
+------+----+-------+

I tried:

DFGroup['Count'] = df.groupby('Type').size()

But this gives me only NaN for every group.

Thanks :)

1 answer:

Answer 0 (score: 1)

Use map:

# Size of each Type group, indexed by Type
s = df.groupby('Type').size()
DFGroup = df.groupby('Type')['Nr'].nunique().reset_index()
# Look up each Type value in s to fill the new column
DFGroup['new'] = DFGroup['Type'].map(s)

print(DFGroup)
  Type  Nr  new
0  One   2    5
1  Two   2    4
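
The direct assignment in the question most likely returned NaN because pandas aligns on index labels: after reset_index(), DFGroup has the integer index 0, 1, while the Series returned by size() is indexed by the Type values, so no labels match. A minimal sketch that reproduces this, assuming the sample data is rebuilt by hand from the table in the question (the question does not show how df was created, and the column name Count is taken from the desired output):

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    'Type': ['One'] * 5 + ['Two'] * 4,
    'Nr': ['01', '01', '01', '02', '02', '01', '01', '01', '02'],
    'Class': ['A1', 'A2', 'B1', 'A1', 'B1', 'A1', 'B1', 'B2', 'A1'],
})

DFGroup = df.groupby('Type')['Nr'].nunique().reset_index()
s = df.groupby('Type').size()

print(DFGroup.index.tolist())  # integer positions after reset_index()
print(s.index.tolist())        # Type labels

# Direct assignment aligns on index labels; none match, so every value is NaN.
misaligned = DFGroup.assign(Count=s)
print(misaligned['Count'].isna().all())

# map looks up each Type value in s, so the labels line up.
DFGroup['Count'] = DFGroup['Type'].map(s)
print(DFGroup)
```

An alternative along the same lines would be to skip reset_index() until both Series have been combined, since they share the Type index up to that point.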

Since there are two functions, it is better to use agg:

DFGroup = df.groupby('Type')['Nr'].agg([('Nr', 'nunique'),('Count','size')]).reset_index()
print(DFGroup)

  Type  Nr  Count
0  One   2      5
1  Two   2      4
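
On pandas 0.25 or later, the same result can also be written with named aggregation, where the output column names are passed as keyword arguments. This is only a sketch of that alternative spelling, again assuming the sample data is rebuilt by hand from the table in the question:

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    'Type': ['One'] * 5 + ['Two'] * 4,
    'Nr': ['01', '01', '01', '02', '02', '01', '01', '01', '02'],
    'Class': ['A1', 'A2', 'B1', 'A1', 'B1', 'A1', 'B1', 'B2', 'A1'],
})

# Each keyword names an output column; its value is the aggregation to apply.
DFGroup = (
    df.groupby('Type')['Nr']
      .agg(Nr='nunique', Count='size')
      .reset_index()
)
print(DFGroup)
```

The keyword form keeps the column names next to the functions that produce them, which may read more clearly than a list of tuples.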