I grouped a DataFrame to get this result:
+------+----+-------+
| Type | Nr | Class |
+------+----+-------+
| One | 01 | A1 |
| One | 01 | A2 |
| One | 01 | B1 |
| One | 02 | A1 |
| One | 02 | B1 |
| Two | 01 | A1 |
| Two | 01 | B1 |
| Two | 01 | B2 |
| Two | 02 | A1 |
+------+----+-------+
I did the following to count the number of unique Nr values for each Type:
DFGroup = df.groupby('Type')['Nr'].nunique().reset_index()
This works fine:
+------+----+
| Type | Nr |
+------+----+
| One | 2 |
| Two | 2 |
+------+----+
But now I want to add another column to DFGroup that contains the size of each group, like this:
+------+----+-------+
| Type | Nr | Count |
+------+----+-------+
| One | 2 | 5 |
| Two | 2 | 4 |
+------+----+-------+
I tried:
DFGroup['Count'] = df.groupby('Type').size()
but that only gives me NaN for every group.
Thanks :)
Answer 0 (score: 1)
Use map:
s = df.groupby('Type').size()
DFGroup = df.groupby('Type')['Nr'].nunique().reset_index()
DFGroup['new'] = DFGroup['Type'].map(s)
print(DFGroup)
  Type  Nr  new
0  One   2    5
1  Two   2    4
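The direct assignment in the question yields NaN because pandas aligns a Series on index labels when assigning it as a column: after reset_index(), DFGroup is indexed 0, 1, while the size() result is indexed by Type. A minimal sketch of the difference, assuming df matches the table in the question (the sample data and the names sizes, misaligned and mapped are illustrative, not from the original post):

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed to match the real df).
df = pd.DataFrame({
    'Type': ['One'] * 5 + ['Two'] * 4,
    'Nr': ['01', '01', '01', '02', '02', '01', '01', '01', '02'],
    'Class': ['A1', 'A2', 'B1', 'A1', 'B1', 'A1', 'B1', 'B2', 'A1'],
})

sizes = df.groupby('Type').size()                            # index: 'One', 'Two'
DFGroup = df.groupby('Type')['Nr'].nunique().reset_index()   # index: 0, 1

# Assigning the Series directly aligns on index labels; 0 and 1 never
# match 'One' and 'Two', so every row ends up as NaN.
misaligned = DFGroup.assign(Count=sizes)

# map looks up each value of the Type column in the index of sizes instead.
mapped = DFGroup.assign(Count=DFGroup['Type'].map(sizes))

print(misaligned)
print(mapped)
```

Here map works because it uses the values of the Type column as lookup keys, so the differing indexes no longer matter.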
Better still, use agg with both functions:
DFGroup = df.groupby('Type')['Nr'].agg([('Nr', 'nunique'),('Count','size')]).reset_index()
print(DFGroup)
  Type  Nr  Count
0  One   2      5
1  Two   2      4
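On pandas 0.25 or later, the same result can probably be written with named aggregation, which may read more clearly than the list of tuples. This is a sketch of an alternative, not the answer's original method; the sample df is rebuilt from the question's table and the name result is illustrative:

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed to match the real df).
df = pd.DataFrame({
    'Type': ['One'] * 5 + ['Two'] * 4,
    'Nr': ['01', '01', '01', '02', '02', '01', '01', '01', '02'],
    'Class': ['A1', 'A2', 'B1', 'A1', 'B1', 'A1', 'B1', 'B2', 'A1'],
})

# Named aggregation: each keyword is an output column name,
# each value is the aggregation applied to the selected 'Nr' column.
result = (
    df.groupby('Type')['Nr']
      .agg(Nr='nunique', Count='size')
      .reset_index()
)

print(result)
```

Both aggregations run in a single groupby pass, so no separate map or merge step is needed.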