I have a DataFrame like this:
      name  visit  foo
0   andrew     BL    a
1   andrew     BL    a
2   andrew     BL    b
3   andrew     BL    b
4      bob     BL    c
5      bob     BL    c
6      bob     BL    d
7      bob     BL    d
8      bob    M12    e
9      bob    M12    e
10     bob    M12    f
11     bob    M12    g
12   carol     BL    h
13   carol     BL    i
14   carol     BL    j
15   carol     BL    k
How can I create a new column that enumerates the distinct foo values within each ['name', 'visit'] group, as shown below?
      name  visit  foo  enum
0   andrew     BL    a     1
1   andrew     BL    a     1
2   andrew     BL    b     2
3   andrew     BL    b     2
4      bob     BL    c     1
5      bob     BL    c     1
6      bob     BL    d     2
7      bob     BL    d     2
8      bob    M12    e     1
9      bob    M12    e     1
10     bob    M12    f     2
11     bob    M12    g     3
12   carol     BL    h     1
13   carol     BL    i     2
14   carol     BL    j     3
15   carol     BL    k     4
Answer 0 (score: 2)
df['enum'] = df.groupby(['name', 'visit'])['foo'].transform(lambda x: pd.factorize(x)[0] + 1)
print(df)
      name  visit  foo  enum
0   andrew     BL    a     1
1   andrew     BL    a     1
2   andrew     BL    b     2
3   andrew     BL    b     2
4      bob     BL    c     1
5      bob     BL    c     1
6      bob     BL    d     2
7      bob     BL    d     2
8      bob    M12    e     1
9      bob    M12    e     1
10     bob    M12    f     2
11     bob    M12    g     3
12   carol     BL    h     1
13   carol     BL    i     2
14   carol     BL    j     3
15   carol     BL    k     4
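For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is my own reconstruction of the sample data in the question, and the small `pd.factorize` call on a standalone Series is only there to illustrate what the lambda does inside each group: `factorize` labels values 0, 1, 2, ... in order of first appearance, so adding 1 gives a 1-based counter.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about how it was created).
df = pd.DataFrame({
    "name": ["andrew"] * 4 + ["bob"] * 8 + ["carol"] * 4,
    "visit": ["BL"] * 8 + ["M12"] * 4 + ["BL"] * 4,
    "foo": list("aabbccddeefghijk"),
})

# What factorize does on a single group: integer codes in order of first appearance.
codes, uniques = pd.factorize(pd.Series(["e", "e", "f", "g"]))
print(codes)          # [0 0 1 2]
print(list(uniques))  # ['e', 'f', 'g']

# Apply the same idea per (name, visit) group; transform keeps the original index.
df["enum"] = df.groupby(["name", "visit"])["foo"].transform(
    lambda s: pd.factorize(s)[0] + 1
)
print(df)
```

Because `transform` returns a result aligned to the original index, the new column can be assigned directly without any extra merge or concat step.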
Answer 1 (score: 1)
You can adapt coldspeed's comment to use:
enum = (df.groupby([df.name, df.visit])
          .apply(lambda g: g.groupby('foo').ngroup() + 1)
          .reset_index()
          .rename(columns={0: 'enum'})['enum'])
df = pd.concat([df, enum], axis=1)
>>> df
      name  visit  foo  enum
0   andrew     BL    a     1
1   andrew     BL    a     1
2   andrew     BL    b     2
3   andrew     BL    b     2
4      bob     BL    c     1
5      bob     BL    c     1
6      bob     BL    d     2
7      bob     BL    d     2
8      bob    M12    e     1
9      bob    M12    e     1
10     bob    M12    f     2
11     bob    M12    g     3
12   carol     BL    h     1
13   carol     BL    i     2
14   carol     BL    j     3
15   carol     BL    k     4
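One caveat with the concat approach: after `reset_index()` the new column is matched to `df` by position-like integer labels, so it appears to rely on the rows already being ordered by group with a default 0..n-1 index. The sketch below is a possible variant, not the answerer's method, that keeps the `ngroup` idea but goes through `transform` so the result stays aligned to the original index. The sample data and the shuffling step are my own additions for illustration. Note that `ngroup` with the default `sort=True` numbers values in sorted order rather than order of first appearance; for this data the two coincide.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about how it was created).
df = pd.DataFrame({
    "name": ["andrew"] * 4 + ["bob"] * 8 + ["carol"] * 4,
    "visit": ["BL"] * 8 + ["M12"] * 4 + ["BL"] * 4,
    "foo": list("aabbccddeefghijk"),
})

# Shuffle the rows to show the result does not depend on row order.
shuffled = df.sample(frac=1, random_state=0)

# Within each (name, visit) group, number the distinct foo values starting at 1.
# s.groupby(s).ngroup() labels each distinct value by its sorted position.
result = shuffled.assign(
    enum=shuffled.groupby(["name", "visit"])["foo"].transform(
        lambda s: s.groupby(s).ngroup() + 1
    )
)

# Restore the original row order for display.
print(result.sort_index())
```

If order of first appearance matters more than sorted order, passing `sort=False` to the inner `groupby` should give that behaviour instead.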