user_login login_type login_time
0 a 0 14:00:00
1 b 0 08:20:03
2 c 1 09:10:03
3 b 1 10:49:03
4 a 1 11:19:03
5 a 1 12:29:03
6 c 0 13:39:03
7 c 1 14:49:03
I have df1 (shown above). I want to find the second occurrence of each user_login and, if the corresponding value in login_type is 1, put that row's login_time into a new column, 2nd_login_time.
The final result should look like this:
user_login login_type login_time 2nd_login_time
a 0 14:00:00 No 2nd login_time
b 0 8:20:03 No 2nd login_time
c 1 9:10:03 No 2nd login_time
b 1 10:49:03 10:49:03
a 1 11:19:03 11:19:03
a 1 12:29:03 No 2nd login_time
c 0 13:39:03 13:39:03
c 1 14:49:03 No 2nd login_time
How can I achieve this in pandas?
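Answer 0 (score: 0)
Use cumcount to get the position of each row within its user_login group, and chain that with the condition on login_type. Then set the new values with loc: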
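# cumcount() numbers rows within each group from 0, so == 1 marks the second occurrence
m = (df.groupby('user_login').cumcount() == 1) & (df['login_type'] == 1)
# rows where the mask is False stay NaN in the new column
df.loc[m, 'new'] = df['login_time']
print(df)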
user_login login_type login_time new
0 a 0 14:00:00 NaN
1 b 0 08:20:03 NaN
2 c 1 09:10:03 NaN
3 b 1 10:49:03 10:49:03
4 a 1 11:19:03 11:19:03
5 a 1 12:29:03 NaN
6 c 0 13:39:03 NaN
7 c 1 14:49:03 NaN
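For anyone who wants to run the snippet above end to end, here is a minimal sketch that rebuilds df1 from the question and applies the same mask. It assumes login_time is stored as plain strings; the question does not say what dtype the column actually has.

```python
import pandas as pd

# Rebuild df1 from the question. Keeping login_time as strings is an assumption.
df = pd.DataFrame({
    "user_login": ["a", "b", "c", "b", "a", "a", "c", "c"],
    "login_type": [0, 0, 1, 1, 1, 1, 0, 1],
    "login_time": ["14:00:00", "08:20:03", "09:10:03", "10:49:03",
                   "11:19:03", "12:29:03", "13:39:03", "14:49:03"],
})

# Position of each row inside its user_login group (0-based), combined
# with the login_type condition.
m = (df.groupby("user_login").cumcount() == 1) & (df["login_type"] == 1)

# Only the masked rows receive a value; all other rows are left as NaN.
df.loc[m, "new"] = df["login_time"]
print(df)
```

Note that cumcount follows the current row order of the frame, so "second occurrence" here means second in row order, not necessarily second by clock time.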
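If you want to set both values:
import numpy as np
# take login_time where the mask is True, otherwise the placeholder text
df['new'] = np.where(m, df['login_time'], 'No 2nd login_time')
print(df)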
user_login login_type login_time new
0 a 0 14:00:00 No 2nd login_time
1 b 0 08:20:03 No 2nd login_time
2 c 1 09:10:03 No 2nd login_time
3 b 1 10:49:03 10:49:03
4 a 1 11:19:03 11:19:03
5 a 1 12:29:03 No 2nd login_time
6 c 0 13:39:03 No 2nd login_time
7 c 1 14:49:03 No 2nd login_time
Details:
print(df.groupby('user_login').cumcount())
0 0
1 0
2 0
3 1
4 1
5 2
6 1
7 2
dtype: int64
print(m)
0 False
1 False
2 False
3 True
4 True
5 False
6 False
7 False
dtype: bool
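The same idea can be wrapped in a small function if the occurrence number or the placeholder needs to vary. The sketch below is only an illustration: the helper name nth_login_time and its parameters are hypothetical, and it uses Series.where in place of np.where to fill the non-matching rows.

```python
import pandas as pd

def nth_login_time(df, n=2, login_type=1, fill="No 2nd login_time"):
    """Hypothetical helper: login_time of the n-th row per user_login,
    kept only where login_type matches; every other row gets `fill`."""
    # cumcount() is 0-based, so the n-th occurrence has position n - 1.
    position = df.groupby("user_login").cumcount()
    mask = (position == n - 1) & (df["login_type"] == login_type)
    # Series.where keeps values where mask is True and uses `fill` elsewhere.
    return df["login_time"].where(mask, fill)

# Same data as df1 in the question; login_time as strings is an assumption.
df = pd.DataFrame({
    "user_login": ["a", "b", "c", "b", "a", "a", "c", "c"],
    "login_type": [0, 0, 1, 1, 1, 1, 0, 1],
    "login_time": ["14:00:00", "08:20:03", "09:10:03", "10:49:03",
                   "11:19:03", "12:29:03", "13:39:03", "14:49:03"],
})

df["2nd_login_time"] = nth_login_time(df)
print(df)
```

With the defaults this should fill the same two rows (index 3 and 4) as the mask m printed above.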