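I am using groupby on a name variable, and I want to drop subsets of rows based on a string variable.
How can I drop a subset of rows if a specific string does not appear anywhere in that group?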
user-id time event
0 AMsySZa--NczTTJGpJvNQTNri4nh 28/06/2016 11:18 CONVERSION
1 AMsySZa--NczTTJGpJvNQTNri4nh 28/06/2016 10:40 Organic Search
2 AMsySZa--NczTTJGpJvNQTNri4nh 26/06/2016 10:06 Organic Search
3 AMsySZa--NczTTJGpJvNQTNri4nh 07/05/2016 19:27 Display
4 AMsySZa-05iZ7gKLfZQ3_kw8l-mO 16/06/2016 09:05 CONVERSION
5 AMsySZa-05iZ7gKLfZQ3_kw8l-mO 21/06/2016 09:10 CONVERSION
6 AMsySZa-15QiL8a5kcw9LvAtBLiE 29/06/2016 20:35 Display
7 AMsySZa-15QiL8a5kcw9LvAtBLiE 27/05/2016 06:46 Display
8 AMsySZa-15QiL8a5kcw9LvAtBLiE 15/05/2016 00:17 Display
9 AMsySZa-15QiL8a5kcw9LvAtBLiE 15/05/2016 00:17 Display
I want to drop every subset (every user-id group) that does not contain the word "CONVERSION". My expected output is:
user-id time event
0 AMsySZa--NczTTJGpJvNQTNri4nh 28/06/2016 11:18 CONVERSION
1 AMsySZa--NczTTJGpJvNQTNri4nh 28/06/2016 10:40 Organic Search
2 AMsySZa--NczTTJGpJvNQTNri4nh 26/06/2016 10:06 Organic Search
3 AMsySZa--NczTTJGpJvNQTNri4nh 07/05/2016 19:27 Display
4 AMsySZa-05iZ7gKLfZQ3_kw8l-mO 16/06/2016 09:05 CONVERSION
5 AMsySZa-05iZ7gKLfZQ3_kw8l-mO 21/06/2016 09:10 CONVERSION
Answer 0 (score: 0)
You are looking for a filtering operation using groupby + any:
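# Flag rows whose event contains 'CONVERSION', then mark every row of a
# user-id group True if any row in that group is flagged.
m = (df.event.str.contains('CONVERSION')
       .groupby(df['user-id'])
       .transform('any'))
df[m]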
user-id time event
0 AMsySZa--NczTTJGpJvNQTNri4nh 28/06/2016 11:18 CONVERSION
1 AMsySZa--NczTTJGpJvNQTNri4nh 28/06/2016 10:40 Organic Search
2 AMsySZa--NczTTJGpJvNQTNri4nh 26/06/2016 10:06 Organic Search
3 AMsySZa--NczTTJGpJvNQTNri4nh 07/05/2016 19:27 Display
4 AMsySZa-05iZ7gKLfZQ3_kw8l-mO 16/06/2016 09:05 CONVERSION
5 AMsySZa-05iZ7gKLfZQ3_kw8l-mO 21/06/2016 09:10 CONVERSION
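For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby + any mask. The short user ids "A", "B" and "C" are hypothetical stand-ins for the long ids in the question, and the time column is left out because it plays no part in the filter. The second half shows groupby(...).filter as an alternative; that is not part of the answer above, just another way that should give the same rows on this data.

```python
import pandas as pd

# Hypothetical short ids replace the long user ids from the question;
# the event values per group mirror the question's sample data.
df = pd.DataFrame({
    "user-id": ["A", "A", "A", "A", "B", "B", "C", "C", "C", "C"],
    "event": ["CONVERSION", "Organic Search", "Organic Search", "Display",
              "CONVERSION", "CONVERSION",
              "Display", "Display", "Display", "Display"],
})

# Row-level flag: does this row's event contain the target string?
has_conversion = df["event"].str.contains("CONVERSION")

# transform('any') broadcasts "at least one flagged row in this group"
# back onto every row, giving a boolean mask aligned with df.
mask = has_conversion.groupby(df["user-id"]).transform("any")
kept = df[mask]

# Alternative (not from the answer above): keep whole groups with filter.
kept_filter = df.groupby("user-id").filter(
    lambda g: g["event"].str.contains("CONVERSION").any()
)

print(kept)
```

The transform version builds a single boolean mask and avoids calling a Python function per group, so it will likely scale better to many users; filter can read more directly when the condition is more involved.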