Apologies if the question is unclear; let me describe my problem in this post. I have the following DataFrame:
value created_at t_diff flag_1
0 18.930542 2019-03-03 21:43:08-05:00 00:00:00 1
1 18.895210 2019-03-03 21:44:09-05:00 00:00:00 1
2 18.895210 2019-03-03 21:45:09-05:00 00:00:00 1
3 18.885010 2019-03-03 21:46:10-05:00 00:04:04 2
4 0.000000 2019-03-03 21:47:11-05:00 00:04:04 2
5 0.000000 2019-03-03 21:48:12-05:00 00:04:04 2
6 0.000000 2019-03-03 21:49:13-05:00 00:04:04 2
7 0.000000 2019-03-03 21:50:14-05:00 00:04:04 2
8 18.857025 2019-03-03 21:51:14-05:00 00:00:00 3
9 18.847290 2019-03-03 21:52:15-05:00 00:00:00 3
10 18.847290 2019-03-03 21:53:17-05:00 00:00:00 3
11 18.873283 2019-03-03 21:54:17-05:00 00:00:00 3
12 18.873283 2019-03-03 21:55:19-05:00 00:00:00 3
13 18.837677 2019-03-03 21:56:19-05:00 00:00:00 3
20 18.830170 2019-03-03 22:03:25-05:00 00:00:00 5
21 18.826149 2019-03-03 22:04:26-05:00 00:00:00 5
22 18.826149 2019-03-03 22:05:27-05:00 00:00:00 5
23 18.830795 2019-03-03 22:06:28-05:00 00:00:00 5
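From the "flag_1" column, I want to identify the elements that form a run of consecutive numbers even though the values repeat. The result I want looks like this: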
value created_at t_diff flag_1 flag_2
0 18.930542 2019-03-03 21:43:08-05:00 00:00:00 1 1
1 18.895210 2019-03-03 21:44:09-05:00 00:00:00 1 1
2 18.895210 2019-03-03 21:45:09-05:00 00:00:00 1 1
3 18.885010 2019-03-03 21:46:10-05:00 00:04:04 2 1
4 0.000000 2019-03-03 21:47:11-05:00 00:04:04 2 1
5 0.000000 2019-03-03 21:48:12-05:00 00:04:04 2 1
6 0.000000 2019-03-03 21:49:13-05:00 00:04:04 2 1
7 0.000000 2019-03-03 21:50:14-05:00 00:04:04 2 1
8 18.857025 2019-03-03 21:51:14-05:00 00:00:00 3 1
9 18.847290 2019-03-03 21:52:15-05:00 00:00:00 3 1
10 18.847290 2019-03-03 21:53:17-05:00 00:00:00 3 1
11 18.873283 2019-03-03 21:54:17-05:00 00:00:00 3 1
12 18.873283 2019-03-03 21:55:19-05:00 00:00:00 3 1
13 18.837677 2019-03-03 21:56:19-05:00 00:00:00 3 1
20 18.830170 2019-03-03 22:03:25-05:00 00:00:00 5 2
21 18.826149 2019-03-03 22:04:26-05:00 00:00:00 5 2
22 18.826149 2019-03-03 22:05:27-05:00 00:00:00 5 2
23 18.830795 2019-03-03 22:06:28-05:00 00:00:00 5 2
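Each time a new run of consecutive (possibly repeated) values occurs, the column named "flag_2" should be filled with a numeric identifier: 1 for the first run, 2 for the second, 3 for the third, and so on.
I have been trying to do this indirectly with df.flag_1.unique(), then building a nested list with the help of more-itertools, looping over it, and slicing the DataFrame with isin from Pandas.
I would like to know whether there is a way to do all of this with Pandas alone, without more-itertools and the rest.
Can you help me? Thanks in advance!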
Answer 0 (score: 1)
You can use diff to create it. The logic is that rows belong to the same run as long as the difference between consecutive values is no greater than 1; in your example, each step within a run either increases by 1 or stays the same (no change, so the difference is 0).
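The answer's own code did not survive, so the following is only a minimal sketch of the logic described above, not necessarily the original snippet. It assumes that a jump greater than 1 in flag_1 starts a new run, and it pairs diff with gt and cumsum to number the runs. The small frame built here is a hypothetical stand-in containing only the flag_1 column from the question.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: only the flag_1 column.
df = pd.DataFrame(
    {"flag_1": [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5]}
)

# diff() gives the step between consecutive rows (NaN for the first row).
step = df["flag_1"].diff()

# Assumption: a step greater than 1 marks the start of a new run.
# NaN compares as False, so the first row never counts as a break.
is_break = step.gt(1)

# The running total of breaks numbers the runs; + 1 starts the count at 1.
df["flag_2"] = is_break.cumsum() + 1

print(df)
```

On this data the 3 to 5 jump is the only break, so the rows with flag_1 in 1, 2, 3 get flag_2 = 1 and the rows with flag_1 = 5 get flag_2 = 2, matching the desired output in the question.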