I have a DataFrame that looks like this:
dtm f C A B
0 2018-03-01 00:00:00 +0000 50.135 9.000000 0 0
1 2018-03-01 00:00:01 +0000 50.130 9.000000 0 0
2 2018-03-01 00:00:02 +0000 50.120 9.000000 0 0
3 2018-03-01 00:00:03 +0000 50.112 9.000000 0 0
4 2018-03-01 00:00:04 +0000 50.102 9.000000 0 0
5 2018-03-01 00:00:05 +0000 50.097 9.000000 0 0
6 2018-03-01 00:00:06 +0000 11.095 9.000000 0 0
7 2018-03-01 00:00:07 +0000 11.095 9.000000 0 0
8 2018-03-01 00:00:08 +0000 11.092 9.000000 0 0
9 2018-03-01 00:00:09 +0000 11.095 9.000000 0 0
10 2018-03-01 00:00:10 +0000 11.097 5.000000 0 0
11 2018-03-01 00:00:11 +0000 11.097 5.000000 0 0
12 2018-03-01 00:00:12 +0000 11.097 5.000000 0 0
13 2018-03-01 00:00:13 +0000 50.100 5.000000 0 0
14 2018-03-01 00:00:14 +0000 50.102 5.000000 0 0
15 2018-03-01 00:00:15 +0000 50.105 5.000000 0 0
16 2018-03-01 00:00:16 +0000 50.102 5.000000 0 0
17 2018-03-01 00:00:17 +0000 50.102 5.000000 0 0
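A and B are two counters that work like this:
if (f >= 50) or (f < 50 and C < 8), then A increases by 1
if f < 50 and C > 8, then B increases by 1
The expected result should be: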
dtm f C A B
0 2018-03-01 00:00:00 +0000 50.135 9.000000 0 0
1 2018-03-01 00:00:01 +0000 50.130 9.000000 1 0
2 2018-03-01 00:00:02 +0000 50.120 9.000000 2 0
3 2018-03-01 00:00:03 +0000 50.112 9.000000 3 0
4 2018-03-01 00:00:04 +0000 50.102 9.000000 4 0
5 2018-03-01 00:00:05 +0000 50.097 9.000000 5 0
6 2018-03-01 00:00:06 +0000 11.095 9.000000 5 1
7 2018-03-01 00:00:07 +0000 11.095 9.000000 5 2
8 2018-03-01 00:00:08 +0000 11.092 9.000000 5 3
9 2018-03-01 00:00:09 +0000 11.095 9.000000 5 4
10 2018-03-01 00:00:10 +0000 11.097 5.000000 6 4
11 2018-03-01 00:00:11 +0000 11.097 5.000000 7 4
12 2018-03-01 00:00:12 +0000 11.097 5.000000 8 4
13 2018-03-01 00:00:13 +0000 50.100 5.000000 9 4
14 2018-03-01 00:00:14 +0000 50.102 5.000000 10 4
15 2018-03-01 00:00:15 +0000 50.105 5.000000 11 4
16 2018-03-01 00:00:16 +0000 50.102 5.000000 12 4
17 2018-03-01 00:00:17 +0000 50.102 5.000000 13 4
Note that B keeps its value while A increases, and vice versa. They never reset. Any ideas?
Thanks in advance!
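For reference, the two rules above can be spelled out as a plain loop. This is only a sketch of the question's rules, not part of either answer: the function name `run_counters` is made up for illustration, and skipping the first row is an assumption read off the expected output, where row 0 has A = 0 even though f >= 50.

```python
def run_counters(f_values, c_values):
    """Return (A, B) as lists of running counts, one entry per row."""
    a_count, b_count = 0, 0
    a_col, b_col = [], []
    for i, (f, c) in enumerate(zip(f_values, c_values)):
        # Assumption: the first row only initialises the counters,
        # which is what the expected output in the question shows.
        if i > 0:
            if f >= 50 or (f < 50 and c < 8):
                a_count += 1
            elif f < 50 and c > 8:
                b_count += 1
            # With the rules as written, f < 50 and C == 8 changes neither counter.
        a_col.append(a_count)
        b_col.append(b_count)
    return a_col, b_col


# The f and C columns from the sample data in the question
f_values = [50.135, 50.130, 50.120, 50.112, 50.102, 50.097,
            11.095, 11.095, 11.092, 11.095,
            11.097, 11.097, 11.097,
            50.100, 50.102, 50.105, 50.102, 50.102]
c_values = [9.0] * 10 + [5.0] * 8

A, B = run_counters(f_values, c_values)
print(A)
print(B)
```

A loop like this is slow on large frames, but it is a handy baseline to compare any vectorised solution against.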
Answer 0 (score: 5)
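For me, cumsum on each mask works nicely: subtract 1 with sub, and to get rid of the -1 in the first rows add clip(lower=0) (written clip_lower(0) in pandas versions before 1.0, where that method still existed):
m1 = (df.f >= 50) | ((df.f < 50) & (df.C < 8))
m2 = (df.f < 50) & (df.C > 8)
df['A'] = m1.cumsum().sub(1).clip(lower=0)
df['B'] = m2.cumsum().sub(1).clip(lower=0)
Note that with the sample data the sub(1) step is right for A, whose condition the first row satisfies, but it leaves B one lower than the expected output from row 6 onward (0, 1, 2, 3, 3, ... instead of 1, 2, 3, 4, 4, ...), because the first row never counted toward B.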
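A self-contained way to try this answer on the sample data is sketched below. It uses `Series.clip(lower=0)`, the current spelling of `clip_lower(0)` (which was removed in pandas 1.0). It also shows one possible variant, which is an assumption of this sketch rather than part of the answer: subtract only what the first row itself contributed to each counter, so that a counter the first row never touched is not shifted down.

```python
import pandas as pd

# The f and C columns from the sample data in the question
df = pd.DataFrame({
    "f": [50.135, 50.130, 50.120, 50.112, 50.102, 50.097,
          11.095, 11.095, 11.092, 11.095,
          11.097, 11.097, 11.097,
          50.100, 50.102, 50.105, 50.102, 50.102],
    "C": [9.0] * 10 + [5.0] * 8,
})

m1 = (df.f >= 50) | ((df.f < 50) & (df.C < 8))
m2 = (df.f < 50) & (df.C > 8)

# The answer's approach: subtract 1 everywhere, then clip negatives to 0
A_clip = m1.cumsum().sub(1).clip(lower=0)
B_clip = m2.cumsum().sub(1).clip(lower=0)

# Variant (assumption of this sketch): remove only the first row's own contribution
A = m1.cumsum() - int(m1.iloc[0])
B = m2.cumsum() - int(m2.iloc[0])

print(pd.DataFrame({"A_clip": A_clip, "B_clip": B_clip, "A": A, "B": B}))
```

On this data both versions agree on A, while only the variant reproduces the B column from the question.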
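Answer 1 (score: 5)
df.C > 8 was presumably meant to be df.C >= 8, since that is the complement of df.C < 8.
The df.f < 50 in (df.f < 50) & (df.C < 8) is not necessary, because it sits on the other side of an or with df.f >= 50.
Column 'A' starting at 0 seems odd and needs special handling. For now, assume it starts from zero and increments at the first True.
With assign: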
a = df.f.values >= 50
b = df.C.values < 8
c = a | b
df.assign(A=c.cumsum(), B=(~c).cumsum())
dtm f C A B
0 2018-03-01 00:00:00 +0000 50.135 9.0 1 0
1 2018-03-01 00:00:01 +0000 50.130 9.0 2 0
2 2018-03-01 00:00:02 +0000 50.120 9.0 3 0
3 2018-03-01 00:00:03 +0000 50.112 9.0 4 0
4 2018-03-01 00:00:04 +0000 50.102 9.0 5 0
5 2018-03-01 00:00:05 +0000 50.097 9.0 6 0
6 2018-03-01 00:00:06 +0000 11.095 9.0 6 1
7 2018-03-01 00:00:07 +0000 11.095 9.0 6 2
8 2018-03-01 00:00:08 +0000 11.092 9.0 6 3
9 2018-03-01 00:00:09 +0000 11.095 9.0 6 4
10 2018-03-01 00:00:10 +0000 11.097 5.0 7 4
11 2018-03-01 00:00:11 +0000 11.097 5.0 8 4
12 2018-03-01 00:00:12 +0000 11.097 5.0 9 4
13 2018-03-01 00:00:13 +0000 50.100 5.0 10 4
14 2018-03-01 00:00:14 +0000 50.102 5.0 11 4
15 2018-03-01 00:00:15 +0000 50.105 5.0 12 4
16 2018-03-01 00:00:16 +0000 50.102 5.0 13 4
17 2018-03-01 00:00:17 +0000 50.102 5.0 14 4
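The simplification behind c = a | b can be checked with a small truth table. In this sketch `p` stands for f >= 50 and `q` for C < 8; both names are placeholders introduced here. It confirms that the condition for A reduces to `p or q`, and that its complement is f < 50 and C >= 8, which is why ~c serves as the condition for B once C > 8 is read as C >= 8.

```python
from itertools import product

rows = []
for p, q in product([False, True], repeat=2):
    original = p or ((not p) and q)      # (f >= 50) or (f < 50 and C < 8)
    simplified = p or q                  # (f >= 50) or (C < 8)
    complement = (not p) and (not q)     # (f < 50) and (C >= 8)
    rows.append((p, q, original, simplified, complement))

# The simplified form matches the original in every case,
# and the complement is exactly "not simplified".
all_match = all(orig == simp and comp == (not simp)
                for _, _, orig, simp, comp in rows)
print(all_match)
```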
import numpy as np

# Same counters, written back into df in place
a = df.f.values >= 50
b = df.C.values < 8
c = a | b
df[['A', 'B']] = np.column_stack([c, ~c]).cumsum(0)
df
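To see what the column_stack plus cumsum(0) step does, here is a tiny standalone sketch on a made-up four-element mask (the values are illustrative, not taken from the question). Stacking the mask and its negation gives a two-column boolean array, and a cumulative sum along axis 0 turns each column into a running count.

```python
import numpy as np

c = np.array([True, True, False, True])

# Two columns: the mask and its negation, shape (4, 2)
stacked = np.column_stack([c, ~c])

# Running count down each column (axis 0); booleans are summed as 0/1
counts = stacked.cumsum(0)
print(counts)
# [[1 0]
#  [2 0]
#  [2 1]
#  [3 1]]
```

Because each row is True in exactly one of the two columns, exactly one counter advances per row, which is the "one increases while the other holds" behaviour the question asks for.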
# The same thing with the mask built in a single expression
c = (df.f.values >= 50) | (df.C.values < 8)
df.assign(A=c.cumsum(), B=(~c).cumsum())
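# Special handling of the first row: exclude it from both counters
# so that the result matches the expected output in the question
a = df.f.values >= 50
b = df.C.values < 8
c0 = a | b
c1 = ~c0
c0[0] = False
c1[0] = False
df.assign(A=c0.cumsum(), B=c1.cumsum())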
dtm f C A B
0 2018-03-01 00:00:00 +0000 50.135 9.0 0 0
1 2018-03-01 00:00:01 +0000 50.130 9.0 1 0
2 2018-03-01 00:00:02 +0000 50.120 9.0 2 0
3 2018-03-01 00:00:03 +0000 50.112 9.0 3 0
4 2018-03-01 00:00:04 +0000 50.102 9.0 4 0
5 2018-03-01 00:00:05 +0000 50.097 9.0 5 0
6 2018-03-01 00:00:06 +0000 11.095 9.0 5 1
7 2018-03-01 00:00:07 +0000 11.095 9.0 5 2
8 2018-03-01 00:00:08 +0000 11.092 9.0 5 3
9 2018-03-01 00:00:09 +0000 11.095 9.0 5 4
10 2018-03-01 00:00:10 +0000 11.097 5.0 6 4
11 2018-03-01 00:00:11 +0000 11.097 5.0 7 4
12 2018-03-01 00:00:12 +0000 11.097 5.0 8 4
13 2018-03-01 00:00:13 +0000 50.100 5.0 9 4
14 2018-03-01 00:00:14 +0000 50.102 5.0 10 4
15 2018-03-01 00:00:15 +0000 50.105 5.0 11 4
16 2018-03-01 00:00:16 +0000 50.102 5.0 12 4
17 2018-03-01 00:00:17 +0000 50.102 5.0 13 4