I have a data frame of win/lose records for different accounts. I want to count how many times in a row each person has lost.
df <- data.frame(account_number =c(1,1,1,1,1,1,1,2,2,2,2,2,3,3),
win_lose = c(-1,-1,-1,1,-1,-1,-1,-1,-1,1,1,1,1,-1))
> df
account_number win_lose
1 1 -1
2 1 -1
3 1 -1
4 1 1
5 1 -1
6 1 -1
7 1 -1
8 2 -1
9 2 -1
10 2 1
11 2 1
12 2 1
13 3 1
14 3 -1
Each account represents one person. The final result should look like this:
account_number win_lose losing_streak
1 1 -1 1
2 1 -1 2
3 1 -1 3
4 1 1 0
5 1 -1 1
6 1 -1 2
7 1 -1 3
8 2 -1 1
9 2 -1 2
10 2 1 0
11 2 1 0
12 2 1 0
13 3 1 0
14 3 -1 1
Answer 0 (score: 2)
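One option is rleid from data.table. Convert the 'data.frame' to a 'data.table' with setDT(df), then group by 'account_number' and the rleid of 'win_lose'. Within each group we take the row sequence, seq_len(.N), and multiply it by 'win_lose < 0'. FALSE is coerced to 0, so every row in a non-losing run becomes 0; TRUE is coerced to 1, so multiplying by 1 leaves the sequence value unchanged in a losing run.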
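library(data.table)
# setDT() converts df to a data.table by reference.
# Each group is one run of identical win_lose values within an account;
# rows are numbered within the run and zeroed out when the run is not a loss.
setDT(df)[, losing_streak := seq_len(.N) * (win_lose < 0),
          by = .(account_number, rleid(win_lose))]
df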
# account_number win_lose losing_streak
# 1: 1 -1 1
# 2: 1 -1 2
# 3: 1 -1 3
# 4: 1 1 0
# 5: 1 -1 1
# 6: 1 -1 2
# 7: 1 -1 3
# 8: 2 -1 1
# 9: 2 -1 2
#10: 2 1 0
#11: 2 1 0
#12: 2 1 0
#13: 3 1 0
#14: 3 -1 1
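To see why this grouping works, it may help to look at the two pieces separately. The following is only an illustrative sketch on the first account's values; the variable name run_id is made up for the example.

```r
library(data.table)

# win_lose values for account 1 only
win_lose <- c(-1, -1, -1, 1, -1, -1, -1)

# rleid() assigns a new id each time the value changes,
# so every run of identical consecutive values gets its own group
run_id <- rleid(win_lose)
run_id
# [1] 1 1 1 2 3 3 3

# Multiplying by a logical coerces TRUE to 1 and FALSE to 0
seq_len(3) * TRUE
# [1] 1 2 3
seq_len(3) * FALSE
# [1] 0 0 0
```

Grouping by run_id and numbering the rows within each group therefore restarts the count at every change from losing to winning and back.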
Another option is to use base R, with ave (for the grouping) and rle.
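The code for this option is not shown here, so what follows is a minimal sketch of how ave and rle could be combined, not necessarily the answer's original code. The helper name streak is hypothetical.

```r
df <- data.frame(
  account_number = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
  win_lose       = c(-1, -1, -1, 1, -1, -1, -1, -1, -1, 1, 1, 1, 1, -1)
)

# Hypothetical helper: losing-streak counter for one account's records
streak <- function(x) {
  runs <- rle(x < 0)  # runs of losses (TRUE) and non-losses (FALSE)
  # sequence() numbers the rows within each run; multiplying by the
  # expanded run values zeroes out the non-losing runs
  sequence(runs$lengths) * rep(runs$values, runs$lengths)
}

# ave() applies streak() separately within each account_number
df$losing_streak <- ave(df$win_lose, df$account_number, FUN = streak)
df$losing_streak
# [1] 1 2 3 0 1 2 3 1 2 0 0 0 0 1
```

This assumes the rows are already in chronological order within each account, as in the example data.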