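Adding a "count" column to a data frame with specific conditions

Time: 2016-08-04 04:29:44

Tags: r loops counting

I have a data frame of win/lose records for different accounts. I want to count how many times in a row each person has lost.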

df <- data.frame(account_number =c(1,1,1,1,1,1,1,2,2,2,2,2,3,3),
                 win_lose = c(-1,-1,-1,1,-1,-1,-1,-1,-1,1,1,1,1,-1))

> df
   account_number win_lose
1               1       -1
2               1       -1
3               1       -1
4               1        1
5               1       -1
6               1       -1
7               1       -1
8               2       -1
9               2       -1
10              2        1
11              2        1
12              2        1
13              3        1
14              3       -1

Each account represents one person. The final result should look like this:

      account_number win_lose losing_streak
   1               1       -1             1
   2               1       -1             2
   3               1       -1             3
   4               1        1             0
   5               1       -1             1
   6               1       -1             2
   7               1       -1             3
   8               2       -1             1
   9               2       -1             2
   10              2        1             0
   11              2        1             0
   12              2        1             0
   13              3        1             0
   14              3       -1             1

1 answer:

Answer 0: (score: 2)

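One option is rleid from data.table. Convert the 'data.frame' to a 'data.table' (setDT(df)) and, grouped by 'account_number' and the rleid of 'win_lose', take the sequence of rows (seq_len(.N)) multiplied by 'win_lose < 0'. Every FALSE value is coerced to 0, so the product is 0 for winning rows; every TRUE value is coerced to 1, so multiplying by 1 returns the sequence value itself.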
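To make the coercion step concrete, here is a small sketch on a single made-up vector x (hypothetical, not part of the question's data). It rebuilds the run ids with base R's rle instead of calling rleid, so it runs without data.table; that substitution is an assumption made for illustration only.

```r
# Hypothetical win/lose vector for one account, for illustration only
x <- c(-1, -1, 1, -1)

# Run ids: the id increases each time the value changes
# (the same grouping idea that rleid() provides)
runs <- rle(x)
run_id <- rep(seq_along(runs$lengths), runs$lengths)
run_id
# [1] 1 1 2 3

# Position of each row inside its own run
pos_in_run <- sequence(runs$lengths)
pos_in_run
# [1] 1 2 1 1

# TRUE/FALSE are coerced to 1/0 in arithmetic, so winning rows become 0
streak <- pos_in_run * (x < 0)
streak
# [1] 1 2 0 1
```

The third row is a win, so its position is multiplied by 0, while the losing rows keep their position within the run.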

library(data.table)
setDT(df)[, losing_streak := seq_len(.N) * (win_lose < 0),
          by = .(account_number, rleid(win_lose))]
df
#    account_number win_lose losing_streak
# 1:              1       -1             1
# 2:              1       -1             2
# 3:              1       -1             3
# 4:              1        1             0
# 5:              1       -1             1
# 6:              1       -1             2
# 7:              1       -1             3
# 8:              2       -1             1
# 9:              2       -1             2
#10:              2        1             0
#11:              2        1             0
#12:              2        1             0
#13:              3        1             0
#14:              3       -1             1

Another option would be to use base R, with ave (for the group by) and rle.
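The code for the base R option is not shown here. The following is a minimal sketch of how ave and rle could be combined. It is an assumption about the approach, not the answer's original code, and the helper name streak_in_account is hypothetical.

```r
df <- data.frame(account_number = c(1,1,1,1,1,1,1,2,2,2,2,2,3,3),
                 win_lose = c(-1,-1,-1,1,-1,-1,-1,-1,-1,1,1,1,1,-1))

# Hypothetical helper: for one account's win_lose vector, number the rows
# inside each run of equal values, then zero out the rows that are not losses
streak_in_account <- function(x) {
  runs <- rle(x)
  sequence(runs$lengths) * (x < 0)
}

# ave() applies the helper separately to each account_number group
# and returns the results in the original row order
df$losing_streak <- ave(df$win_lose, df$account_number,
                        FUN = streak_in_account)
df
```

Because ave handles the grouping by account, a losing streak never carries over from the last row of one account to the first row of the next.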