Find consecutive scores in R

Time: 2017-08-16 06:21:39

Tags: r

I have this data frame:

date <- structure(c(8664, 8808, 8819, 8899, 8995, 9002, 9006, 9025, 9054, 
9054, 9060, 9064, 9125, 9232, 9254, 9301, 9322, 9338, 9356, 9357, 
9364, 9369, 9369, 9370, 9372, 9372, 9376, 9376, 9376, 9388), class = "Date")

score <- c(2, 1, 1, 1, 2, 1, 2, 4, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 
2, 2, 1, 1, 1, 2, 2, 2, 1, 1)

df <- data.frame(date, score)

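I want to find the number of consecutive dates for each score. For example, the longest streak for score 1 is 8 dates (see rows 13-20). The data frame below is the output I need. How can I produce it?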

#          date score streak
# 1  1993-09-21     2      1
# 2  1994-02-12     1      3
# 3  1994-02-23     1      3
# 4  1994-05-14     1      3
# 5  1994-08-18     2      1
# 6  1994-08-25     1      1
# 7  1994-08-29     2      1
# 8  1994-09-17     4      1
# 9  1994-10-16     2      1
# 10 1994-10-16     2      1
# 11 1994-10-22     1      1
# 12 1994-10-26     2      1
# 13 1994-12-26     1      8
# 14 1995-04-12     1      8
# 15 1995-05-04     1      8
# 16 1995-06-20     1      8
# 17 1995-07-11     1      8
# 18 1995-07-27     1      8
# 19 1995-08-14     1      8
# 20 1995-08-15     1      8
# 21 1995-08-22     2      2
# 22 1995-08-27     2      2
# 23 1995-08-27     1      3
# 24 1995-08-28     1      3
# 25 1995-08-30     1      3
# 26 1995-08-30     2      3
# 27 1995-09-03     2      3
# 28 1995-09-03     2      3
# 29 1995-09-03     1      2
# 30 1995-09-15     1      2

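2 answers:

Answer 0 (score: 3)

We can use base R's rle and repeat each element of its lengths component that many times, so every row receives the length of the run it belongs to.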

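# rle() collapses score into runs of equal adjacent values
x <- rle(df$score)
# Repeat each run length once for every row in that run
df$streak <- rep(x$lengths, x$lengths)
df$streak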

#[1] 1 3 3 3 1 1 1 1 2 2 1 1 8 8 8 8 8 8 8 8 2 2 3 3 3 3 3 3 2 2

Here x holds the value of each run (values) and how many times it repeats (lengths):

x
#Run Length Encoding
#lengths: int [1:14] 1 3 1 1 1 1 2 1 1 8 ...
#values : num [1:14] 2 1 2 1 2 4 2 1 2 1 ...
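
The question also mentions the longest streak for a given score (8 for score 1). As a minimal sketch that is not part of the original answer, the same rle result can be summarised per score with tapply. The names runs and longest are chosen here for illustration only.

```r
# Same score vector as in the question
score <- c(2, 1, 1, 1, 2, 1, 2, 4, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1,
           2, 2, 1, 1, 1, 2, 2, 2, 1, 1)

# Run-length encoding: one entry per run of equal adjacent values
runs <- rle(score)

# Longest run for each distinct score; the names are the score values
longest <- tapply(runs$lengths, runs$values, max)
longest
#> 1 2 4 
#> 8 3 1 
```

Score 1 peaks at 8, matching rows 13-20 of the desired output.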

Answer 1 (score: 1)

Here is an option using rleid from data.table:

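library(data.table)
# Convert df to a data.table by reference, then store each run's row count (.N)
# in streak, grouping by the run id that rleid(score) assigns
setDT(df)[, streak := .N, by = rleid(score)]
df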
#          date score streak
# 1: 1993-09-21     2      1
# 2: 1994-02-12     1      3
# 3: 1994-02-23     1      3
# 4: 1994-05-14     1      3
# 5: 1994-08-18     2      1
# 6: 1994-08-25     1      1
# 7: 1994-08-29     2      1
# 8: 1994-09-17     4      1
# 9: 1994-10-16     2      2
#10: 1994-10-16     2      2
#11: 1994-10-22     1      1
#12: 1994-10-26     2      1
#13: 1994-12-26     1      8
#14: 1995-04-12     1      8
#15: 1995-05-04     1      8
#16: 1995-06-20     1      8
#17: 1995-07-11     1      8
#18: 1995-07-27     1      8
#19: 1995-08-14     1      8
#20: 1995-08-15     1      8
#21: 1995-08-22     2      2
#22: 1995-08-27     2      2
#23: 1995-08-27     1      3
#24: 1995-08-28     1      3
#25: 1995-08-30     1      3
#26: 1995-08-30     2      3
#27: 1995-09-03     2      3
#28: 1995-09-03     2      3
#29: 1995-09-03     1      2
#30: 1995-09-15     1      2
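
For readers without data.table, the run id that rleid produces can be approximated in base R. This is a sketch under the assumption that score contains no NA values; the names new_run and run_id are illustrative and do not come from either answer.

```r
# Same score vector as in the question
score <- c(2, 1, 1, 1, 2, 1, 2, 4, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1,
           2, 2, 1, 1, 1, 2, 2, 2, 1, 1)

# TRUE wherever a value differs from the one before it;
# the first element always starts a run
new_run <- c(TRUE, score[-1] != score[-length(score)])

# Counting run starts gives one integer id per run,
# the same idea as data.table::rleid()
run_id <- cumsum(new_run)

# Size of each run, written back to every row in that run
streak <- ave(score, run_id, FUN = length)
streak
```

The grouping step is explicit here, so run_id can be reused for other per-run summaries, such as the first and last date of each streak.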