Finding consecutive values by row

Date: 2015-01-27 21:50:47

Tags: r

Is there a clever way to determine whether a row contains consecutive "Yes" values?

        1/01 1/02 1/03 1/04
UserA   Yes  Yes  Yes  Yes
UserB   No   Yes  No   No
UserC   Yes  No   Yes  Yes
UserD   Yes  No   Yes  No

UserA would have 4 consecutive "Yes"

UserB would have 0

UserC would have 2 consecutive "Yes"

UserD would have 0 consecutive "Yes"

2 answers:

Answer 0 (score: 4)

I assume you have a data.frame d:

d <- structure(list(X1.01 = c("Yes", "No", "Yes", "Yes"), X1.02 = c("Yes", 
"Yes", "No", "No"), X1.03 = c("Yes", "No", "Yes", "Yes"), X1.04 = c("Yes", 
"No", "Yes", "No")), .Names = c("X1.01", "X1.02", "X1.03", "X1.04"
), class = "data.frame", row.names = c("UserA", "UserB", "UserC", 
"UserD"))

You can use apply(d, 1, ...) to work row by row and compute the longest series of consecutive "Yes" in each row:

result <- apply(d,1,function(s) {z<-rle(s); max(z$lengths[z$values=='Yes'])})
#UserA UserB UserC UserD 
#    4     1     2     1 

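The key function here is rle, which finds all runs of consecutive equal values. We keep only the runs that correspond to "Yes" (z$lengths[z$values=='Yes']) and return the longest one. The last step is to convert 1 to zero, because a single "Yes" does not count as a consecutive series: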

result[result==1] <- 0

#UserA UserB UserC UserD 
#    4     0     2     0 
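
To make the role of rle more concrete, here is a minimal sketch that first prints what rle returns for a single row and then wraps the whole calculation in a helper. The name longest_yes_run is hypothetical and not part of the answer above. The helper also returns 0 for a row with no "Yes" at all, a case where max() on an empty vector would otherwise give -Inf with a warning.

```r
# Example data from the question, rebuilt with data.frame() for readability.
d <- data.frame(
  X1.01 = c("Yes", "No", "Yes", "Yes"),
  X1.02 = c("Yes", "Yes", "No", "No"),
  X1.03 = c("Yes", "No", "Yes", "Yes"),
  X1.04 = c("Yes", "No", "Yes", "No"),
  row.names = c("UserA", "UserB", "UserC", "UserD"),
  stringsAsFactors = FALSE
)

# rle() on a single row (UserC: Yes No Yes Yes) yields three runs.
z <- rle(unname(unlist(d["UserC", ])))
z$lengths  # 1 1 2
z$values   # "Yes" "No" "Yes"

# Hypothetical helper: longest run of "Yes", counting only runs of 2 or more.
longest_yes_run <- function(s) {
  z <- rle(as.character(s))
  runs <- z$lengths[z$values == "Yes" & z$lengths >= 2]
  if (length(runs) == 0) 0L else max(runs)
}

result <- apply(d, 1, longest_yes_run)
result
#UserA UserB UserC UserD 
#    4     0     2     0 
```

Filtering on z$lengths >= 2 inside the helper replaces the separate result[result==1] <- 0 step, so a genuine run of length 1 and a row with no "Yes" are treated the same way.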

Answer 1 (score: 3)

Here is a similar approach using apply and rle (I'm posting it because I was already in the middle of writing it):

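# df is the data frame from the question (called d in the other answer)
apply(df, 1, function(x) {
  temp <- rle(x == "Yes")               # runs of TRUE/FALSE along the row
  temp2 <- with(temp, lengths[values])  # lengths of the "Yes" runs only
  temp2[temp2 > 1]                      # keep runs of two or more
})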
# $UserA
# 
# 4 
# 
# $UserB
# named integer(0)
# 
# $UserC
# 
# 2 
# 
# $UserD
# named integer(0)
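
Because the rows return results of different lengths, apply gives back a list here. If one number per user is wanted, as in the question, the list could be collapsed afterwards. The following is only a sketch under that assumption; the names runs and longest are illustrative and do not come from the answer.

```r
# Example data from the question.
d <- data.frame(
  X1.01 = c("Yes", "No", "Yes", "Yes"),
  X1.02 = c("Yes", "Yes", "No", "No"),
  X1.03 = c("Yes", "No", "Yes", "Yes"),
  X1.04 = c("Yes", "No", "Yes", "No"),
  row.names = c("UserA", "UserB", "UserC", "UserD"),
  stringsAsFactors = FALSE
)

# Same per-row computation as above: "Yes" runs longer than 1.
runs <- apply(d, 1, function(x) {
  temp <- rle(x == "Yes")
  temp2 <- with(temp, lengths[values])
  temp2[temp2 > 1]
})

# Collapse each element to its longest run, using 0 when nothing qualifies.
longest <- vapply(
  runs,
  function(r) if (length(r) == 0) 0 else as.numeric(max(r)),
  numeric(1)
)
longest
# UserA UserB UserC UserD 
#     4     0     2     0 
```

vapply is used instead of sapply so that the result is guaranteed to be a numeric vector with one value per user.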