R: Counting consecutive identical values across columns using rle

Asked: 2017-01-31 01:11:08

Tags: r dataframe

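I'm using R. I have a data frame with one row per player, followed by columns representing each month and the points they earned (illustrative data with random values below). I want to add a new column (Points$ConsecutiveShutouts) containing the longest consecutive streak of a specified point total within the past 5 months.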

Points <- data.frame("Player" = c("Alpha", "Beta", "Charlie", "Delta", "Echo", "Foxtrot", "Gamma"), "MayPts" = c(floor(runif(7, 0, 3))), "JunPts" = c(floor(runif(7, 0, 3))), "JulPts" = c(floor(runif(7, 0, 3))), "AugPts" = c(floor(runif(7, 0, 3))), "SepPts" = c(floor(runif(7, 0, 3))), "OctPts" = c(floor(runif(7, 0, 3))), "NovPts" = c(floor(runif(7, 0, 3))),"DecPts" = c(floor(runif(7, 0, 3))))

Player MayPts JunPts JulPts AugPts SepPts OctPts NovPts DecPts
Alpha      0      0      1      0      2      2      2      0
Beta       1      0      1      1      1      1      1      2
Charlie    1      2      2      0      2      1      1      0
Delta      0      1      1      2      2      2      0      0
Echo       1      1      0      2      1      2      0      1
Foxtrot    1      0      0      0      0      0      2      1
Gamma      2      0      1      1      0      2      0      1

I tried using rle() on Points:

# Establish the start and end months
StartMonth <- which(colnames(Points) == "SepPts")
EndMonth <- which(colnames(Points) == "DecPts")

# Find total of consecutive months with 0 points
Points$ConsecutiveShutOuts <- max(rle(Points[ ,StartMonth:EndMonth] == 0), lengths[!values])

Doing this, I end up with the error "'x' must be a vector of an atomic type".

Any suggestions on what I'm doing wrong and how to fix it? Or an alternative approach?

Thanks in advance! [Beginner here, so hopefully I'm asking the question the right way :)]

2 answers:

Answer 0 (score: 1)

I would also use the long format. I would first create a function like this.

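myfun <- function(series, value) {
    # Run-length encode the series and keep only the runs of `value`
    tmp <- rle(series)
    runs <- tmp$lengths[tmp$values == value]
    # No run of `value` at all: the longest streak is 0
    if (length(runs) == 0) return(0)
    else return(max(runs))
}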
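As a quick check of how this helper behaves, here is a minimal base-R sketch that applies it row by row to the wide data frame. rle() only accepts a plain vector, whereas Points[, StartMonth:EndMonth] == 0 is a matrix, which is what triggers the error in the question. The data is copied from the table printed in the question; the name month_cols is just an illustrative choice, and "past 5 months" is assumed to mean August through December.

```r
# Helper from this answer, repeated so the example is self-contained
myfun <- function(series, value) {
  tmp <- rle(series)
  runs <- tmp$lengths[tmp$values == value]
  if (length(runs) == 0) return(0)
  max(runs)
}

# Illustrative data copied from the table in the question
Points <- data.frame(
  Player = c("Alpha", "Beta", "Charlie", "Delta", "Echo", "Foxtrot", "Gamma"),
  MayPts = c(0, 1, 1, 0, 1, 1, 2),
  JunPts = c(0, 0, 2, 1, 1, 0, 0),
  JulPts = c(1, 1, 2, 1, 0, 0, 1),
  AugPts = c(0, 1, 0, 2, 2, 0, 1),
  SepPts = c(2, 1, 2, 2, 1, 0, 0),
  OctPts = c(2, 1, 1, 2, 2, 0, 2),
  NovPts = c(2, 1, 1, 0, 0, 2, 0),
  DecPts = c(0, 2, 0, 0, 1, 1, 1)
)

# Assumption: the "past 5 months" are the last five month columns
month_cols <- c("AugPts", "SepPts", "OctPts", "NovPts", "DecPts")

# apply() hands each row to myfun as a plain vector, which rle() accepts
Points$ConsecutiveShutouts <- unname(
  apply(Points[, month_cols], 1, myfun, value = 0)
)

print(Points$ConsecutiveShutouts)
# 1 0 1 2 1 3 1
```

Foxtrot, for example, has 0 0 0 2 1 over those five months, so the longest run of zeros is 3.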

With tidyr / dplyr, you can then proceed with

library(dplyr)
library(tidyr)
Points %>%
  gather(months,Pts,MayPts:DecPts) %>%
  group_by(Player) %>%
  summarise(x=myfun(tail(Pts,5),0))
# For each player: longest run of consecutive zeros over the past 5 months.

Of course, you can join the result back to the original wide-format data frame if you like.
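One way that join might look is sketched below, assuming dplyr and tidyr are installed. The three rows are a subset of the question's table, shutouts is a hypothetical name for the intermediate result, and the summary column is named ConsecutiveShutouts (after the question) rather than x.

```r
library(dplyr)
library(tidyr)

# Helper from this answer, repeated so the example is self-contained
myfun <- function(series, value) {
  tmp <- rle(series)
  runs <- tmp$lengths[tmp$values == value]
  if (length(runs) == 0) return(0)
  max(runs)
}

# Subset of the illustrative table from the question
Points <- data.frame(
  Player = c("Alpha", "Delta", "Foxtrot"),
  MayPts = c(0, 0, 1), JunPts = c(0, 1, 0),
  JulPts = c(1, 1, 0), AugPts = c(0, 2, 0),
  SepPts = c(2, 2, 0), OctPts = c(2, 2, 0),
  NovPts = c(2, 0, 2), DecPts = c(0, 0, 1)
)

# Long format: one row per player and month, kept in month order
shutouts <- Points %>%
  gather(months, Pts, MayPts:DecPts) %>%
  group_by(Player) %>%
  summarise(ConsecutiveShutouts = myfun(tail(Pts, 5), 0))

# Attach the per-player result to the original wide data frame
Points <- left_join(Points, shutouts, by = "Player")
```

left_join() keeps the rows of Points in their original order, so the new column lines up with the existing month columns.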

Answer 1 (score: 0)

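If you want to sum based on some condition (for example, summing only points above 1), you can restrict the sum to rows whose value is greater than that threshold.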

library(data.table)
Points <- as.data.table(Points)
# Reshape to long format: one row per player and month
Points <- melt(Points, id.vars = "Player", variable.name = "Month", value.name = "PTs")
Points <- Points[PTs > 1, list(PTs = sum(PTs, na.rm = TRUE)), by = "Player"] # change "> 1" if you prefer a different threshold