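I'm using R. I have a data frame with one row per player and one column per month holding the points they scored (illustrative data with random values below). I want to add a new column (Points$ConsecutiveShutouts) containing the longest consecutive streak of a specified point total within the past 5 months.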
Points <- data.frame(
  "Player" = c("Alpha", "Beta", "Charlie", "Delta", "Echo", "Foxtrot", "Gamma"),
  "MayPts" = floor(runif(7, 0, 3)),
  "JunPts" = floor(runif(7, 0, 3)),
  "JulPts" = floor(runif(7, 0, 3)),
  "AugPts" = floor(runif(7, 0, 3)),
  "SepPts" = floor(runif(7, 0, 3)),
  "OctPts" = floor(runif(7, 0, 3)),
  "NovPts" = floor(runif(7, 0, 3)),
  "DecPts" = floor(runif(7, 0, 3))
)
Player MayPts JunPts JulPts AugPts SepPts OctPts NovPts DecPts
Alpha 0 0 1 0 2 2 2 0
Beta 1 0 1 1 1 1 1 2
Charlie 1 2 2 0 2 1 1 0
Delta 0 1 1 2 2 2 0 0
Echo 1 1 0 2 1 2 0 1
Foxtrot 1 0 0 0 0 0 2 1
Gamma 2 0 1 1 0 2 0 1
I tried using rle() on Points:
# Establish the start and end months
StartMonth <- which(colnames(Points) == "SepPts")
EndMonth <- which(colnames(Points) == "DecPts")
# Find total of consecutive months with 0 points
Points$ConsecutiveShutOuts <- max(rle(Points[ ,StartMonth:EndMonth] == 0), lengths[!values])
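Doing this, I end up with the error "'x' must be a vector of an atomic type".
Any suggestions on what I'm doing wrong and how to fix it? Or an alternative approach?
Thanks in advance! [Beginner here, so I hope I've asked this the right way :)]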
Answer 0 (score: 1)
I would also use long format. I'd first create a function like this.
# Longest run of `value` in `series`; returns 0 if `value` never occurs
myfun <- function(series, value) {
  tmp <- rle(series)
  runs <- tmp$lengths[tmp$values == value]
  if (length(runs) == 0) return(0)
  else return(max(runs))
}
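To see what the helper relies on, here is a small sketch of rle() applied to a plain vector. The values are Foxtrot's last five months (Aug to Dec) copied from the sample table in the question; the variable names are only illustrative.

```r
# Foxtrot's AugPts..DecPts from the sample table in the question
foxtrot <- c(0, 0, 0, 2, 1)

r <- rle(foxtrot)
r$lengths   # 3 1 1  (length of each run)
r$values    # 0 2 1  (value repeated in each run)

# Keep only the runs of zeros, then take the longest one
zero_runs <- r$lengths[r$values == 0]
longest_zero_run <- if (length(zero_runs) == 0) 0 else max(zero_runs)
longest_zero_run   # 3
```

Note that rle() expects an atomic vector, which is why it is applied to one player's values at a time rather than to a whole data frame or matrix.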
With tidyr / dplyr, you can then proceed as follows:
library(dplyr)
library(tidyr)
Points %>%
  gather(months, Pts, MayPts:DecPts) %>%
  group_by(Player) %>%
  summarise(x = myfun(tail(Pts, 5), 0))
# For the past 5 months: the longest run of consecutive zeros for each player.
Of course, you can join the result back to the original wide-format data frame if you prefer.
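If only the new column on the wide data frame is needed, the reshape and join can probably be skipped by applying the helper row by row. The following is a minimal base-R sketch of that alternative, not the tidyr / dplyr approach above. It assumes the month columns are in chronological order after Player, uses the fixed values from the sample table in the question instead of random ones, and repeats myfun so the snippet runs on its own.

```r
# Sample values copied from the table in the question (not random)
Points <- data.frame(
  Player = c("Alpha", "Beta", "Charlie", "Delta", "Echo", "Foxtrot", "Gamma"),
  MayPts = c(0, 1, 1, 0, 1, 1, 2),
  JunPts = c(0, 0, 2, 1, 1, 0, 0),
  JulPts = c(1, 1, 2, 1, 0, 0, 1),
  AugPts = c(0, 1, 0, 2, 2, 0, 1),
  SepPts = c(2, 1, 2, 2, 1, 0, 0),
  OctPts = c(2, 1, 1, 2, 2, 0, 2),
  NovPts = c(2, 1, 1, 0, 0, 2, 0),
  DecPts = c(0, 2, 0, 0, 1, 1, 1)
)

# Same helper as above, repeated so this snippet is self-contained
myfun <- function(series, value) {
  tmp <- rle(series)
  runs <- tmp$lengths[tmp$values == value]
  if (length(runs) == 0) 0 else max(runs)
}

# Assumption: every column except Player is a month, in chronological order
last5 <- tail(names(Points)[-1], 5)

# apply() hands each row to myfun as an atomic vector, which rle() accepts
Points$ConsecutiveShutOuts <- apply(Points[, last5], 1, myfun, value = 0)

Points[, c("Player", "ConsecutiveShutOuts")]
```

With the sample values, Foxtrot gets 3 (Aug to Oct are all zero) and Beta gets 0 (no zero in the last five months).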
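Answer 1 (score: 0)
If you want to sum based on some condition (for example, summing only point values above 1), you can restrict the sum to the rows whose value is greater than that threshold.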
library(data.table)
Points <- as.data.table(Points)
Points <- melt(Points, id.vars = "Player", variable.name = "Month", value.name = "PTs")
Points <- Points[PTs > 1, list(PTs = sum(PTs, na.rm = TRUE)), by = "Player"] # change "> 1" if you prefer a different threshold