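I have a CSV file with every golf score recorded by player, tournament, and date. I would like to create a column that calculates the average of the previous 50 scores BY player, using the date field. It needs to be a running average per player.
Example table: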
PLAYER,SCORE,Tournament,ROUNDDATE,Observation,ID
Matthew Fitzpatrick,60,KLM Open,42258,1,1
Jaco Van Zyl,61,Turkish Airlines Open,42306,1,2
Paul Lawrie,61,KLM Open,42257,1,3
Wade Ormsby,61,KLM Open,42257,1,4
Callum Shinkwin,62,Shenzhen International,42483,1,5
Danny Willett,62,Omega European Masters,42209,1,6
Joakim Lagergren,62,Alfred Dunhill Links Championship,42280,1,7
I tried this code, but it just returns the exact same scores instead of the average:
get.mav <- function(bp, n = 50) {
  require(zoo)
  if (is.na(bp[1])) bp[1] <- mean(bp, na.rm = TRUE)
  bp <- na.locf(bp, na.rm = FALSE)
  if (length(bp) …
}
test <- with(test, test[order(PLAYER, ROUNDDATE), ])
test$SCORE_UPDATED <-
  unlist(aggregate(SCORE ~ PLAYER, test, get.mav, na.action = NULL, n = 50)$SCORE)
test
Answer 0 (score: 0)
You can sort by date and then index the first 50 rows. Here is a simple example:
# Sample Data
dat <- data.frame(date = seq(as.Date("2000/1/1"), by = "day", length.out = 365),
                  score = round(rnorm(365, 70, 5)))
# Sort newest first, then take the mean of the 50 most recent scores
dat <- dat[order(dat$date, decreasing = TRUE),]
mean(dat$score[1:50])
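The example above gives one overall mean, not a running average per player. A minimal base R sketch of the per-player version follows. It assumes that "previous 50" means up to 50 rounds strictly before the current one (so a player's first round gets NA), and that fewer than 50 earlier rounds are averaged as they are. The helper name prev_mean, the column AVG_PREV, and the toy scores are made up for illustration; only PLAYER, SCORE, and ROUNDDATE come from the question.

```r
# Hypothetical helper: mean of up to `n` values preceding each position.
# Assumption: the current round is excluded from its own average.
prev_mean <- function(x, n = 50) {
  vapply(seq_along(x), function(i) {
    if (i == 1) return(NA_real_)        # no earlier rounds to average
    mean(x[max(1, i - n):(i - 1)])      # window of at most n earlier rounds
  }, numeric(1))
}

# Toy data with invented scores, using the question's column names
test <- data.frame(
  PLAYER    = c("A", "B", "A", "B", "A"),
  SCORE     = c(70, 68, 72, 66, 74),
  ROUNDDATE = c(42257, 42257, 42258, 42258, 42259)
)

# Put each player's rounds in date order, then apply the helper per player
test <- test[order(test$PLAYER, test$ROUNDDATE), ]
test$AVG_PREV <- ave(test$SCORE, test$PLAYER,
                     FUN = function(s) prev_mean(s, n = 50))
test
```

ave() returns a vector the same length as its input, so the result can be assigned directly as a new column. aggregate() returns one row per group instead, which is why it is awkward for a row-by-row running value. If the current round should be included in the average, the window inside prev_mean would need to end at i rather than i - 1.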