I created the following data frame:
Group <- c('A','A','A','B','B','B','B','C','C','C')
YearWeek <-c('201401','201401','201401','201401','201401','201401','201401','201401','201401','201401')
Score1 <- c(404,440,395,500,450,476,350,500,600,575)
Group <- c('A','A','A','B','B','B','B','C','C','C','A','A','A','B','B','B','B','C','C','C')
YearWeek <-c('201401','201401','201401','201401','201401','201401','201401','201401','201401','201401','201402','201402','201402','201402','201402','201402','201402','201402','201402','201402')
Score1 <-c(404,440,395,500,450,476,350,500,600,575,460,445,400,508,470,422,368,555,700,634)
employee <- c(1:20)
employ.data <- data.frame(employee, Group, YearWeek, Score1)
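For each level of "YearWeek", I want to compute the mean of group 'A' (my control group), subtract it from the Score1 of every employee in the same YearWeek (control-group employees included), and add the result to the data frame as a new variable, "Difference".
I first tried to get the mean of group 'A' (the control-group employees), but got the following error: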
CTRLScore <- as.data.frame(employ.data[, j=list(mean(Score1),by = list(YearWeek,Group,"A"))])
Error in .subset(x, j) : invalid subscript type 'list'
In addition: Warning message:
In `[.data.frame`(employ.data, , j = list(mean(Score1), by = list(YearWeek, :
named arguments other than 'drop' are discouraged
Answer 0 (score: 2)
Here is a strategy that I think will work.
First, compute the group A mean for each YearWeek:
ctrlmeans <- with(subset(employ.data, Group=="A"), tapply(Score1, YearWeek, mean))
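This returns a named vector. We can then use the YearWeek column of the data.frame to look up values in that table and subtract the matching mean. We can do this with
Difference <- employ.data$Score1-ctrlmeans[employ.data$YearWeek]
Then add it back to the data.frame: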
employ.data$Difference <- Difference
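Putting the steps above together, here is a minimal self-contained sketch. It rebuilds the question's data with rep() for brevity and wraps the lookup in as.character(), which is my own addition: indexing a named vector with a factor uses the factor's integer codes rather than its labels, so converting to character makes the lookup go by name whether YearWeek is stored as a factor or as a character column.

```r
# Rebuild the question's data: 20 employees, 3 groups, 2 weeks
employ.data <- data.frame(
  employee = 1:20,
  Group    = rep(c("A","A","A","B","B","B","B","C","C","C"), times = 2),
  YearWeek = rep(c("201401", "201402"), each = 10),
  Score1   = c(404,440,395,500,450,476,350,500,600,575,
               460,445,400,508,470,422,368,555,700,634),
  stringsAsFactors = FALSE
)

# Mean of the control group "A" per YearWeek, named by YearWeek
ctrl      <- subset(employ.data, Group == "A")
ctrlmeans <- tapply(ctrl$Score1, ctrl$YearWeek, mean)

# Look up each row's control mean by name and subtract it.
# as.numeric() drops the names/dim that tapply() attaches.
employ.data$Difference <- employ.data$Score1 -
  as.numeric(ctrlmeans[as.character(employ.data$YearWeek)])

head(employ.data)
```

With this data the control means are 413 for 201401 and 435 for 201402, so the differences within group A sum to zero in each week, which is a quick sanity check.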
Answer 1 (score: 0)
This seems to work for me:
library(reshape)
melted<-melt(employ.data)
casted<-cast(melted,formula=Group+YearWeek~variable,subset=variable=="Score1",fun.aggregate=mean)
#Print Out
casted
# Holder variables
addColumn <- NULL
i<-0
for(i in 1:nrow(employ.data))
{
score <- employ.data[i,]$Score1
group<-employ.data[i,]$Group
yearWeek <- employ.data[i,]$YearWeek
# Use the control group's mean (group "A"), not the employee's own group mean
sub<-casted[casted$Group == "A",]
meanScore<-sub[sub$YearWeek %in% yearWeek,]$Score1
addColumn <- c(addColumn,score-meanScore)
}
# Combine
cbind(employ.data,addColumn)
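The row-by-row loop can also be written as a join, which avoids growing a vector inside a loop. The following is only a sketch in base R, not part of the original answer; the names ctrl, CtrlMean and result are illustrative, and the data is rebuilt so the block runs on its own.

```r
# Rebuild the question's data: 20 employees, 3 groups, 2 weeks
employ.data <- data.frame(
  employee = 1:20,
  Group    = rep(c("A","A","A","B","B","B","B","C","C","C"), times = 2),
  YearWeek = rep(c("201401", "201402"), each = 10),
  Score1   = c(404,440,395,500,450,476,350,500,600,575,
               460,445,400,508,470,422,368,555,700,634),
  stringsAsFactors = FALSE
)

# One row per YearWeek holding the control group's mean score
ctrl <- aggregate(Score1 ~ YearWeek,
                  data = subset(employ.data, Group == "A"),
                  FUN  = mean)
names(ctrl)[names(ctrl) == "Score1"] <- "CtrlMean"

# Attach the matching control mean to every employee row, then subtract
result <- merge(employ.data, ctrl, by = "YearWeek")
result$Difference <- result$Score1 - result$CtrlMean

# merge() sorts by the key, so restore the original employee order
result <- result[order(result$employee), ]
result
```

Keeping CtrlMean as its own column makes it easy to check which mean was subtracted from each row.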
Answer 2 (score: 0)
A dplyr variant of @MrFlick's answer:
# calculating the means
ctrlmeans <- with(subset(employ.data, Group=="A"), tapply(Score1, YearWeek, mean))
# adding the difference to the data.frame
library(dplyr)
# %.% was dplyr's original chaining operator; current versions use %>%
employ.data <- employ.data %>%
  mutate(Difference = Score1 - as.numeric(ctrlmeans[as.character(YearWeek)]))
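The same calculation can be kept entirely inside dplyr by grouping on YearWeek, so no separate lookup table is needed. This is a sketch that assumes a dplyr version providing the %>% pipe; the data is rebuilt so the block runs on its own.

```r
library(dplyr)

# Rebuild the question's data: 20 employees, 3 groups, 2 weeks
employ.data <- data.frame(
  employee = 1:20,
  Group    = rep(c("A","A","A","B","B","B","B","C","C","C"), times = 2),
  YearWeek = rep(c("201401", "201402"), each = 10),
  Score1   = c(404,440,395,500,450,476,350,500,600,575,
               460,445,400,508,470,422,368,555,700,634),
  stringsAsFactors = FALSE
)

employ.data <- employ.data %>%
  group_by(YearWeek) %>%
  # Within each week, subtract the mean of that week's group "A" scores
  mutate(Difference = Score1 - mean(Score1[Group == "A"])) %>%
  ungroup()

employ.data
```

Because the control mean is computed inside each YearWeek group, there is no indexing by name or by factor code to get wrong.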