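The answer to this question may be simple, but I can't seem to get my head around it.
I have a dataset with the columns year, treatment, treatment level, and value (yield). The treatments are mineral (fertilizer), manure, and compost. I would like to add a column holding a reference value. The reference should be the value (yield) of the mineral treatment for the same year and level. For example: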
DF1<-data.frame(treatment = c("mineral","mineral", "manure","manure","compost","compost","mineral","mineral", "manure","manure", "compost","compost"),
year = c("1990","1990","1990","1990","1990","1990", "1991","1991","1991", "1991","1991","1991"),
level = c("1","2","1","2","1","2","1","2","1","2","1","2"),
value = c("1","2","1.1","2.2","1.3","2.5","3","4","3.2","4.4","3.5","4.8"))
DF1
treatment year level value
mineral 1990 1 1
mineral 1990 2 2
manure 1990 1 1.1
manure 1990 2 2.2
compost 1990 1 1.3
compost 1990 2 2.5
mineral 1991 1 3
mineral 1991 2 4
manure 1991 1 3.2
manure 1991 2 4.4
compost 1991 1 3.5
compost 1991 2 4.8
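Mineral should be the reference. So I would like to add a column called ref which, for 1990, gives the value 1 for all treatments (manure, compost, and mineral) where level is 1, and the value 2 where level is 2. For 1991, the reference should be 3 for all treatments where level is 1, and 4 where level is 2.
Can anyone advise on this? I would be very grateful.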
Answer 0 (score: 1)
You could try
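# Split by year and level, then copy the mineral value of each group into ref
res <- do.call(rbind,
         lapply(split(DF1, list(DF1$year, DF1$level), drop=TRUE),
                function(x){x$ref <- x$value[x$treatment=='mineral']
                            x}))
# Row names now look like "1990.1.3"; the last part is the original row number
indx <- as.numeric(gsub(".*\\.", "", row.names(res)))
res1 <- res[order(indx),]   # restore the original row order
row.names(res1) <- NULL
res1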
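As an alternative to the split/rbind approach above, here is a minimal base R sketch that looks up the mineral value with match() on a combined year/level key. It assumes there is exactly one mineral row per year and level, as in the example data; the names key and is_mineral are introduced here for illustration only.

```r
# Example data from the question (values kept as character, as in the original)
DF1 <- data.frame(
  treatment = rep(rep(c("mineral", "manure", "compost"), each = 2), times = 2),
  year      = rep(c("1990", "1991"), each = 6),
  level     = rep(c("1", "2"), times = 6),
  value     = c("1", "2", "1.1", "2.2", "1.3", "2.5",
                "3", "4", "3.2", "4.4", "3.5", "4.8"),
  stringsAsFactors = FALSE
)

# Combined year/level key for every row (hypothetical helper name)
key <- paste(DF1$year, DF1$level)

# Rows that hold the reference (mineral) values
is_mineral <- DF1$treatment == "mineral"

# For each row, find the mineral row with the same key and take its value.
# Assumption: one mineral row per year/level combination.
DF1$ref <- DF1$value[is_mineral][match(key, key[is_mineral])]

DF1
```

Because nothing is split or merged, the rows stay in their original order and no reordering step is needed.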
Or using data.table
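library(data.table)
DT <- as.data.table(DF1)
# One row per year/level holding the mineral value as ref
DT1 <- DT[treatment=='mineral', list(ref=value), by=list(year, level)]
DT[,indx:=1:.N]           # remember the original row order
setkey(DT, year, level)   # sorts DT and sets the join key
# Join ref onto DT, restore the original order, then drop the helper column
DT[J(DT1)][order(indx),][,indx:=NULL][]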
# treatment year level value ref
#1: mineral 1990 1 1 1
#2: mineral 1990 2 2 2
#3: manure 1990 1 1.1 1
#4: manure 1990 2 2.2 2
#5: compost 1990 1 1.3 1
#6: compost 1990 2 2.5 2
#7: mineral 1991 1 3 3
#8: mineral 1991 2 4 4
#9: manure 1991 1 3.2 3
#10: manure 1991 2 4.4 4
#11: compost 1991 1 3.5 3
#12: compost 1991 2 4.8 4
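With a reasonably recent version of data.table, the same result can probably be obtained with an update join, which adds ref by reference and leaves the row order untouched. This is only a sketch under the assumption that the on= argument is available (data.table 1.9.6 or later) and that each year/level pair has a single mineral row.

```r
library(data.table)

# Example data from the question (values kept as character, as in the original)
DF1 <- data.frame(
  treatment = rep(rep(c("mineral", "manure", "compost"), each = 2), times = 2),
  year      = rep(c("1990", "1991"), each = 6),
  level     = rep(c("1", "2"), times = 6),
  value     = c("1", "2", "1.1", "2.2", "1.3", "2.5",
                "3", "4", "3.2", "4.4", "3.5", "4.8"),
  stringsAsFactors = FALSE
)

DT <- as.data.table(DF1)

# Lookup table: the mineral value for each year/level, renamed to ref
DT1 <- DT[treatment == "mineral", .(year, level, ref = value)]

# Update join: for every row of DT, pull ref from the matching row of DT1.
# No setkey() is called, so DT keeps its original row order.
DT[DT1, ref := i.ref, on = .(year, level)]

DT[]
```

This avoids the helper indx column, because the table is never re-sorted.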