Question

The answer to this question is probably simple, but I can't seem to get my head around it.

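I have a dataset with the columns year, treatment, treatment level, and value (yield). The treatments are mineral (fertilizer), manure, and compost. I want to add a column holding a reference value. The reference should be the value (yield) of the mineral treatment for the same year and level. For example: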

DF1<-data.frame(treatment = c("mineral","mineral", "manure","manure","compost","compost","mineral","mineral", "manure","manure", "compost","compost"),
            year = c("1990","1990","1990","1990","1990","1990", "1991","1991","1991", "1991","1991","1991"),
            level = c("1","2","1","2","1","2","1","2","1","2","1","2"),
            value = c("1","2","1.1","2.2","1.3","2.5","3","4","3.2","4.4","3.5","4.8"))

DF1
 treatment year level value
mineral 1990     1     1
mineral 1990     2     2
 manure 1990     1   1.1
 manure 1990     2   2.2
compost 1990     1   1.3
compost 1990     2   2.5
mineral 1991     1     3
mineral 1991     2     4
 manure 1991     1   3.2
 manure 1991     2   4.4
compost 1991     1   3.5
compost 1991     2   4.8

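Mineral should be the reference. So I would like to add a column named ref that, for 1990, gives all treatments (manure, compost, and mineral) the value 1 at level 1 and the value 2 at level 2. For 1991, the reference value for all treatments should be 3 at level 1 and 4 at level 2.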

Could anyone advise on this? I would be very grateful.

Answer 1

You could try

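 # Split by year and level, then copy the mineral value into a new ref column in each group
 res <- do.call(rbind,
          lapply(split(DF1, list(DF1$year, DF1$level), drop=TRUE),
                 function(x){x$ref <- x$value[x$treatment=='mineral']
                   x}))
 # Row names now look like "1990.1.3"; the part after the last dot is the original row number
 indx <- as.numeric(gsub(".*\\.", "", row.names(res)))
 res1 <- res[order(indx),]   # restore the original row order
 row.names(res1) <- NULL
 res1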
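
The same lookup can also be sketched in base R without splitting and re-sorting, by building a key from year and level and using match(). This is only an illustrative alternative, not part of the original answer; the helper names mineral, key_all and key_min are made up here, and it assumes there is exactly one mineral row per year and level (match() would silently take the first one otherwise).

```r
# Rebuild the example data from the question (values kept as strings, as in the question)
DF1 <- data.frame(
  treatment = rep(rep(c("mineral", "manure", "compost"), each = 2), times = 2),
  year  = rep(c("1990", "1991"), each = 6),
  level = rep(c("1", "2"), times = 6),
  value = c("1", "2", "1.1", "2.2", "1.3", "2.5", "3", "4", "3.2", "4.4", "3.5", "4.8"),
  stringsAsFactors = FALSE
)

# Keep only the reference rows (assumption: one mineral row per year/level)
mineral <- DF1[DF1$treatment == "mineral", ]

# Build a combined year/level key for every row and for the reference rows
key_all <- paste(DF1$year, DF1$level)
key_min <- paste(mineral$year, mineral$level)

# For each row, look up the mineral value that shares its year and level
DF1$ref <- mineral$value[match(key_all, key_min)]
DF1
```

Because match() returns positions in the order of key_all, the original row order is kept and no helper index is needed.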

Or using data.table

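 library(data.table)
 DT <- as.data.table(DF1)
 # Lookup table: one mineral value per year and level
 DT1 <- DT[treatment=='mineral', list(ref=value), by=list(year, level)]
 DT[,indx:=1:.N]          # remember the original row order
 setkey(DT, year, level)  # sorts DT by the join columns
 # Join the lookup onto DT, restore the original order, and drop the helper column
 DT[J(DT1)][order(indx),][,indx:=NULL][]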
 #    treatment year level value ref
 #1:   mineral 1990     1     1   1
 #2:   mineral 1990     2     2   2
 #3:    manure 1990     1   1.1   1
 #4:    manure 1990     2   2.2   2
 #5:   compost 1990     1   1.3   1
 #6:   compost 1990     2   2.5   2
 #7:   mineral 1991     1     3   3
 #8:   mineral 1991     2     4   4
 #9:    manure 1991     1   3.2   3
#10:    manure 1991     2   4.4   4
#11:   compost 1991     1   3.5   3
#12:   compost 1991     2   4.8   4
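
With a data.table version that supports the on= argument, the join can probably be written as an update by reference, which leaves the row order untouched and avoids the helper index. This is a hedged sketch rather than the answer's own method; ref_tab is a hypothetical name, and it again assumes one mineral row per year and level.

```r
library(data.table)

# Rebuild the example data from the question (values kept as strings, as in the question)
DT <- data.table(
  treatment = rep(rep(c("mineral", "manure", "compost"), each = 2), times = 2),
  year  = rep(c("1990", "1991"), each = 6),
  level = rep(c("1", "2"), times = 6),
  value = c("1", "2", "1.1", "2.2", "1.3", "2.5", "3", "4", "3.2", "4.4", "3.5", "4.8")
)

# Lookup table with the mineral value per year and level
ref_tab <- DT[treatment == "mineral", .(year, level, ref = value)]

# Update join: add ref to DT in place, matching rows on year and level
DT[ref_tab, ref := i.ref, on = .(year, level)]
DT
```

Since := modifies DT in place, no setkey() call or re-ordering step is needed here.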
