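I have this table and I want to add a column "New Col" based on a condition. The condition is: when Col3 contains "Peter", "New Col" should hold the Col2 value from that row. This value should be the same for every row with the same ID. The "New Col" column below shows the desired result.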
ID Col2 Col3 New Col
1 B Peter B
1 A Matt B
2 B Peter B
2 B Matt B
2 A Matt B
3 C Peter C
This is what I have so far.
for (j in 2:(j-1)){
if (df$Col3[j] == "Peter"){
df$Newcol[j] = df$Col2[j]
} else {
df$Newcol[j] = df$Newcol[j-1]
}
}
But I get the numbers "6" and "9" instead of the correct values.
Any suggestions? Thanks a lot.
Answer 0: (score: 2)
Using data.table
library(data.table)
setDT(data)
data[,new:= if(any(Col3 == "Peter")) Col2[which(Col3 == "Peter")] else NA, by = ID]
# ID Col2 Col3 new
#1: 1 B Peter B
#2: 1 A Matt B
#3: 2 B Peter B
#4: 2 B Matt B
#5: 2 A Matt B
#6: 3 C Peter C
Using base R lapply
do.call(rbind,
lapply(split(data, data$ID),
function(x){ if(any(x$Col3 == "Peter")){
x$new = x$Col2[which(x$Col3 == "Peter")];
x}}))
# ID Col2 Col3 new
#1.1 1 B Peter B
#1.2 1 A Matt B
#2.3 2 B Peter B
#2.4 2 B Matt B
#2.5 2 A Matt B
#3 3 C Peter C
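As a possible alternative in base R, the per-ID lookup can also be sketched with match() instead of split() and rbind(). This is only an illustration, not part of the original answer: the data is rebuilt here with character columns (an assumption, the original stores them as factors), and the helper name peter is made up for the example. IDs without a "Peter" row get NA.

```r
# Example data rebuilt with character columns (assumption: the original
# data stores Col2 and Col3 as factors).
data <- data.frame(
  ID   = c(1L, 1L, 2L, 2L, 2L, 3L),
  Col2 = c("B", "A", "B", "B", "A", "C"),
  Col3 = c("Peter", "Matt", "Peter", "Matt", "Matt", "Peter"),
  stringsAsFactors = FALSE
)

# Lookup table: one row per ID with the Col2 value from its "Peter" row.
peter <- data[data$Col3 == "Peter", c("ID", "Col2")]
peter <- peter[!duplicated(peter$ID), ]  # keep the first "Peter" per ID

# match() gives NA for IDs that have no "Peter" row, so those rows get NA.
data$new <- peter$Col2[match(data$ID, peter$ID)]
print(data)
```

Unlike the split() version, this keeps the original row names and row order, and it does not drop groups that lack a "Peter" row.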
Data
data = structure(list(ID = c(1L, 1L, 2L, 2L, 2L, 3L), Col2 = structure(c(2L,
1L, 2L, 2L, 1L, 3L), .Label = c("A", "B", "C"), class = "factor"),
Col3 = structure(c(2L, 1L, 2L, 1L, 1L, 2L), .Label = c("Matt",
"Peter"), class = "factor")), .Names = c("ID", "Col2", "Col3"
), class = "data.frame", row.names = c(NA, -6L))
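Note that Col2 and Col3 in this data are factors. That is a likely explanation (an assumption, since the asker's real data is not shown) for the numbers the loop in the question produces: assigning a single factor element into a plain vector stores its integer level code rather than its label. A minimal sketch of the effect, and of a loop that converts to character first:

```r
df <- data.frame(
  ID   = c(1L, 1L, 2L, 2L, 2L, 3L),
  Col2 = factor(c("B", "A", "B", "B", "A", "C")),
  Col3 = factor(c("Peter", "Matt", "Peter", "Matt", "Matt", "Peter"))
)

# Assigning a factor element into a plain vector keeps only the level code.
codes <- rep(NA, nrow(df))
codes[1] <- df$Col2[1]
print(codes[1])  # a number, not the label "B"

# Converting to character first keeps the label.
newcol <- rep(NA_character_, nrow(df))
for (j in seq_len(nrow(df))) {
  if (df$Col3[j] == "Peter") {
    newcol[j] <- as.character(df$Col2[j])
  } else if (j > 1) {
    # Carry the previous value forward; this assumes the "Peter" row
    # comes first within each ID, as in the example table.
    newcol[j] <- newcol[j - 1]
  }
}
df$Newcol <- newcol
print(df)
```

The carry-forward step only works when rows are grouped by ID with the "Peter" row on top; the grouped approaches above do not depend on row order.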