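Add a column based on a condition in other columns

Time: 2015-11-20 14:35:16

Tags: r

I have this table and want to add a column, "New Col", based on a condition: wherever Col3 is "Peter", "New Col" should take the value of Col2 from that row, and every row with the same ID should get that same value. The "New Col" column below shows the desired result.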

ID  Col2    Col3    New Col
1   B   Peter   B
1   A   Matt    B
2   B   Peter   B
2   B   Matt    B
2   A   Matt    B
3   C   Peter   C

This is what I have so far.

for (j in 2:(j-1)){
  if (df$Col3[j] == "Peter"){
    df$Newcol[j] = df$Col2[j]
  } else {
    df$Newcol[j] = df$Newcol[j-1]  
  }
}

But I get the numbers "6" and "9" instead of the actual values.

Any suggestions? Thanks a lot.

1 Answer:

Answer 0: (score: 2)

Using data.table

library(data.table)
setDT(data)
# For each ID, take Col2 from the row where Col3 is "Peter" (NA if the group has none)
data[, new := if (any(Col3 == "Peter")) Col2[which(Col3 == "Peter")] else NA, by = ID]

#   ID Col2  Col3 new
#1:  1    B Peter   B
#2:  1    A  Matt   B
#3:  2    B Peter   B
#4:  2    B  Matt   B
#5:  2    A  Matt   B
#6:  3    C Peter   C

Using base R lapply

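do.call(rbind,
        lapply(split(data, data$ID),
               function(x) {
                 # Col2 value(s) on the "Peter" row(s) of this ID
                 peter <- as.character(x$Col2[x$Col3 == "Peter"])
                 # Use the first match; keep groups without a "Peter" row, filled with NA
                 x$new <- if (length(peter) > 0) peter[1] else NA_character_
                 x
               }))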

#    ID Col2  Col3 new
#1.1  1    B Peter   B
#1.2  1    A  Matt   B
#2.3  2    B Peter   B
#2.4  2    B  Matt   B
#2.5  2    A  Matt   B
#3    3    C Peter   C
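
The same per-ID lookup can also be written without splitting the data. The following is only a sketch under some assumptions, not part of the original answer: it rebuilds the example table inline so that it runs on its own, the helper name `peter` is made up for illustration, and it takes the first "Peter" row of each ID. IDs with no "Peter" row get NA.

```r
# Example table rebuilt inline (same values as in the question)
data <- data.frame(ID   = c(1L, 1L, 2L, 2L, 2L, 3L),
                   Col2 = c("B", "A", "B", "B", "A", "C"),
                   Col3 = c("Peter", "Matt", "Peter", "Matt", "Matt", "Peter"),
                   stringsAsFactors = TRUE)

# Rows where Col3 is "Peter" ("peter" is a hypothetical helper name)
peter <- data[data$Col3 == "Peter", ]

# match() returns, for every row, the position of the first "Peter" row with the same ID;
# as.character() keeps the labels rather than the factor codes
data$new <- as.character(peter$Col2)[match(data$ID, peter$ID)]
```

Here `match()` yields NA for an ID that has no "Peter" row, so such rows end up with NA in `new` instead of being dropped.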

Data

data = structure(list(ID = c(1L, 1L, 2L, 2L, 2L, 3L), Col2 = structure(c(2L, 
1L, 2L, 2L, 1L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
    Col3 = structure(c(2L, 1L, 2L, 1L, 1L, 2L), .Label = c("Matt", 
    "Peter"), class = "factor")), .Names = c("ID", "Col2", "Col3"
), class = "data.frame", row.names = c(NA, -6L))
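
On the numbers seen in the question: a likely cause, assuming Col2 is a factor as in the data above, is that assigning a single factor element into a plain vector stores its integer level code rather than its label. The short sketch below illustrates this with made-up vectors (`col2`, `newcol` and `fixed` are names chosen for the example); converting with `as.character()` first keeps the label.

```r
# A factor with levels "A", "B", "C", so "B" has the integer code 2
col2 <- factor(c("B", "A", "C"))

# Element-wise assignment into a plain vector drops the factor attributes
newcol <- rep(NA, 3)
newcol[1] <- col2[1]    # stores the level code, not the label

# Converting to character before assigning keeps the label
fixed <- rep(NA_character_, 3)
fixed[1] <- as.character(col2[1])
```

In the loop from the question, writing `as.character(df$Col2[j])` on the right-hand side, or converting Col2 to character up front, should avoid the numeric codes under this assumption.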