How to write an if statement with multiple conditions to create a new column in a data frame

Date: 2016-06-22 14:25:44

Tags: r

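I have the following script, which compares the values in two columns of a data frame and creates a new column based on the result (multiple conditions).

The loop runs but does not return the new column with the results. I would turn to ifelse() here for vectorization, but I am not sure how to pass multiple conditions to it (it seems essentially binary).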

for( i in nrow(LeaseDF_Region)){
  if(LeaseDF_Region$HLD_Criteria_1 == "N" && LeaseDF_Region$HLD_Criteria_2 == "N"){
    LeaseDF_Region$HLD_Criteria_3 == "N"
  }else if (LeaseDF_Region$HLD_Criteria_1 == "Y" && LeaseDF_Region$HLD_Criteria_2 == "Y"){
    LeaseDF_Region$HLD_Criteria_3 == "Y"
  }else if (LeaseDF_Region$HLD_Criteria_1 == "Y" && LeaseDF_Region$HLD_Criteria_2 == "N"){
    LeaseDF_Region$HLD_Criteria_3 == "Y"
  }else if(LeaseDF_Region$HLD_Criteria_1 == "N" && LeaseDF_Region$HLD_Criteria_2 == "Y"){
    LeaseDF_Region$HLD_Criteria_3 == "Y"
  }
}

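The data is just randomized N and Y values in columns 1 and 2, so I want to do the following for every row in the data frame:

  • IF col 1 is n and col 2 is n, THEN return n in new col 3
  • IF col 1 is n and col 2 is y, THEN return y in new col 3
  • IF col 1 is y and col 2 is n, THEN return y in new col 3
  • IF col 1 is y and col 2 is y, THEN return y in new col 3

Note: n = no and y = yes.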

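4 answers:

Answer 0 (score: 4)

You can simply use subsetting. First initialize a column, then assign values to it according to your conditions. Here DF is my data frame, and TEMP is the parameter I classify in the new column Control_Temp.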

DF$Control_Temp <- NA
DF$Control_Temp[DF$TEMP <= 50 & DF$TEMP2 == -1] <- 'Y'
DF$Control_Temp[DF$TEMP > 50 & DF$TEMP <= 100 & DF$TEMP2 == -1] <- 'N'
DF$Control_Temp[DF$TEMP > 100 & DF$TEMP2 == -1 ] <- 'Y'
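
Applied to the columns from the question, the same initialize-then-assign pattern might look like the sketch below. The four-row LeaseDF_Region built here is made-up sample data (the asker's real data is not shown), and both_n / any_y are hypothetical helper names.

```r
# Hypothetical sample data standing in for the asker's data frame
LeaseDF_Region <- data.frame(
  HLD_Criteria_1 = c("N", "N", "Y", "Y"),
  HLD_Criteria_2 = c("N", "Y", "N", "Y"),
  stringsAsFactors = FALSE
)

# Element-wise conditions: `&` and `|` compare whole columns at once
both_n <- LeaseDF_Region$HLD_Criteria_1 == "N" & LeaseDF_Region$HLD_Criteria_2 == "N"
any_y  <- LeaseDF_Region$HLD_Criteria_1 == "Y" | LeaseDF_Region$HLD_Criteria_2 == "Y"

# Initialize the new column, then fill it one condition at a time
LeaseDF_Region$HLD_Criteria_3 <- NA_character_
LeaseDF_Region$HLD_Criteria_3[both_n] <- "N"
LeaseDF_Region$HLD_Criteria_3[any_y]  <- "Y"

print(LeaseDF_Region)
```

Rows that match neither condition (for example, unexpected values) would stay NA, which makes them easy to spot afterwards.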

Answer 1 (score: 1)

Try this (using the dplyr package):

library(dplyr)

# Inside mutate() the columns can be referenced by name; assign the result to keep the new column
LeaseDF_Region <- LeaseDF_Region %>% mutate(HLD_Criteria_3 =
                            ifelse(HLD_Criteria_1 == "N" &
                                   HLD_Criteria_2 == "N", "N",
                                   ifelse(HLD_Criteria_1 == "Y" &
                                          HLD_Criteria_2 == "Y", "Y",
                                          ifelse(...))))
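
Since three of the four cases in the question yield "Y", the nested calls can arguably collapse into a single ifelse(). The following is a minimal base R sketch of that idea, not the answerer's code; the LeaseDF_Region built here is made-up sample data.

```r
# Hypothetical sample data standing in for the asker's data frame
LeaseDF_Region <- data.frame(
  HLD_Criteria_1 = c("N", "Y", "Y", "N"),
  HLD_Criteria_2 = c("Y", "N", "Y", "N"),
  stringsAsFactors = FALSE
)

# `|` is the element-wise OR, so the test is evaluated for every row at once
LeaseDF_Region$HLD_Criteria_3 <- ifelse(
  LeaseDF_Region$HLD_Criteria_1 == "Y" | LeaseDF_Region$HLD_Criteria_2 == "Y",
  "Y",  # at least one criterion is "Y"
  "N"   # both criteria are "N"
)

print(LeaseDF_Region)
```

The nested form remains useful when the outcomes differ per combination; the single call only works here because the rule reduces to a logical OR.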

Answer 2 (score: 1)

Similar to Jacob Odom's post, I like subscripting. I think it is a bit cleaner to set everything to "Y" and then map out the "N"s:

LeaseDF_Region$HLD_Criteria_3 <- "Y" # Set all values to "Y"
index_n <- `&`(
    # Map out the "N" indexes with a boolean vector
    LeaseDF_Region$HLD_Criteria_1 == "N",
    LeaseDF_Region$HLD_Criteria_2 == "N"
)
LeaseDF_Region$HLD_Criteria_3[index_n] <- "N" # Assign "N" accordingly
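
One way to check this default-then-override approach against the four rules in the question is to run it on every combination of inputs. The sketch below does that with expand.grid(); the name truth_table is hypothetical and used only for illustration.

```r
# Build all four combinations of the two criteria (hypothetical check data)
truth_table <- expand.grid(
  HLD_Criteria_1 = c("N", "Y"),
  HLD_Criteria_2 = c("N", "Y"),
  stringsAsFactors = FALSE
)

# Default to "Y", then override the rows where both criteria are "N"
truth_table$HLD_Criteria_3 <- "Y"
index_n <- truth_table$HLD_Criteria_1 == "N" & truth_table$HLD_Criteria_2 == "N"
truth_table$HLD_Criteria_3[index_n] <- "N"

print(truth_table)
```

Only the row with "N" in both input columns should end up with "N" in HLD_Criteria_3.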

Answer 3 (score: 1)

Just use data.table:

library(data.table)
dt <- data.table(C1 = sample(c('Y','N'), 10, replace=T), C2 = sample(c('Y','N'), 10, replace=T))

dt[, C3 := ifelse(C1 == 'Y' | C2 == 'Y', 'Y', 'N')]

which gives you

    C1 C2 C3
 1:  Y  N  Y
 2:  N  N  N
 3:  Y  Y  Y
 4:  Y  N  Y
 5:  N  N  N
 6:  N  Y  Y
 7:  N  N  N
 8:  Y  Y  Y
 9:  N  N  N
10:  N  Y  Y