Create a column whose value depends on several other columns

Time: 2018-07-16 16:54:28

Tags: r if-statement

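I am trying to create a new column ("newcol") in a data frame ("data") whose values are determined by the contents of up to two other columns in the same data frame ("B_stance" and "C_stance"). The values in B_stance are "L", "R", "U", or "N". In C_stance they are "L" or "R".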

Please excuse the semi-logical language, but I need R code that fills newcol according to these rules:

if (data$B_stance = "L" AND data$C_stance = "L") then (data$newcol = "N")
if (data$B_stance = "L" AND data$C_stance = "R") then (data$newcol = "Y")
if (data$B_stance = "R" AND data$C_stance = "R") then (data$newcol = "N")
if (data$B_stance = "R" AND data$C_stance = "L") then (data$newcol = "Y")
if (data$B_stance = "U") then (data$newcol = "N")
if (data$B_stance = "N") then (data$newcol = "N")

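I tried to work out whether and how "ifelse" could do this, but I could not find an example that draws on the values of several columns when determining the new value.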

3 answers:

Answer 0 (score: 0)

It may be easier to create a key/value dataset and then join it to the data.

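library(dplyr)

# Lookup table: one row per (B_stance, C_stance) pair with its newcol value
keydat <- data.frame(B_stance = c('L', 'L', 'R', 'R'),
                     C_stance = c('L', 'R', 'R', 'L'),
                     newcol   = c('N', 'Y', 'N', 'Y'),
                     stringsAsFactors = FALSE)

# Join on both key columns; unmatched rows (B_stance "U" or "N") get NA, then "N"
left_join(data, keydat, by = c('B_stance', 'C_stance')) %>%
  mutate(newcol = replace(newcol, is.na(newcol), 'N'))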
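The same key/value idea can also be sketched in base R, without dplyr. This is only an illustration under assumptions: the `demo` data frame and the `lookup` vector are made up for the example, and only the column names come from the question.

```r
# Hypothetical sample data using the column names from the question
demo <- data.frame(B_stance = c("L", "L", "R", "R", "U", "N"),
                   C_stance = c("L", "R", "R", "L", "L", "R"),
                   stringsAsFactors = FALSE)

# Named vector acting as the key/value table; names are "B_stance.C_stance"
lookup <- c(L.L = "N", L.R = "Y", R.R = "N", R.L = "Y")

# Build one key per row and look it up; keys missing from lookup give NA
key <- paste(demo$B_stance, demo$C_stance, sep = ".")
demo$newcol <- unname(lookup[key])

# Rows without a matching key (B_stance "U" or "N") default to "N"
demo$newcol[is.na(demo$newcol)] <- "N"
demo
```

A named vector keeps the rules in one place, much like the joined table does, so adding or changing a rule means editing a single entry.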

Answer 1 (score: 0)

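In base R, the ifelse function is the most useful tool for cases like this. The dplyr package also provides the stricter, type-checked if_else function and the case_when function. ifelse works element by element: where its first argument (the test) is TRUE it returns the corresponding value of the second argument, and where the test is FALSE it returns the corresponding value of the third argument.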
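As a minimal sketch of that element-wise behaviour, the vector `x` below is hypothetical and exists only for this illustration. It also shows that an NA in the test produces an NA in the result.

```r
# Hypothetical vector, used only to illustrate how ifelse() behaves
x <- c("L", "R", NA, "L")

# The test is evaluated element by element; an NA test yields an NA result
res <- ifelse(x == "L", "left", "other")
res
```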

data <- read.table(text="
B_stance C_stance
L R
L L
U X
R L
R R
N X
X X
", header= TRUE)


data$newcol <- ifelse(data$B_stance == "L" & data$C_stance == "L", "N",
               ifelse(data$B_stance == "L" & data$C_stance == "R", "Y",
               ifelse(data$B_stance == "R" & data$C_stance == "R", "N",
               ifelse(data$B_stance == "R" & data$C_stance == "L", "Y",
               ifelse(data$B_stance == "U", "N",
               ifelse(data$B_stance == "N", "N",
                      NA))))))

data

#   B_stance C_stance newcol
# 1        L        R      Y
# 2        L        L      N
# 3        U        X      N
# 4        R        L      Y
# 5        R        R      N
# 6        N        X      N
# 7        X        X   <NA>
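Since the rules in the question give "Y" only when B_stance and C_stance are both "L" or "R" and differ from each other, the nested calls could probably be collapsed into a single condition. This is a sketch, not the answer's method: the `stances` data frame is hypothetical, and unlike the nested version it returns "N" rather than NA for unexpected values.

```r
# Hypothetical sample data using the column names from the question
stances <- data.frame(B_stance = c("L", "L", "U", "R", "R", "N"),
                      C_stance = c("R", "L", "L", "L", "R", "R"),
                      stringsAsFactors = FALSE)

# TRUE only when both columns hold "L" or "R" and the two values differ
differ <- stances$B_stance %in% c("L", "R") &
  stances$C_stance %in% c("L", "R") &
  stances$B_stance != stances$C_stance

# Every other combination, including B_stance "U" or "N", becomes "N"
stances$newcol <- ifelse(differ, "Y", "N")
stances
```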

Answer 2 (score: 0)

With dplyr, you can use case_when. If you have many conditions, it is a bit cleaner than nested if_else calls.

library(dplyr)

df <- data.frame(
  B_stance = c('L', 'L', 'R', 'R'),
  C_stance = c('L', 'R', 'R', 'L'),
  stringsAsFactors = FALSE
)

df %>% mutate(
  newcol = case_when(
    B_stance == 'U'                   ~ 'N',
    B_stance == 'N'                   ~ 'N',
    B_stance == 'L' & C_stance == 'L' ~ 'N',
    B_stance == 'L' & C_stance == 'R' ~ 'Y',
    B_stance == 'R' & C_stance == 'L' ~ 'Y',
    B_stance == 'R' & C_stance == 'R' ~ 'N',
    TRUE                              ~ B_stance
  )
)

#   B_stance C_stance newcol
# 1        L        L      N
# 2        L        R      Y
# 3        R        R      N
# 4        R        L      Y

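Note that the conditions in case_when are checked in order, and for each row the first condition that is TRUE determines the value. The final TRUE acts as a fallback for rows that match none of the earlier conditions.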
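A small sketch of that ordering, assuming dplyr is installed. The vector `x` and the labels "first", "second", and "fallback" are hypothetical and serve only to show which branch is taken.

```r
library(dplyr)

# Hypothetical values, used only to show that the order of conditions matters
x <- c("L", "U", "X")

res <- case_when(
  x %in% c("L", "U") ~ "first",    # matches "L" and "U"
  x == "U"           ~ "second",   # never used: "U" already matched above
  TRUE               ~ "fallback"  # everything not matched so far
)
res
```

Because the first match wins, more specific conditions should generally come before more general ones.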