Logical operators and strings: function error

Asked: 2019-10-22 14:44:25

Tags: r function logical-operators data-cleaning

Below is a minimal reproducible example that triggers the problem:

comb3 <- function(x) {
  if (x == "Unable to do") {
    x = 0
  }
}

Here is my original function:

comb <- function(x) {
  if (x == "Unable to do") {
    x = 0
  } else if (x == "Very difficult to do") {
    x = 1
  } else if (x == "Somewhat difficult to do") {
    x = 2
  } else if (x == "Not difficult") {
    x = 3
  }
}

I am trying to use this function on the column sampled below. I get these warnings:

Warning messages:
1: In if (x == "Unable to do") { :
  the condition has length > 1 and only the first element will be used
2: In if (x == "Very difficult to do") { :
  the condition has length > 1 and only the first element will be used

Here is a sample of what the data in one column looks like:
sample <- c("Unable to do", "Somewhat difficult to do", "Very difficult to do", "Unable to do", "Not difficult","Unable to do","Very difficult to do", "Not difficult", "Unable to do", "Not difficult")        

1 Answer:

Answer 0 (score: 0)

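The warning message describes the problem with your code well. if is a function that expects a logical vector of length 1 as its input. So, to apply a condition to a whole vector, use ifelse instead or, as MrFlick said, case_when with mutate_at.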
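To make the length mismatch concrete, here is a minimal base R sketch. The three-element input is a hypothetical stand-in, not the asker's column. It shows that the comparison yields one logical value per element, which is what if cannot handle and what ifelse is designed for. (From R 4.2.0 onward, passing such a vector to if is an error rather than a warning.)

```r
# Hypothetical three-element input, not the asker's data
x <- c("Unable to do", "Not difficult", "Unable to do")

# `==` is vectorised: it returns one TRUE/FALSE per element of x
cond <- x == "Unable to do"
length(cond)
# [1] 3

# ifelse() consumes the whole logical vector and returns one value per element
scores <- ifelse(cond, 0, NA)
scores
# [1]  0 NA  0
```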

An equivalent version of your function using ifelse would look like this:

comb1 <- function(x) {
  ifelse(x == "Unable to do", 
    0,
    ifelse(x == "Very difficult to do",
      1,
      ifelse(x == "Somewhat difficult to do",
        2,
        ifelse(x == "Not difficult",
          3,
          ## If not match then NA
          NA
        )
      )
    )
  )
}

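Note that this is hard to read because the ifelse calls are nested inside one another. You can avoid that and get the same result by calling sapply with a slightly modified version of your function: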

comb2 <- function(x) {
  sapply(x, function(y) {
    if (y == "Unable to do") {
      0
    } else if (y == "Very difficult to do") {
      1
    } else if (y == "Somewhat difficult to do") {
      2
    } else if (y == "Not difficult") {
      3
    } else {
      ## If no match then NA, as in comb1
      NA
    }
  ## USE.NAMES = FALSE only means that the output vector is not named
  }, USE.NAMES = FALSE)
}
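As a variation on the same element-wise idea, the sketch below swaps sapply for vapply and the if/else chain for switch. The name comb_vapply is hypothetical and not part of the original answer. vapply checks that every element yields exactly one number, so an unmatched string cannot silently turn the result into a list.

```r
# Hypothetical helper name; a variation on comb2, not the original answer's code
comb_vapply <- function(x) {
  vapply(x, function(y) {
    switch(y,
      "Unable to do" = 0,
      "Very difficult to do" = 1,
      "Somewhat difficult to do" = 2,
      "Not difficult" = 3,
      NA_real_  # unnamed last argument: default for unmatched strings
    )
  }, FUN.VALUE = numeric(1), USE.NAMES = FALSE)
}

comb_vapply(c("Not difficult", "Unable to do", "something else"))
# [1]  3  0 NA
```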

You can also use a factor, which is encoded internally as integers starting at 1, and (ab)use that encoding to convert the strings to numbers:

comb3 <- function(x) {
  fac <- factor(x, 
    levels = c(
      "Unable to do",
      "Very difficult to do",
      "Somewhat difficult to do",
      "Not difficult"
    )
  )
  as.numeric(fac) - 1
}
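The following sketch shows what the factor approach relies on, using a hypothetical input that includes a value outside the declared levels. Each string is replaced by its position in the levels vector, and anything that is not a level becomes NA.

```r
lv <- c("Unable to do", "Very difficult to do",
        "Somewhat difficult to do", "Not difficult")

# Hypothetical input; "Unknown" is deliberately not among the levels
x <- c("Not difficult", "Unable to do", "Unknown")

fac <- factor(x, levels = lv)
codes <- as.integer(fac)  # position of each value in lv, NA when not a level
codes
# [1]  4  1 NA
codes - 1L
# [1]  3  0 NA
```

The order of the levels argument determines the resulting numbers, so it has to list the categories in the intended scoring order.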

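All 3 versions produce the same output, which is a good example of how many ways there are to do the same thing in R. Sometimes that can be a curse rather than a gift.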

sample <- c("Unable to do", "Somewhat difficult to do", "Very difficult to do", "Unable to do", "Not difficult","Unable to do","Very difficult to do", "Not difficult", "Unable to do", "Not difficult")
comb1(sample)
# [1] 0 2 1 0 3 0 1 3 0 3
comb2(sample)
# [1] 0 2 1 0 3 0 1 3 0 3
comb3(sample)
# [1] 0 2 1 0 3 0 1 3 0 3