if_else in dplyr::mutate does not work as expected

Date: 2019-09-16 10:08:21

Tags: r rstudio

You can copy the following code into an R script file and run it:

library(dplyr)  # provides %>% and mutate()

preprocess_brand_version = function(dataset) {
  dataset$brand_version = gsub("^([0-9]+)(\\.[0-9]+)?.*$", "\\1\\2", dataset$brand_version, perl = TRUE)
  dataset = dataset %>% mutate(
    brand_version = ifelse(!(is.na(brand) || is.na(brand_version)), paste(substr(brand, 1, 3), ", ", brand_version, sep = ""), NA)
  )
  dataset$brand_version = as.factor(dataset$brand_version)
  return (dataset)
}

a = data.frame(brand = c("Samsung", "Motorola"), brand_version = c("1.4.3", "6.3"))
b = a
b[1,2] = NA
a
b
preprocess_brand_version(b)

My problem is that when I run it, I get:

> a
     brand brand_version
1  Samsung         1.4.3
2 Motorola           6.3

> b
     brand brand_version
1  Samsung          <NA>
2 Motorola           6.3

> preprocess_brand_version(b)
     brand brand_version
1  Samsung          <NA>
2 Motorola          <NA>

I expected to get "Mot, 6.3" as the new value of the version on the Motorola row.

Does anyone know why if_else is not working properly?

Thanks!

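2 answers:

Answer 0: (score: 1)

You are using the double form of "or", ||, which is not vectorized: it looks only at the first element of each operand and returns a single TRUE or FALSE (recent versions of R raise an error instead when an operand has more than one element). In your data the first row contains an NA, so the whole condition collapses to one FALSE, ifelse returns a single NA, and mutate recycles that NA to every row. Switching to the single form, |, which compares element by element, should fix the problem.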
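To make the difference concrete, here is a small base-R sketch using the two columns from the question. The variable names (`na_brand`, `na_version`, `first_only`, `either_na`) are only illustrative. The scalar case is written with explicit `[1]` indexing to mimic what `||` did on vectors in older R versions, because calling `||` directly on longer vectors is an error in newer ones.

```r
brand <- c("Samsung", "Motorola")
brand_version <- c(NA, "6.3")

na_brand <- is.na(brand)            # FALSE FALSE
na_version <- is.na(brand_version)  # TRUE  FALSE

# Element-wise OR: one logical value per row.
either_na <- na_brand | na_version  # TRUE FALSE

# Scalar OR on the first elements only, which is all `||` looked at
# in older R versions when given longer vectors.
first_only <- na_brand[1] || na_version[1]  # TRUE

# ifelse() returns a result shaped like its test argument, so a
# length-one test yields a single value for the whole column.
scalar_result <- ifelse(!first_only, "kept", NA)  # NA
vector_result <- ifelse(!either_na, "kept", NA)   # NA "kept"
```

With the scalar test, the single NA is what ends up in every row of the mutated column. With the element-wise test, only the row that actually has a missing value becomes NA.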

Answer 1: (score: 1)

Use a single vertical bar as the "or":

preprocess_brand_version = function(dataset) {
  dataset$brand_version = gsub("^([0-9]+)(\\.[0-9]+)?.*$", "\\1\\2", dataset$brand_version, perl = TRUE)
  dataset = dataset %>% mutate(
    brand_version = ifelse(!(is.na(brand) | is.na(brand_version)), paste(substr(brand, 1, 3), ", ", brand_version, sep = ""), NA)
  )
  dataset$brand_version = as.factor(dataset$brand_version)
  return (dataset)
}
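As a quick sanity check, the corrected condition can be tried on the question's `b` in plain base R, so that the snippet runs without dplyr. This is only a sketch of the same logic, not the function above: `ok` is an illustrative name, and `stringsAsFactors = FALSE` is an assumption added here to keep the columns as character vectors.

```r
# Rebuild the question's data frame `b` (first version is missing).
b <- data.frame(
  brand = c("Samsung", "Motorola"),
  brand_version = c(NA, "6.3"),
  stringsAsFactors = FALSE
)

# Keep only the leading "major.minor" part of the version string.
b$brand_version <- gsub("^([0-9]+)(\\.[0-9]+)?.*$", "\\1\\2",
                        b$brand_version, perl = TRUE)

# Element-wise test: TRUE only where both columns are present.
ok <- !(is.na(b$brand) | is.na(b$brand_version))

b$brand_version <- ifelse(
  ok,
  paste0(substr(b$brand, 1, 3), ", ", b$brand_version),
  NA
)

b$brand_version  # NA "Mot, 6.3"
```

The Samsung row stays NA because its version is missing, and the Motorola row gets the value the question expected.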

If it helps, I have a short tutorial on regular expressions on YouTube: https://www.youtube.com/watch?v=YeMC1aNNu-4