I am trying to use dplyr and mutate to compare two columns row by row.
The data frame:
df <- data.frame(ID = c("1234", "1234", "7491", "7319", "321", "321"),
add = c("1234", "1234", "749s1", "73a19", "321", "321"))
Basically, if column ID equals column add, return 1; otherwise return 0:
df %>% mutate(TEST = ifelse(df$ID == df$add, 1, 0))
However, the code above does not seem to work.
Update: the error was caused by factor levels.
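Answer 0 (score: 1)
You have not shared the error, but I think it is caused by the factor levels: the two columns are stored as factors with different level sets, so they cannot be compared directly with ==. Here is an updated solution that converts both columns to character first.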
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- data.frame(ID = c("1234", "1234", "7491", "7319", "321", "321"),
add = c("1234", "1234", "749s1", "73a19", "321", "321"))
df %>% mutate(TEST = ifelse(as.character(ID) == as.character(add),1,0))
#>     ID   add TEST
#> 1 1234  1234    1
#> 2 1234  1234    1
#> 3 7491 749s1    0
#> 4 7319 73a19    0
#> 5  321   321    1
#> 6  321   321    1
Created on 2019-03-06 by the reprex package (v0.2.1)
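To see why the as.character() conversion matters, here is a minimal base R sketch. It builds the two columns as factors explicitly (an assumption that mimics the old data.frame default of stringsAsFactors = TRUE; the vectors id and add are shortened, illustrative versions of the question's columns) and captures the error instead of stopping.

```r
# Illustrative, shortened versions of the two columns, stored as factors.
id  <- factor(c("1234", "7491", "321"))
add <- factor(c("1234", "749s1", "321"))

# Comparing factors whose level sets differ raises an error;
# tryCatch returns the error message as a string instead of stopping.
res <- tryCatch(id == add, error = function(e) conditionMessage(e))
print(res)

# Converting to character compares the labels themselves.
same <- as.character(id) == as.character(add)
print(same)
```

The first comparison fails because the level sets differ ("7491" versus "749s1"), while the character comparison works element by element.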
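You can simplify the solution further with as.numeric, provided the columns are created as character vectors (stringsAsFactors = FALSE):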
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- data.frame(ID = c("1234", "1234", "7491", "7319", "321", "321"),
add = c("1234", "1234", "749s1", "73a19", "321", "321"),
stringsAsFactors = FALSE)
df %>% mutate(TEST = as.numeric(ID == add))
#>     ID   add TEST
#> 1 1234  1234    1
#> 2 1234  1234    1
#> 3 7491 749s1    0
#> 4 7319 73a19    0
#> 5  321   321    1
#> 6  321   321    1
Created on 2019-03-06 by the reprex package (v0.2.1)
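For comparison, the same flag can be computed without dplyr. This is only a sketch of an equivalent base R approach, not part of the answer above; it uses as.integer instead of as.numeric, which is my own choice and gives an integer 0/1 column.

```r
# Same data as in the answer, with strings kept as character vectors.
df <- data.frame(ID  = c("1234", "1234", "7491", "7319", "321", "321"),
                 add = c("1234", "1234", "749s1", "73a19", "321", "321"),
                 stringsAsFactors = FALSE)

# ID == add yields a logical vector; as.integer maps TRUE/FALSE to 1/0.
df$TEST <- as.integer(df$ID == df$add)
print(df)
```

Because the comparison is already vectorised, no ifelse call is needed in either the dplyr or the base R version.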