Comparing values in 2 columns using dplyr

Time: 2019-03-06 07:50:46

Tags: r dplyr

I am trying to compare two columns row by row using dplyr's mutate.

Data frame

df <- data.frame(ID = c("1234", "1234", "7491", "7319", "321", "321"), 
add = c("1234", "1234", "749s1", "73a19", "321", "321"))

Ideally, if column ID equals column add, return 1; otherwise return 0.

df %>% mutate(TEST = ifelse(df$ID == df$add, 1, 0))

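However, the code above does not seem to work.

Update: the error was caused by factor levels.

1 answer:

Answer 0: (score: 1)

You have not shared the error, but I assume it is due to factor levels: before R 4.0.0, data.frame() converts strings to factors by default, and == cannot compare two factors whose level sets differ. Here is an updated solution that converts both columns to character first.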



library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

df <- data.frame(ID = c("1234", "1234", "7491", "7319", "321", "321"), 
                 add = c("1234", "1234", "749s1", "73a19", "321", "321"))



df %>% mutate(TEST = ifelse(as.character(ID) == as.character(add),1,0))
#>     ID   add TEST
#> 1 1234  1234    1
#> 2 1234  1234    1
#> 3 7491 749s1    0
#> 4 7319 73a19    0
#> 5  321   321    1
#> 6  321   321    1

Created on 2019-03-06 by the reprex package (v0.2.1)
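
The factor problem can be reproduced in isolation. The following is a minimal sketch, not part of the original answer: it uses a shortened three-row version of the data and sets stringsAsFactors = TRUE explicitly, which is an assumption made here so that the example behaves the same on R 4.0.0 and later.

```r
# Shortened copy of the question's data; factors are requested explicitly
# (assumption: this mimics the data.frame() default before R 4.0.0).
df <- data.frame(ID  = c("1234", "7491", "321"),
                 add = c("1234", "749s1", "321"),
                 stringsAsFactors = TRUE)

# ID and add have different level sets, so `==` on the factors stops with an
# error. tryCatch() captures the message instead of aborting the script.
res <- tryCatch(df$ID == df$add,
                error = function(e) conditionMessage(e))
print(res)

# Comparing the labels as character strings works element by element.
same <- as.character(df$ID) == as.character(df$add)
print(same)
```

Here res holds the error message rather than a logical vector, while same is an ordinary logical vector that ifelse() or as.numeric() can turn into 1 and 0.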

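You can simplify it further with as.numeric, provided the columns are character vectors rather than factors (note stringsAsFactors = FALSE in the data frame):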

library(dplyr)

df <- data.frame(ID = c("1234", "1234", "7491", "7319", "321", "321"), 
                 add = c("1234", "1234", "749s1", "73a19", "321", "321"),
                 stringsAsFactors = FALSE)



df %>% mutate(TEST = as.numeric(ID == add))
#>     ID   add TEST
#> 1 1234  1234    1
#> 2 1234  1234    1
#> 3 7491 749s1    0
#> 4 7319 73a19    0
#> 5  321   321    1
#> 6  321   321    1

Created on 2019-03-06 by the reprex package (v0.2.1)
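
If you cannot control how the data frame was built, the two ideas above can be combined. This is only a sketch under the assumption that the columns may arrive either as factors or as character vectors; the shortened data and the explicit stringsAsFactors = TRUE are illustrative choices, not part of the original question.

```r
library(dplyr)

# Shortened copy of the question's data, built with factor columns on purpose
# (assumption: the worst case, where a direct `==` would fail).
df <- data.frame(ID  = c("1234", "7491", "321"),
                 add = c("1234", "749s1", "321"),
                 stringsAsFactors = TRUE)

# as.character() is a no-op for character columns and recovers the labels of
# factor columns; as.numeric() then maps TRUE/FALSE to 1/0.
out <- df %>%
  mutate(TEST = as.numeric(as.character(ID) == as.character(add)))

print(out)
```

Because as.character() leaves character columns unchanged, the same call works whether or not stringsAsFactors was set, at the cost of a slightly longer expression.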