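In a new column, I want to flag the missing records that get updated by each join.
Purpose: I have a dataset with missing classification codes. To fill in the missing values, I ran several left_join/coalesce
operations that replaced the NAs with the correct codes. I want to track which values were changed in each iteration.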
# DATA
df <- tibble(
  x = c(1, 2, 3, NA, NA), # original data
  y = c(1, NA, 3, 4, NA)  # new data from the join
)
# A tibble: 5 x 2
      x     y
  <dbl> <dbl>
1     1     1
2     2    NA
3     3     3
4    NA     4
5    NA    NA
I would like to see...
# A tibble: 5 x 2
      x changed
  <dbl> <chr>
1     1 no.change
2     2 no.change
3     3 no.change
4     4 corrected
5    NA no.change
Answer 0 (score: 1)
You can use case_when:
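library(tidyverse)

df %>%
  # fill missing x values from y
  mutate(new = coalesce(x, y)) %>%
  # x == new is NA when x was missing, so filled rows fall through to "corrected";
  # rows that are still NA after the join count as unchanged
  mutate(changed = case_when(
    x == new | is.na(new) ~ "no.change",
    TRUE ~ "corrected")) %>%
  select(new, changed) # %>% rename(x = new)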
Result
# A tibble: 5 x 2
#     new changed
#   <dbl> <chr>
# 1     1 no.change
# 2     2 no.change
# 3     3 no.change
# 4     4 corrected
# 5    NA no.change
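The case_when condition above depends on `x == new` evaluating to NA for rows where x was missing. As a possible alternative, the sketch below states the rule directly: a row counts as corrected only when x is NA and y supplies a value. It is not part of the original answer. It assumes the same example columns `x` and `y`, and it computes the flag before overwriting `x`, so no rename is needed afterwards.

```r
library(dplyr)
library(tibble)

# Same example data as in the question
df <- tibble(
  x = c(1, 2, 3, NA, NA), # original data
  y = c(1, NA, 3, 4, NA)  # new data from the join
)

result <- df %>%
  mutate(
    # Flag first, while x still holds its pre-join values:
    # corrected means x was missing and y provides a replacement.
    changed = if_else(is.na(x) & !is.na(y), "corrected", "no.change"),
    # Then fill the missing x values from y.
    x = coalesce(x, y)
  ) %>%
  select(x, changed)

print(result)
```

For several joins in a row, the same two mutate steps could be repeated after each left_join with a different flag column name per iteration (for example `changed_1`, `changed_2`; these names are only illustrative), so that each round of corrections stays traceable.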