Update data using a left join after filtering

Time: 2017-07-17 04:25:25

Tags: r dplyr left-join

I have the data below. When names in x1 is NA, I want to fill x1's value with the value from x2:

x1 <- data.frame(names = c('a','b', NA, NA, 'd'),
             match = c('a1', 'a2', 'a10', 'a10', 'a4'), 
             value = rnorm(5))
x2 <- data.frame(match = c('a10','a11'), value = rnorm(2))

The following pseudocode throws an error:

x1 %>% 
  mutate( value = ifelse(is.na(names), left_join(x2, by = 'match'), value))

I know this is because of the left_join, but I don't know how to write it correctly. In general, how do I update data after filtering, using left_join (or value maps)?

2 answers:

Answer 0: (score: 3)

It may be easier to perform the left_join first, and then replace the numbers in value based on names. x3 is the final output.

library(dplyr)

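# tibble() replaces data_frame(), which is deprecated
x1 <- tibble(names = c('a','b', NA, NA, 'd'),
             match = c('a1', 'a2', 'a10', 'a10', 'a4'),
             value = rnorm(5))
x2 <- tibble(match = c('a10','a11'), value = rnorm(2))

# Join first: both tables have a `value` column, so the join
# suffixes them as value.x (from x1) and value.y (from x2).
# Then take x2's value wherever names is NA.
x3 <- x1 %>%
  left_join(x2, by = "match") %>%
  mutate(value.x = ifelse(is.na(names), value.y, value.x)) %>%
  select(names, match, value = value.x)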
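One edge case the approach above does not cover: if a row has NA in names but its match key is absent from x2, the joined value.y is NA and would overwrite the original value. The following is a minimal sketch of one way to guard against that, assuming you would rather keep x1's value in that situation. The numbers are made-up toy values (not the rnorm data from the question), chosen so the result is easy to check by eye.

```r
library(dplyr)

# Toy data (hypothetical values for illustration only)
x1 <- tibble(names = c("a", "b", NA, NA, NA),
             match = c("a1", "a10", "a10", "a10", "a4"),
             value = c(1, 2, 3, 4, 5))
x2 <- tibble(match = c("a10", "a11"), value = c(100, 200))

x3 <- x1 %>%
  left_join(x2, by = "match") %>%
  # Where names is NA, prefer x2's value; coalesce() falls back to
  # x1's value when the key had no match in x2 (value.y is NA).
  mutate(value = if_else(is.na(names), coalesce(value.y, value.x), value.x)) %>%
  select(names, match, value)

x3
```

Here row 2 keeps its value because names is not NA, and row 5 keeps its value because 'a4' has no counterpart in x2.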

Answer 1: (score: 2)

Here is an option using data.table: we convert the 'data.frame' to a 'data.table' (setDT(x1)), join on 'match', and assign the 'value' column of 'x2' (i.e. i.value) to 'value' in 'x1'.

library(data.table)
setDT(x1)[x2, value := i.value, on = .(match)]
x1
#   names match      value
#1:     a    a1 -0.5458808
#2:     b    a2  0.5365853
#3:    NA   a10 -0.3432662
#4:    NA   a10 -0.3432662
#5:     d    a4  0.8474600
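Note that the join above updates every row of x1 whose match key appears in x2, whether or not names is NA. With the question's data the two coincide, because only the NA rows have the key 'a10'. If the update should be restricted to rows where names is NA, one possible sketch is shown below. It assumes a data.table version that provides fifelse(), and it uses made-up toy values rather than the seeded data.

```r
library(data.table)

# Toy data (hypothetical values for illustration only)
x1 <- data.table(names = c("a", "b", NA, NA, "d"),
                 match = c("a1", "a10", "a10", "a10", "a4"),
                 value = c(1, 2, 3, 4, 5))
x2 <- data.table(match = c("a10", "a11"), value = c(100, 200))

# Update join: for matched rows, take x2's value (i.value) only when
# names is NA; otherwise keep x1's existing value.
x1[x2, on = .(match), value := fifelse(is.na(names), i.value, value)]

x1
```

In this toy example row 2 shares the key 'a10' but has a non-NA name, so it is left unchanged.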

Data

set.seed(24)
x1 <- data.frame(names = c('a','b', NA, NA, 'd'),
         match = c('a1', 'a2', 'a10', 'a10', 'a4'), 
         value = rnorm(5))
set.seed(49)
x2 <- data.frame(match = c('a10','a11'), value = rnorm(2))