How do I join a data frame to itself within a dplyr chain?

Date: 2016-09-14 16:29:37

Tags: r dplyr magrittr

Occasionally I need to join a data frame to a (usually modified) version of itself within a dplyr chain. Something like this:

library(dplyr)

df <- data.frame(
     id = c(1,2,3)
   , status = c('foo','bar','meh')
   , spouseid = c(4,3,2)
)


df %>% 
  filter( status == 'foo' | status == 'bar') %>% 
  # join the filtered table to itself using the dot as the right-hand side
  left_join(., by = c('id' = 'spouseid'))

When I try that, I get: Error in is.data.frame(y) : argument "y" is missing, with no default

1 answer:

Answer 0 (score: 6)

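The problem is that using the dot only changes where the left-hand side is placed, so written the way it is above, the lhs is passed to left_join() just once (as x) and y is never supplied. To use the lhs on both the left and right sides of the join, use the dot twice: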

df %>% 
  filter( status == 'foo' | status == 'bar') %>% 
  # the first dot is x argument and the second dot is the y argument
  left_join(
      x = . 
    , y = . 
    , by = c('id' = 'spouseid')
  )

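This way, you pass the lhs to both arguments of left_join() explicitly, rather than relying on magrittr's implicit placement of the lhs as the first argument, as you normally would.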
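
For comparison, here is a minimal self-contained sketch of two other ways to write the same self-join. It assumes the same example data as in the question. The names joined_dots, filtered and joined_named are only illustrative, and status %in% c(...) stands in for the two == comparisons. The first form passes the dot twice positionally. The second avoids the dot by naming the intermediate result.

```r
library(dplyr)

df <- data.frame(
  id = c(1, 2, 3),
  status = c("foo", "bar", "meh"),
  spouseid = c(4, 3, 2)
)

# Positional form: because the dot appears as a direct argument,
# magrittr does not also insert the lhs as an extra first argument.
joined_dots <- df %>%
  filter(status %in% c("foo", "bar")) %>%
  left_join(., ., by = c("id" = "spouseid"))

# Same join without the dot: store the filtered table under a
# (hypothetical) name and pass it as both x and y.
filtered <- df %>% filter(status %in% c("foo", "bar"))
joined_named <- left_join(filtered, filtered, by = c("id" = "spouseid"))
```

Both forms should give the same result. The named intermediate is arguably easier to read when the modification step is long, at the cost of breaking the chain.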