Is there an R function to select the common values of 2 data frames?

Time: 2019-06-12 11:32:23

Tags: r dataframe select merge match

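I am trying to select the values that two data frames have in common. I have a big_df and a small_df.

What I want is a data frame containing only the rows whose "ID" value appears in both data frames, and I want to keep only the columns of big_df, not those of small_df.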

library(dplyr)
df3 <- merge(big_df, small_df, by = "ID")

> df3
  ID Age Name Colour
1  1  21    a   blue
2  4  20    d  green
3  8  87    h    red
4  9   9    i  black

big_df <- data.frame("ID" = 1:10, "Age" = c(21,15,1,20,34,45,67,87,9,77), "Name" = c("a","b","c","d","e","f","g","h","i","l"))


> big_df
   ID Age Name
1   1  21    a
2   2  15    b
3   3   1    c
4   4  20    d
5   5  34    e
6   6  45    f
7   7  67    g
8   8  87    h
9   9   9    i
10 10  77    l

small_df <- data.frame("ID" = c(1,4,8,9), "Colour" = c("blue","green","red","black"))


> small_df
  ID Colour
1  1   blue
2  4  green
3  8    red
4  9  black

What I want is this, without the colour information:

> df3
  ID Age Name 
1  1  21    a   
2  4  20    d  
3  8  87    h   
4  9   9    i  

2 answers:

Answer 0: (score: 2)

dplyr's semi_join() is designed for exactly this purpose.
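
The answer names the function but does not show a call. As a minimal sketch, assuming dplyr is installed and the data frames are built as in the question, it might be used like this. semi_join() keeps the rows of its first argument that have a match in the second, and returns only the first argument's columns.

```r
library(dplyr)

# Data frames as defined in the question
big_df <- data.frame(ID = 1:10,
                     Age = c(21, 15, 1, 20, 34, 45, 67, 87, 9, 77),
                     Name = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "l"))
small_df <- data.frame(ID = c(1, 4, 8, 9),
                       Colour = c("blue", "green", "red", "black"))

# Keep the rows of big_df whose ID also appears in small_df.
# No columns from small_df (here, Colour) are added to the result.
df3 <- semi_join(big_df, small_df, by = "ID")
df3
```

Because this is a filtering join, a repeated ID in small_df would not duplicate rows of big_df.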

Answer 1: (score: 1)

I think what you really need is:

# Check which IDs in big_df also exist in small_df, and keep only those rows
big_df[big_df$ID %in% unique(small_df$ID), ]
#  ID Age Name
#1  1  21    a
#4  4  20    d
#8  8  87    h
#9  9   9    i

So in this case, I don't think you need a join at all.
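
For comparison with the merge() attempt in the question, here is a rough sketch (not part of either answer) that puts the %in% subset next to a merge() restricted to the ID column of small_df, so that Colour never enters the result. The names by_match and by_merge are made up for illustration.

```r
# Data frames as defined in the question
big_df <- data.frame(ID = 1:10,
                     Age = c(21, 15, 1, 20, 34, 45, 67, 87, 9, 77),
                     Name = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "l"))
small_df <- data.frame(ID = c(1, 4, 8, 9),
                       Colour = c("blue", "green", "red", "black"))

# Subset approach from the answer: logical filter on ID membership
by_match <- big_df[big_df$ID %in% small_df$ID, ]

# merge() approach: pass only the ID column of small_df,
# so the result has just the columns of big_df
by_merge <- merge(big_df, small_df["ID"], by = "ID")

# Both approaches select the same IDs for this data
all(by_match$ID == by_merge$ID)
```

One difference to keep in mind: if small_df contained the same ID more than once, merge() would repeat the matching row of big_df, whereas the %in% subset would return it once.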