Compare values in two data frames and return the differences

Time: 2019-10-10 03:12:56

Tags: r dataframe diff

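I want to compare two data frames to find the delta values by "Group". In some cases, the first data frame may have a "Group" that the other does not. In that case, the result should show the value that is present, unchanged.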

df1 <- data.frame(Group = c("A","B","C","D","E","F","G","H"),
              Month.1 = c(10,15,30,24,16,33,27,19),
              Month.2 = c(20,37,12,31,26,22,31,20))

df2 <- data.frame(Group = c("A","B","C","D","E","F","G"),
              Month.1 = c(12,25,34,24,21,30,22),
              Month.2 = c(28,40,36,32,26,17,25))

I'm not quite sure how to approach this. I explored using setdiff, but it only returns the original values, not the differences.

The result should look like this:

result <- data.frame(Group = c("A","B","C","D","E","F","G","H"),
                 Month.1 = c(2,10,4,0,5,-3,-5,19),
                 Month.2 = c(8,3,24,1,0,-5,-6,20))

1 Answer:

Answer 0: (score: 2)

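We can full_join df1 and df2, group_by Group, and take the difference between the values. Because the join uses every shared column, the rows of df1 and df2 stay separate, so a group present in both data frames contributes two rows and diff returns the df2 value minus the df1 value. (Thanks to @Onyambu for suggesting this approach.)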

library(dplyr)

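# Joins on all shared columns: df1's rows come first, then df2's unmatched rows
full_join(df1, df2) %>%
   group_by(Group) %>%
   # Two rows in a group: df2 - df1; a single row: keep the value as is
   summarise_all(~if(n() > 1) diff(.) else .)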

#  Group Month.1 Month.2
#  <chr>   <dbl>   <dbl>
#1 A           2       8
#2 B          10       3
#3 C           4      24
#4 D           0       1
#5 E           5       0
#6 F          -3      -5
#7 G          -5      -6
#8 H          19      20

The equivalent in base R:

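# sort = FALSE keeps df1's rows ahead of df2's (in practice), so diff() is df2 - df1.
# The default sort = TRUE orders the merged rows by value, which reverses the sign
# wherever df2 is smaller than df1 in Month.1 (here groups F and G).
aggregate(. ~ Group, merge(df1, df2, all = TRUE, sort = FALSE), function(x)
          if (length(x) > 1) diff(x) else x)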
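
Both approaches above join on every shared column, so a group whose row is identical in both data frames collapses to a single row and returns the raw value instead of 0. A minimal sketch of an alternative, assuming the key column is Group and that the value columns contain no NA of their own: merge on Group only and subtract column by column. The function name delta_by_group and the suffixes .old and .new are hypothetical, chosen for illustration.

```r
df1 <- data.frame(Group = c("A","B","C","D","E","F","G","H"),
                  Month.1 = c(10,15,30,24,16,33,27,19),
                  Month.2 = c(20,37,12,31,26,22,31,20))

df2 <- data.frame(Group = c("A","B","C","D","E","F","G"),
                  Month.1 = c(12,25,34,24,21,30,22),
                  Month.2 = c(28,40,36,32,26,17,25))

# Hypothetical helper: new - old per group; a group found in only one
# data frame keeps the value it has there.
delta_by_group <- function(old, new, key = "Group") {
  # Value columns are the shared columns other than the key
  value_cols <- setdiff(intersect(names(old), names(new)), key)

  # Outer join on the key only; same-named value columns get suffixes
  m <- merge(old, new, by = key, all = TRUE, suffixes = c(".old", ".new"))

  out <- m[key]
  for (col in value_cols) {
    o <- m[[paste0(col, ".old")]]
    n <- m[[paste0(col, ".new")]]
    # An NA here is assumed to mean the group is absent on that side
    out[[col]] <- ifelse(is.na(o), n, ifelse(is.na(n), o, n - o))
  }
  out
}

result <- delta_by_group(df1, df2)
print(result)
```

With the data from the question, this should reproduce the expected result table, including row H carried over from df1. If the value columns can legitimately contain NA, the missing-group check would need a different signal, such as an indicator column added to each data frame before merging.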