How do I join tables and generate column sums?

Asked: 2015-06-30 11:58:19

Tags: r

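I have several tables with the same structure (two examples are shown below). I want to join them on ID_Position and ID_Name and produce, in the output table, the sums of jan and feb (both columns may contain some NA values).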

ID_Position <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Position <- c("A", "B", "C", "D", "E", "H", "I", "J", "X", "W")
ID_Name <- c(11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
Name <- c("Michael", "Tobi", "Chris", "Hans", "Likas", "Martin", "Seba", "Li", "Sha", "Susi")
jan <- c(10, 20, 30, 22, 23, 2, 22, 24, 26, 28)
feb <- c(10, 30, 20, 12, NA, 3, NA, 22, 24, 26)

df1 <- data.frame(ID_Position, Position, ID_Name, Name, jan, feb)


ID_Position <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Position <- c("A", "B", "C", "D", "E", "H", "I", "J", "X", "W")
ID_Name <- c(11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
Name <- c("Michael", "Tobi", "Chris", "Hans", "Likas", "Martin", "Seba", "Li", "Sha", "Susi")
jan <- c(10, 20, 30, 22, NA, NA, 22, 24, 26, 28)
feb <- c(10, 30, 20, 12, 23, 3, 3, 22, 24, 26)

df2 <- data.frame(ID_Position, Position, ID_Name, Name, jan, feb)

I tried inner and full joins, but neither seems to do what I want:

library(plyr)

test <- join(df1, df2, by = c("ID_Position", "ID_Name"), type = "inner", match = "all")

Desired output:

ID_Position  Position  ID_Name  Name     jan  feb
          1  A              11  Michael   20   20
          2  B              12  Tobi      40   60
          3  C              13  Chris     60   40
          4  D              14  Hans      44   24
          5  E              15  Likas     23   23
          6  H              16  Martin     2    6
          7  I              17  Seba      44   22
          8  J              18  Li        48   44
          9  X              19  Sha       52   48
         10  W              20  Susi      56   52

1 Answer:

Answer 0 (score: 2)

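Your desired output doesn't seem entirely correct, but here is an example of how to do this efficiently with a data.table binary join, which lets you run functions while joining by using the by = .EACHI option: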

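library(data.table)
# Convert both data frames to data.tables by reference and key them on the join columns
setkey(setDT(df1), ID_Position, ID_Name, Name)
setkey(setDT(df2), ID_Position, ID_Name, Name)
# For each row of df1, sum the matching df2 values (jan, feb) with the df1 values
# (i.jan, i.feb), ignoring NAs
df2[df1, .(jan = sum(jan, i.jan, na.rm = TRUE),
           feb = sum(feb, i.feb, na.rm = TRUE)),
    by = .EACHI]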
#     ID_Position ID_Name    Name jan feb
#  1:           1      11 Michael  20  20
#  2:           2      12    Tobi  40  60
#  3:           3      13   Chris  60  40
#  4:           4      14    Hans  44  24
#  5:           5      15   Likas  23  23
#  6:           6      16  Martin   2   6
#  7:           7      17    Seba  44   3
#  8:           8      18      Li  48  44
#  9:           9      19     Sha  52  48
# 10:          10      20    Susi  56  52
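
If you would rather not depend on extra packages, the same join-then-sum idea can be sketched in base R with merge() and rowSums(). This is only a rough sketch under a few assumptions: it rebuilds df1 and df2 from the question so it runs on its own, it joins on all four identifying columns, and the suffixes ".df1"/".df2" and the names keys, merged and result are made up for illustration. Rows are reordered explicitly rather than relying on the order merge() returns.

```r
# Rebuild the two example tables from the question
ids <- data.frame(
  ID_Position = 1:10,
  Position    = c("A", "B", "C", "D", "E", "H", "I", "J", "X", "W"),
  ID_Name     = 11:20,
  Name        = c("Michael", "Tobi", "Chris", "Hans", "Likas",
                  "Martin", "Seba", "Li", "Sha", "Susi")
)
df1 <- cbind(ids, jan = c(10, 20, 30, 22, 23, 2, 22, 24, 26, 28),
                  feb = c(10, 30, 20, 12, NA, 3, NA, 22, 24, 26))
df2 <- cbind(ids, jan = c(10, 20, 30, 22, NA, NA, 22, 24, 26, 28),
                  feb = c(10, 30, 20, 12, 23, 3, 3, 22, 24, 26))

# Inner join on the identifying columns; the month columns from each table
# get a suffix (illustrative names) so they can be told apart
keys   <- c("ID_Position", "Position", "ID_Name", "Name")
merged <- merge(df1, df2, by = keys, suffixes = c(".df1", ".df2"))

# Add the paired month columns row by row, treating NA as missing
result <- data.frame(
  merged[keys],
  jan = rowSums(merged[c("jan.df1", "jan.df2")], na.rm = TRUE),
  feb = rowSums(merged[c("feb.df1", "feb.df2")], na.rm = TRUE)
)

# Restore the original row order explicitly
result <- result[order(result$ID_Position), ]
rownames(result) <- NULL
print(result)
```

As with sum(..., na.rm = TRUE) in the data.table version, a month that is NA in both tables for the same key comes out as 0 rather than NA, so check whether that is what you want.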