Join five tables when three variables match

Time: 2015-11-18 00:56:23

Tags: r

Cost

            Name  Class     Status   Cost
      Page, Lisa     11  Full Time  54550
      Page, Lisa     10   Contract  26795
  Taylor, Hector      7  Full Time  42540
Dawson, Jonathan     11  Full Time  35680
Dawson, Jonathan      6  Full Time  72830
Dawson, Jonathan      5   Contract  60830
     Pratt, Erik      8  Full Time  83000

Subjects

            Name  Class     Status  Subjects
      Page, Lisa     11  Full Time     Maths
      Page, Lisa     10   Contract   Science
  Taylor, Hector      7  Full Time   Science
Dawson, Jonathan     11  Full Time   English
Dawson, Jonathan      6  Full Time     Maths
Dawson, Jonathan      5   Contract     Maths
     Pratt, Erik      8  Full Time  Hinduism

ComputerNo

            Name  Class     Status  ComputerNo
      Page, Lisa     11  Full Time      115005
      Page, Lisa     10   Contract      450005
  Taylor, Hector      7  Full Time      380025
Dawson, Jonathan     11  Full Time      152253
Dawson, Jonathan      6  Full Time      125523
Dawson, Jonathan      5   Contract      485125

LicenseNo

            Name  Class     Status  LicenseNo
      Page, Lisa     11  Full Time   HJ452632
      Page, Lisa     10   Contract   HJ452634
  Taylor, Hector      7  Full Time   HJ352236
Dawson, Jonathan     11  Full Time   HJ456236
Dawson, Jonathan      6  Full Time   HJ456230
Dawson, Jonathan      5   Contract   HJ456232
     Pratt, Erik      8  Full Time   HJ130055

Country

            Name  Class     Status    Country
      Page, Lisa     11  Full Time  Hong Kong
      Page, Lisa     10   Contract  Hong Kong
  Taylor, Hector      7  Full Time         UK
Dawson, Jonathan     11  Full Time        USA
Dawson, Jonathan      6  Full Time        USA
Dawson, Jonathan      5   Contract        USA
     Pratt, Erik      8  Full Time      Japan

The result table I expect, CombinedDataSet, looks like this:

            Name  Class     Status   Cost  Subjects  ComputerNo  LicenseNo    Country
      Page, Lisa     11  Full Time  54550     Maths      115005   HJ452632  Hong Kong
      Page, Lisa     10   Contract  26795   Science      450005   HJ452634  Hong Kong
  Taylor, Hector      7  Full Time  42540   Science      380025   HJ352236         UK
Dawson, Jonathan     11  Full Time  35680   English      152253   HJ456236        USA
Dawson, Jonathan      6  Full Time  72830     Maths      125523   HJ456230        USA
Dawson, Jonathan      5   Contract  60830     Maths      485125   HJ456232        USA
     Pratt, Erik      8  Full Time  83000  Hinduism        -NA-   HJ130055      Japan

As shown above, I have five data tables, and I want to create a single dataset by joining them.

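I want to match on three variables (Name, Class, and Status) across all the data tables and then join them. If a particular table has no row that meets the criteria, I want that to be visible in the final table (as a blank or marked "-NA-").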

2 Answers:

Answer 0: (score: 1)

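Use the base R merge() function, list the join columns in the by argument, and specify all=TRUE to keep unmatched records from both the left and right tables (cells with no match are filled with NA):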

# Full outer join on the three key columns, one table at a time
finaldf <- merge(cost, subject, by=c("Name", "Class", "Status"), all=TRUE)
finaldf <- merge(finaldf, computerNo, by=c("Name", "Class", "Status"), all=TRUE)
finaldf <- merge(finaldf, licenseNo, by=c("Name", "Class", "Status"), all=TRUE)
finaldf <- merge(finaldf, country, by=c("Name", "Class", "Status"), all=TRUE)
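To see how the missing ComputerNo row for "Pratt, Erik" behaves, here is a minimal sketch using trimmed copies of two of the question's tables. The three-row data frames and the "-NA-" relabelling step are assumptions made for illustration. They are not part of the answer above, which leaves unmatched cells as plain NA.

```r
# Key columns shared by every table in the question.
keys <- c("Name", "Class", "Status")

# Two of the five tables, rebuilt from the question's data (trimmed for brevity).
cost <- data.frame(
  Name   = c("Page, Lisa", "Taylor, Hector", "Pratt, Erik"),
  Class  = c(11, 7, 8),
  Status = c("Full Time", "Full Time", "Full Time"),
  Cost   = c(54550, 42540, 83000),
  stringsAsFactors = FALSE
)
computerNo <- data.frame(
  Name       = c("Page, Lisa", "Taylor, Hector"),
  Class      = c(11, 7),
  Status     = c("Full Time", "Full Time"),
  ComputerNo = c(115005, 380025),
  stringsAsFactors = FALSE
)

# Full outer join: "Pratt, Erik" has no ComputerNo row, so that cell becomes NA.
merged <- merge(cost, computerNo, by = keys, all = TRUE)

# Optional display step (an assumption, not in the answer): label the gaps "-NA-".
# This turns ComputerNo into a character column, so apply it only for presentation.
merged$ComputerNo <- ifelse(is.na(merged$ComputerNo), "-NA-",
                            as.character(merged$ComputerNo))
print(merged)
```

Keeping NA in the working data and relabelling only when printing or exporting is usually the safer choice, because NA still works with is.na() and numeric summaries.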

Answer 1: (score: 1)

You can use Reduce to do it all in one step:
Reduce(function(x, y) merge(x, y, all = TRUE, 
    by = c("Name", "Class", "Status")), list(cost, subject, computerNo, licenseNo, country))
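The Reduce call folds merge() over the list from left to right, so it should give the same result as chaining the merges by hand. The sketch below checks this on three cut-down tables. The two-row data frames and the names `tables` and `chained` are hypothetical, and the result is assigned to CombinedDataSet only because that is the name used in the question.

```r
keys <- c("Name", "Class", "Status")

# Cut-down versions of three of the question's tables.
cost <- data.frame(
  Name = c("Page, Lisa", "Pratt, Erik"), Class = c(11, 8),
  Status = c("Full Time", "Full Time"), Cost = c(54550, 83000),
  stringsAsFactors = FALSE
)
subject <- data.frame(
  Name = c("Page, Lisa", "Pratt, Erik"), Class = c(11, 8),
  Status = c("Full Time", "Full Time"), Subjects = c("Maths", "Hinduism"),
  stringsAsFactors = FALSE
)
computerNo <- data.frame(
  Name = "Page, Lisa", Class = 11,
  Status = "Full Time", ComputerNo = 115005,
  stringsAsFactors = FALSE
)

# Fold a full outer join over the list of tables, left to right.
tables <- list(cost, subject, computerNo)
CombinedDataSet <- Reduce(
  function(x, y) merge(x, y, by = keys, all = TRUE),
  tables
)

# The same joins written out by hand, for comparison.
chained <- merge(merge(cost, subject, by = keys, all = TRUE),
                 computerNo, by = keys, all = TRUE)

print(CombinedDataSet)
print(identical(CombinedDataSet, chained))
```

The Reduce form scales to any number of tables without repeating the by and all arguments, as long as every table carries the same three key columns.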