如何使用mapply将函数应用于两个不同数据框中的两个不同列

时间:2014-12-22 04:26:35

标签: r function mapply

我试图在两个不同的数据帧中找到两列之间的重复项。在识别重复项之后,我想从复制所在的同一行中提取观察,但是从不同的列中提取,并将其插入到另一个数据框中。让我举个例子:

Table1:
tobecopied   B   Checkfordup   D
Copy1        2   dupchk1       5
Copy2        3   dupchk5       4
Copy3        4   dupchk4       K

Table2:
tobepastedinto   B   Checkfordup   D
                 5   dupchk1       L
                 6   dupchk2       M
                 7   dupchk4       3

所以代码运行后,表2将如下所示:

Updated Table2:

tobepastedinto   B   Checkfordup   D
Copy1            5   dupchk1       L
                 6   dupchk2       M
Copy3            7   dupchk4       3

我试图做的是创建一个执行此操作的函数,并在两个列中使用mapply。这是代码的样子:

             checknum <- function(x,y){
               if(y=x){
                 gsub(x,y,Table2$tobepastedinto)
               }
               else{""}
             }
            mapply(checknum,Table2$Checkfordup,Table1$Checkfordup)

该功能永远在R中运行,我很确定我做错了。有没有人有更好的解决方案,我正在尝试做什么?或者有更好的方法来使用mapply吗?

编辑: 这是小数据集。 NASET中没有数字。我想查看Numberset中的任何手机是否与NASET中的手机匹配,然后将相应的号码添加到NASET,即使名称不匹配:

 NASET:
 name     Number     mobile
 VAN                 678
 GEORGE              6564
 STEVEN              76787



Numberset:
 name     Number     mobile
 TEU      7          678
 GEGE     6          64
 VEN      5          87
 TETK     7          678

Updated NASET:
NASET:
 name     Number     mobile
 VAN      7          678
 GEORGE              6564
 STEVEN              76787

1 个答案:

答案 0 :(得分:1)

你可以尝试

df2$tobepasteinto <- df1$tobecopied[match(df2$Checkfordup, df1$Checkfordup)]
df2$tobepasteinto[is.na(df2$tobepasteinto)] <- ''

或者

df2$tobepasteinto <-  mapply(function(x,y,z) {indx <- match(x,y)
                          ifelse(is.na(indx), '', z[indx])},
               df2$Checkfordup, list(df1$Checkfordup),list(df1$tobecopied))

更新

  NASET$Number <- Numberset$Number[match(NASET$mobile, Numberset$mobile)]
  NASET$Number[is.na(NASET$Number)] <- ''
  NASET
  #    name Number mobile
  #1    VAN      7    678
  #2 GEORGE          6564
  #3 STEVEN         76787

或者

  NASET$Number <- mapply(function(x,y,z) {
                     indx <- match(x,y)
                   ifelse(is.na(indx), '', z[indx])},
             NASET$mobile, list(Numberset$mobile), list(Numberset$Number))

或者

  library(dplyr)
  left_join(NASET[,-2], unique(Numberset[2:3]), by='mobile')
  #   mobile   name Number
  #1    678    VAN      7
  #2   6564 GEORGE     NA
  #3  76787 STEVEN     NA

数据

df1 <-  structure(list(tobecopied = c("Copy1", "Copy2", "Copy3"), B = 2:4, 
Checkfordup = c("dupchk1", "dupchk5", "dupchk4"), D = c("5", 
"4", "K")), .Names = c("tobecopied", "B", "Checkfordup", 
"D"), class = "data.frame", row.names = c(NA, -3L))

 df2 <-  structure(list(tobepastedinto = c("", "", "", ""), B = 5:8,
  Checkfordup = c("dupchk1", "dupchk2", "dupchk4", "dupchk4"), 
  D = c("L", "M", "3", "5")), .Names = c("tobepastedinto", 
 "B", "Checkfordup", "D"), row.names = c(NA, -4L), class = "data.frame")

newdata

  NASET <- structure(list(name = c("VAN", "GEORGE", "STEVEN"), Number = c(NA, 
  NA, NA), mobile = c(678L, 6564L, 76787L)), .Names = c("name", 
  "Number", "mobile"), class = "data.frame", row.names = c(NA, -3L))

 Numberset <- structure(list(name = c("TEU", "GEGE", "VEN", "TETK"),
 Number = c(7L, 6L, 5L, 7L), mobile = c(678L, 64L, 87L, 678L)), .Names =
  c("name", "Number", "mobile"), class = "data.frame", row.names = c(NA, 
 -4L))