Question

I have two files, each with 3 columns and a few rows.

1    2    10
2    3    20
3    4    30
4    5    40
5    1    50
6    1    60

and

1    8    10
2    3    100
3    4    45
4    5    78
5    2    99
6    80   60

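Now I want to create a third file that contains all the values from the first two files. If the first and second columns are the same in both files, then the value in the third column of the first file must go in the third column of the new file, and the value in the third column of the second file must go in the fourth column of the new file. Based on the example above, the answer should be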

1  2  10  0
2  3  20  100
3  4  30  45
4  5  40  78
1  8  10   0
5  1  50   0
6  1  60   0
5  2  99   0
6  80 60   0

Answer 1

It would be easier if you posted your example using dput(). I would check whether ?merge helps, or rbind.fill (package plyr). Hope this helps, Herman
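As a minimal sketch of the dput() suggestion: the data frame below is rebuilt by hand from the first table in the question (the column names V1, V2, V3 are an assumption, chosen to match what read.table would assign by default). dput() prints R code that recreates the object, which can be pasted straight into a post.

```r
# Hand-built copy of the first table in the question.
# Column names V1..V3 are assumed (read.table's defaults for headerless input).
dat1 <- data.frame(V1 = 1:6,
                   V2 = c(2, 3, 4, 5, 1, 1),
                   V3 = c(10, 20, 30, 40, 50, 60))

# Print a text representation that recreates dat1 when evaluated.
dput(dat1)
```

Anyone answering can then assign the printed structure(...) call to a variable and work with exactly the same data, including column types.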

Answer 2

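 # Full outer join on the first two columns; unmatched rows get NA
 res <- merge(dat1, dat2, by = c("V1", "V2"), all = TRUE)
 # Rows that exist only in dat2 have NA in column 3
 indx <- is.na(res[, 3])
 # Move their value from column 4 into column 3
 res[indx, 3] <- res[indx, 4]
 res[indx, 4] <- NA
 # Replace every remaining NA with 0
 res[is.na(res)] <- 0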
 #    V1 V2 V3.x V3.y
 #1  1  2   10    0
 #2  1  8   10    0
 #3  2  3   20  100
 #4  3  4   30   45
 #5  4  5   40   78
 #6  5  1   50    0
 #7  5  2   99    0
 #8  6  1   60    0
 #9  6 80   60    0

Data

 dat1 <- structure(list(V1 = structure(1:6, .Label = c("1", "2", "3", 
 "4", "5", "6"), class = "factor"), V2 = structure(c(2L, 3L, 4L, 
 5L, 1L, 1L), .Label = c("1", "2", "3", "4", "5"), class = "factor"), 
V3 = structure(1:6, .Label = c("10", "20", "30", "40", "50", 
"60"), class = "factor")), .Names = c("V1", "V2", "V3"), class = "data.frame", row.names = c(NA, 
-6L))

 dat2 <- structure(list(V1 = structure(1:6, .Label = c("1", "2", "3", 
 "4", "5", "6"), class = "factor"), V2 = structure(c(5L, 2L, 3L, 
 4L, 1L, 6L), .Label = c("2", "3", "4", "5", "8", "80"), class = "factor"), 
V3 = structure(c(1L, 2L, 3L, 5L, 6L, 4L), .Label = c("10", 
"100", "45", "60", "78", "99"), class = "factor")), .Names = c("V1", 
"V2", "V3"), class = "data.frame", row.names = c(NA, -6L))

Before trying the code above, convert the data columns to numeric class:

dat1[] <- lapply(dat1, function(x) as.numeric(as.character(x))) 
dat2[] <- lapply(dat2, function(x) as.numeric(as.character(x)))
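Since the question is about files, here is a minimal end-to-end sketch under some assumptions: the two tables are read from inline text as stand-ins for the real files, and the output path comes from tempfile(), which is only a placeholder. Reading with read.table gives numeric columns directly, so no factor conversion is needed in this variant.

```r
# Inline text stands in for the two input files (assumption: whitespace-separated, no header).
dat1 <- read.table(text = "
1 2 10
2 3 20
3 4 30
4 5 40
5 1 50
6 1 60")
dat2 <- read.table(text = "
1 8 10
2 3 100
3 4 45
4 5 78
5 2 99
6 80 60")

# Full outer join on the first two columns (named V1 and V2 by read.table).
res <- merge(dat1, dat2, by = c("V1", "V2"), all = TRUE)

# Rows found only in the second table: move their value into column 3.
only2 <- is.na(res[, 3])
res[only2, 3] <- res[only2, 4]
res[only2, 4] <- NA

# Fill the remaining gaps with 0.
res[is.na(res)] <- 0

# Write the third file without header or row names; the path is a placeholder.
out <- tempfile(fileext = ".txt")
write.table(res, out, row.names = FALSE, col.names = FALSE)
```

With real files, the read.table(text = ...) calls would be replaced by read.table("yourfile.txt"), and out by the desired output path. Note that merge sorts rows by the key columns, so the row order differs from the order listed in the question.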

Merge two files into a new file
