Merging data frames

Time: 2014-06-04 11:42:23

Tags: r merge

I have two data frames like these:

all

      V1   V2 V3 V4 V5    V6   V7   V8   V9  V10    V11  V12  V13 V14 V15 V16 V17 V18 V19 V20 V21                V22
1  00013 1966  8  1  0 -30.8 35.1 9999 9999   23   1027   22 20.5   0 361   3  98   2   0   5 254 370154111900000000
2   0003 1966  8  1  0   -48   35 9999 9999 25.2 1022.8 25.1 24.1 6.2 270   6  98   2   1   3 128 391764111900000000
3  00034 1966  8  1  0 116.5   16 9999 9999 28.9 1006.8 26.7 24.6 9.3 150   8  98   2   8   5 230 380964111905000000
4   0007 1966  8  1  0 -35.5 52.7 9999 9999 13.1 1002.4   14   14 8.7 120   9  91  61   4   3 128 509044111903000000
5  00075 1966  8  1  0 -56.2 44.7 9999 9999 9999 1017.7 15.6 14.6 2.6 330   2  99   2   0   5 230 480034111905000000
6   0008 1966  8  1  0 -58.7 49.1 9999 9999 13.9 1010.5 15.4 12.4 9.8 250   1  98   1   0   2 128 498014111902000000
7   0010 1966  8  1  0 -67.4 42.7 9999 9999 16.7   1017 14.7 13.1 4.1  80   7  98   2   2   2 128 472724111902000000
8   0012 1966  8  1  0   -50 47.5 9999 9999 13.7 1008.8   14 12.9   1 340   1  98   2   0   2 128 490904111902000000
9   0013 1966  8  1  0 133.9 83.9 9999 9999 9999 1011.7    0    0 4.1 340   8  94  10   4   9 186 622334111909000001
10 00131 1966  8  1  0 -15.7   67 9999 9999 9999 1014.7  4.4 9999 4.6 340   7  99   2   4   5 230 563244111905000000

dup     
                   V1                 V8
1  391764111900000000 370154111900000000
2  391724111900000000 391784111900000000
3  388204111900000000 388434111900000000
4  384824111900000000 384654111900000000
5  380974111900000000 377374111905000000
6  377704111900000000 374044111900000000
7  370134111905000000 366534111900000000
8  381064111900000000 399064111900000000
9  370124111900000000 370134111900000000
10 373744111900000000 370154111900000000

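For the values in both columns of dup, I'd like to find the matching rows of all (with all of their V2-V21 values) and end up with a list like this: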

list[[1]]
    1   0003 1966  8  1  0   -48   35 9999 9999 25.2 1022.8 25.1 24.1 6.2 270   6  98   2   1   3 128 391764111900000000

    2  00013 1966  8  1  0 -30.8 35.1 9999 9999   23   1027   22 20.5   0 361   3  98   2   0   5 254 370154111900000000

list[[2]]

Can anyone help me?

Many thanks.

1 answer:

Answer 0 (score: 1)

match is handy for this kind of thing.

merged <- apply(dup, 1, function(x) all[match(x, all$V22), ])

merged[[1]] # just the first element of the output

#   V1   V2 V3 V4 V5    V6   V7   V8   V9  V10    V11  V12  V13 V14 V15
# 2  3 1966  8  1  0 -48.0 35.0 9999 9999 25.2 1022.8 25.1 24.1 6.2 270
# 1 13 1966  8  1  0 -30.8 35.1 9999 9999 23.0 1027.0 22.0 20.5 0.0 361
#   V16 V17 V18 V19 V20 V21                V22
# 2   6  98   2   1   3 128 391764111900000000
# 1   3  98   2   0   5 254 370154111900000000

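match finds, for each element of its first vector, the position of that element in its second vector. We can apply match over the first dimension (rows) of dup and subset all by the returned indices. This returns a list with nrow(dup) elements, each of which is a data.frame with two rows.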
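
To make the lookup easy to try in isolation, here is a minimal sketch on toy data. The names all_df and dup_df and the reduced column set are assumptions for illustration (the question's objects are called all and dup), and the last dup_df value is a made-up ID that matches nothing. The IDs are stored as character here, which is also an assumption: it keeps the leading zeros in V1 and avoids relying on 18-digit numbers being stored exactly.

```r
# Toy stand-ins for the question's data frames (hypothetical, reduced columns).
all_df <- data.frame(
  V1  = c("00013", "0003", "00034"),
  V2  = c(1966, 1966, 1966),
  V22 = c("370154111900000000", "391764111900000000", "380964111905000000"),
  stringsAsFactors = FALSE
)
dup_df <- data.frame(
  V1 = c("391764111900000000", "380964111905000000"),
  V8 = c("370154111900000000", "999999999999999999"),  # second ID is made up
  stringsAsFactors = FALSE
)

# match() gives, for each element of its first argument, the position of the
# first equal element in its second argument, or NA if there is none.
idx <- match(c("391764111900000000", "370154111900000000"), all_df$V22)

# Apply the same lookup to every row of dup_df. Each list element holds the
# rows of all_df whose V22 equals that row's V1 and V8, in that order.
merged <- apply(dup_df, 1, function(x) all_df[match(x, all_df$V22), ])

print(idx)
print(merged[[1]])
```

An ID with no counterpart in V22 produces a row of NA values instead of an error, so it may be worth checking for NA in the result before using it.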