Merging data from different tables

Time: 2013-09-29 01:35:23

Tags: r merge plyr

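I have two different tables, shown below. The first table holds section information and the second holds crash information. The first table lists every section; some sections have had no crash events, so the second table contains no crash records for them. The merged table needs to give the total number of crashes per section, with a zero inserted for any section that has no crashes.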

a <- structure(list(CSECT = c("001-01", "001-01", "001-01", "001-04", "001-01", "001-01", "001-02", "001-02", "001-03", "001-04"), 
From = c("0", "1", "3", "4", "5", "7", "8", "1", "2.2", "3.4"), 
To = c("1", "3", "4", "5", "6", "8", "9", "2.2", "3.4", "4.5")),
.Names = c("CSECT", "From", "To"), row.names = c(NA, -10L), class = "data.frame")
a

    CSECT From  To
1  001-01    0   1
2  001-01    1   3
3  001-01    3   4
4  001-04    4   5
5  001-01    5   6
6  001-01    7   8
7  001-02    8   9
8  001-02    1 2.2
9  001-03  2.2 3.4
10 001-04  3.4 4.5

b <- structure(list(CSECT = c("001-01", "001-01", "001-01", "001-01", "001-01", "001-01", "001-01", "001-01", "001-01", "001-01",
"001-02", "001-02","001-02","001-02","001-02","001-02","001-02"), 
From = c("0", "0", "0", "0", "1", "1", "3", "3", "3", "3", "8", "8", "8", "8","1", "1", "1"), 
To = c("1", "1", "1", "1", "3", "3", "4", "4", "4", "4", "9", "9", "9", "9", "2.2", "2.2", "2.2"), 
CrashID = c("3409", "3410", "6790", "1100", "1200", "5609", "6730", "1220", "1234", "1239",
"4409", "5610", "6794", "1123", "1245", "5634", "6732")),
.Names = c("CSECT", "From", "To", "CrashID"), row.names = c(NA, -17L), class = "data.frame")
b

    CSECT From  To CrashID
1  001-01    0   1    3409
2  001-01    0   1    3410
3  001-01    0   1    6790
4  001-01    0   1    1100
5  001-01    1   3    1200
6  001-01    1   3    5609
7  001-01    3   4    6730
8  001-01    3   4    1220
9  001-01    3   4    1234
10 001-01    3   4    1239
11 001-02    8   9    4409
12 001-02    8   9    5610
13 001-02    8   9    6794
14 001-02    8   9    1123
15 001-02    1 2.2    1245
16 001-02    1 2.2    5634
17 001-02    1 2.2    6732

I would like to merge them to get the following:

CSECT   From    To  Count
001-01    0     1       4
001-01    1     3       2
001-01    3     4       4
001-04    4     5       0
001-01    5     6       0
001-01    7     8       0
001-02    8     9       4
001-02    1     2.2     3
001-03  2.2     3.4     0
001-04  3.4     4.5     0

Thanks in advance.

1 answer:

Answer 0: (score: 1)

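 # Summarise the crashes in each (CSECT, From, To) group.
 # Note: Severity is not a column of the b posted above; the severity counts
 # assume a fuller version of the crash table that includes it.
 cc <- by( b[-(1:3)], b[1:3], function(x) list( 
    Total =length(x$CrashID),   
    Fatal =sum(x$Severity=="Fatal"),  
    Severe=sum(x$Severity=="Severe"),  
    PDO=sum(x$Severity=="PDO") ) ,
    simplify=FALSE)
 # Caution: cbind() pairs rows by position, so this step assumes that
 # unique(b[1:3]) lists the groups in the same order as by() returns them.
 dd <- cbind(unique(b[1:3]), do.call(rbind, cc) )
 ee <- merge(a, dd, all.x=TRUE)   # left join: keep every section in a
 ee[ is.na(ee) ] <- 0             # sections without crashes get zero
 ee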
#-----------

    CSECT From  To AADT Total Fatal Severe PDO
1  001-01    0   1 1100     4     1      0   3
2  001-01    1   3 1200     3     0      0   3
3  001-01    3   4  890     2     0      0   2
4  001-01    5   6 2000     0     0      0   0
5  001-01    7   8 5000     0     0      0   0
6  001-02    1 2.2 2000     4     0      0   4
7  001-02    8   9 6700     4     1      1   2
8  001-03  2.2 3.4 3000     0     0      0   0
9  001-04  3.4 4.5 1230     0     0      0   0
10 001-04    4   5 1000     0     0      0   0
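
The Total column above does not match the requested counts for every section (for example, 001-01 from 1 to 3 shows 3 where the question expects 2), which may be because by() and unique() do not list the groups in the same order. As a minimal sketch in base R, and not the answerer's method, the count can instead be computed with aggregate() and joined by key, so nothing depends on row position. The data frames are rebuilt here only to keep the example self-contained; the names keys, counts, a_idx, row_id and res are introduced for illustration.

```r
# Same rows as `a` and `b` in the question, rebuilt compactly.
a <- data.frame(
  CSECT = c("001-01", "001-01", "001-01", "001-04", "001-01",
            "001-01", "001-02", "001-02", "001-03", "001-04"),
  From  = c("0", "1", "3", "4", "5", "7", "8", "1", "2.2", "3.4"),
  To    = c("1", "3", "4", "5", "6", "8", "9", "2.2", "3.4", "4.5"),
  stringsAsFactors = FALSE
)
b <- data.frame(
  CSECT   = rep(c("001-01", "001-02"), times = c(10, 7)),
  From    = rep(c("0", "1", "3", "8", "1"), times = c(4, 2, 4, 4, 3)),
  To      = rep(c("1", "3", "4", "9", "2.2"), times = c(4, 2, 4, 4, 3)),
  CrashID = c("3409", "3410", "6790", "1100", "1200", "5609", "6730",
              "1220", "1234", "1239", "4409", "5610", "6794", "1123",
              "1245", "5634", "6732"),
  stringsAsFactors = FALSE
)

keys <- c("CSECT", "From", "To")

# One row per section that has crashes, holding the number of crash records.
counts <- aggregate(CrashID ~ CSECT + From + To, data = b, FUN = length)
names(counts)[names(counts) == "CrashID"] <- "Count"

# Remember the original row order of `a`, because merge() sorts by the keys.
a_idx <- cbind(a, row_id = seq_len(nrow(a)))

# Left join: every section in `a` is kept; sections without crashes get NA.
res <- merge(a_idx, counts, by = keys, all.x = TRUE)
res$Count[is.na(res$Count)] <- 0L

# Restore the order of `a` and drop the helper column.
res <- res[order(res$row_id), c(keys, "Count")]
rownames(res) <- NULL
res
```

Because the join is on CSECT, From and To, each count stays attached to its own section regardless of how either table is sorted. Extra per-section columns in `a`, such as the AADT column in the output above, would simply be carried through the merge if they were also listed in the final column selection.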