How to combine two data frames with different vector lengths in R

Time: 2014-03-09 10:41:12

Tags: r sorting merge

I have two data.frames that I want to combine. The first data.frame looks like this:

date1 <- c("2012-01-01","2012-01-02","2012-01-03","2012-01-04","2012-01-05","2012-01-01","2012-01-02","2012-01-03","2012-01-04","2012-01-05")
company1 <- c("A","A","A","A","A","B","B","B","B","B")
ret1 <- c(-0.01, -0.013, 0.02, 0.032, -0.002, 0.022, 0.012, 0.031, -0.018, -0.034)

mydf1 <- data.frame(date1, company1, ret1)
mydf1

#         date1 company1   ret1
# 1  2012-01-01        A -0.010
# 2  2012-01-02        A -0.013
# 3  2012-01-03        A  0.020
# 4  2012-01-04        A  0.032
# 5  2012-01-05        A -0.002
# 6  2012-01-01        B  0.022
# 7  2012-01-02        B  0.012
# 8  2012-01-03        B  0.031
# 9  2012-01-04        B -0.018
# 10 2012-01-05        B -0.034

The second data.frame looks like this:

date2 <- c("2012-01-02","2012-01-04","2012-01-05","2012-01-01","2012-01-04")
company2 <- c("A","A","A","B","B")
class2 <- c("p", "p", "x", "n", "x")

mydf2 <- data.frame(date2, company2, class2)
mydf2

#        date2 company2 class2
# 1 2012-01-02        A      p
# 2 2012-01-04        A      p
# 3 2012-01-05        A      x
# 4 2012-01-01        B      n
# 5 2012-01-04        B      x

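So the first and second columns of both data.frames hold the same kind of information: the date and the company name. Now I want to add the column "class2" to my first data.frame, with each class placed in the row whose date and company match. The new data.frame should look like this: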

#         date1 company1   ret1  class2
# 1  2012-01-01        A -0.010  
# 2  2012-01-02        A -0.013    p
# 3  2012-01-03        A  0.020
# 4  2012-01-04        A  0.032    p
# 5  2012-01-05        A -0.002    x
# 6  2012-01-01        B  0.022    n
# 7  2012-01-02        B  0.012    
# 8  2012-01-03        B  0.031
# 9  2012-01-04        B -0.018    x
# 10 2012-01-05        B -0.034

1 Answer:

Answer 0 (score: 0)

You tagged this question with merge - have you tried that function?

merge(mydf1, mydf2, by.x = c("date1", "company1"), 
      by.y = c("date2", "company2"), all.x = TRUE)
#         date1 company1   ret1 class2
# 1  2012-01-01        A -0.010   <NA>
# 2  2012-01-01        B  0.022      n
# 3  2012-01-02        A -0.013      p
# 4  2012-01-02        B  0.012   <NA>
# 5  2012-01-03        A  0.020   <NA>
# 6  2012-01-03        B  0.031   <NA>
# 7  2012-01-04        A  0.032      p
# 8  2012-01-04        B -0.018      x
# 9  2012-01-05        A -0.002      x
# 10 2012-01-05        B -0.034   <NA>
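The merged result marks rows without a match as NA, while the desired output in the question shows blanks. If blanks are really what is wanted, a minimal sketch could look like the following. It assumes class2 should end up as a plain character column; the name `merged` is only illustrative, and the data are rebuilt here so the snippet runs on its own.

```r
# Rebuild the example data from the question (same values, compact form)
mydf1 <- data.frame(
  date1    = rep(sprintf("2012-01-%02d", 1:5), times = 2),
  company1 = rep(c("A", "B"), each = 5),
  ret1     = c(-0.01, -0.013, 0.02, 0.032, -0.002,
               0.022, 0.012, 0.031, -0.018, -0.034)
)
mydf2 <- data.frame(
  date2    = c("2012-01-02", "2012-01-04", "2012-01-05", "2012-01-01", "2012-01-04"),
  company2 = c("A", "A", "A", "B", "B"),
  class2   = c("p", "p", "x", "n", "x")
)

# Left join: keep every row of mydf1, fill class2 where a match exists
merged <- merge(mydf1, mydf2, by.x = c("date1", "company1"),
                by.y = c("date2", "company2"), all.x = TRUE)

# Convert to character first (class2 may be a factor in older R versions),
# then replace the missing values with empty strings
merged$class2 <- as.character(merged$class2)
merged$class2[is.na(merged$class2)] <- ""
```

Keeping NA is usually the safer choice for further analysis; the empty strings are mainly useful for display.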

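merge sorts the result by the key columns by default. If you want to keep the row order you showed, you could add a column to sort by before merging: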

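# Record the original row positions before merging
mydf1$rn <- seq_len(nrow(mydf1))
out <- merge(mydf1, mydf2, by.x = c("date1", "company1"), 
             by.y = c("date2", "company2"), all.x = TRUE)
# Restore the original row order using the helper column
out[order(out$rn), ]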
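As an alternative to merging and re-sorting, the class can be looked up directly with match(), which leaves the rows of mydf1 exactly where they are. This is only a sketch under two assumptions: each date/company pair occurs at most once in mydf2 (match() returns the first hit only), and the separator "|" does not appear in the dates or company names. The names `key1` and `key2` are illustrative.

```r
# Rebuild the example data from the question (same values, compact form)
mydf1 <- data.frame(
  date1    = rep(sprintf("2012-01-%02d", 1:5), times = 2),
  company1 = rep(c("A", "B"), each = 5),
  ret1     = c(-0.01, -0.013, 0.02, 0.032, -0.002,
               0.022, 0.012, 0.031, -0.018, -0.034),
  stringsAsFactors = FALSE
)
mydf2 <- data.frame(
  date2    = c("2012-01-02", "2012-01-04", "2012-01-05", "2012-01-01", "2012-01-04"),
  company2 = c("A", "A", "A", "B", "B"),
  class2   = c("p", "p", "x", "n", "x"),
  stringsAsFactors = FALSE
)

# Build one lookup key per row from date and company
key1 <- paste(mydf1$date1, mydf1$company1, sep = "|")
key2 <- paste(mydf2$date2, mydf2$company2, sep = "|")

# match() gives, for each row of mydf1, the matching row of mydf2 (or NA);
# indexing class2 with it yields NA where there is no match
mydf1$class2 <- mydf2$class2[match(key1, key2)]
mydf1
```

Because nothing is reordered, no helper column is needed; the merge approach remains the more general one when mydf2 may contain several rows per key.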