In R

Time: 2017-04-14 07:24:14

Tags: r dataframe merge override

I am currently working with 48 currencies in R and want to merge or override two data sets.

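For all currencies, I have data from 1980 to 2017 in DF1. For 16 of them, I also have data from 1970/1971 to 2017 in DF2. What I want to do is put the 1970-1980 part of DF2 on top of DF1. I think I would get the same result by merging DF1 into DF2 so that, in every cell where both have a value, DF1 overrides DF2. However, the start dates are not exactly the same for all currencies, so I can't just hard-code the row ranges.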

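Here is an example to show what it looks like. Date is the time variable (monthly data). DF1 corresponds to the 1980-2017 data and DF2 to the 1970-2017 data. My goal is to override the values in DF2 with those from DF1, matching on column ID and row ID. DF3 is the desired output.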

DF1=data.frame(matrix(data=c(c(4:9),rnorm(12)),6,3))
DF2=data.frame(matrix(data=c(c(1:12),rnorm(36)),12,4))
names(DF1)=c("Date","Currency1","Currency3")
names(DF2)=c("Date","Currency1","Currency2","Currency3")

> DF1
  Date  Currency1  Currency3
1    4 -0.8356286  0.5757814
2    5  1.5952808 -0.3053884
3    6  0.3295078  1.5117812
4    7 -0.8204684  0.3898432
5    8  0.4874291 -0.6212406
6    9  0.7383247 -2.2146999

> DF2
   Date   Currency1   Currency2  Currency3
1     1  1.12493092 -0.15579551  1.1000254
2     2 -0.04493361 -1.47075238  0.7631757
3     3 -0.01619026 -0.47815006 -0.1645236
4     4  0.94383621  0.41794156 -0.2533617
5     5  0.82122120  1.35867955  0.6969634
6     6  0.59390132 -0.10278773  0.5566632
7     7  0.91897737  0.38767161 -0.6887557
8     8  0.78213630 -0.05380504 -0.7074952
9     9  0.07456498 -1.37705956  0.3645820
10   10 -1.98935170 -0.41499456  0.7685329
11   11  0.61982575 -0.39428995 -0.1123462
12   12 -0.05612874 -0.05931340  0.8811077

> DF3
   Date   Currency1   Currency2  Currency3
1     1  1.12493092 -0.15579551  1.1000254
2     2 -0.04493361 -1.47075238  0.7631757
3     3 -0.01619026 -0.47815006 -0.1645236
4     4 -0.83562861  0.41794156  0.5757814
5     5  1.59528080  1.35867955 -0.3053884
6     6  0.32950777 -0.10278773  1.5117812
7     7 -0.82046838  0.38767161  0.3898432
8     8  0.48742905 -0.05380504 -0.6212406
9     9  0.73832471 -1.37705956 -2.2146999
10   10 -1.98935170 -0.41499456  0.7685329
11   11  0.61982575 -0.39428995 -0.1123462
12   12 -0.05612874 -0.05931340  0.8811077

2 answers:

Answer 0: (score: 2)

We can use a join from data.table:

library(data.table)
DF3 <- copy(DF2)
nm1 <- names(DF1)[-1]
setDT(DF3)[DF1, (nm1) := mget(paste0("i.", nm1)), on = .(Date)]
DF3
#     Date   Currency1   Currency2  Currency3
# 1:    1  1.12493092 -0.15579551  1.1000254
# 2:    2 -0.04493361 -1.47075238  0.7631757
# 3:    3 -0.01619026 -0.47815006 -0.1645236
# 4:    4 -0.83562860  0.41794156  0.5757814
# 5:    5  1.59528080  1.35867955 -0.3053884
# 6:    6  0.32950780 -0.10278773  1.5117812
# 7:    7 -0.82046840  0.38767161  0.3898432
# 8:    8  0.48742910 -0.05380504 -0.6212406
# 9:    9  0.73832470 -1.37705956 -2.2146999
#10:   10 -1.98935170 -0.41499456  0.7685329
#11:   11  0.61982575 -0.39428995 -0.1123462
#12:   12 -0.05612874 -0.05931340  0.8811077
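
For comparison, the same update-on-match idea can be sketched in base R with match(). This is only an illustration under some assumptions: the data below is made up (same layout as the question, but with integer dates and a different random seed), and the leading NA values in Currency3 are added here to mimic a currency that starts later in DF1. The names rows, shared and keep are hypothetical helpers, not part of the answer above.

```r
# Illustrative data with the same layout as the question (values are made up)
set.seed(42)
DF1 <- data.frame(Date = 4:9, Currency1 = rnorm(6), Currency3 = rnorm(6))
DF2 <- data.frame(Date = 1:12, Currency1 = rnorm(12),
                  Currency2 = rnorm(12), Currency3 = rnorm(12))

# Assumption: Currency3 only starts at Date 6 in DF1
DF1$Currency3[1:2] <- NA

DF3 <- DF2
# Row of DF3 that corresponds to each DF1 date (NA if the date is absent)
rows <- match(DF1$Date, DF3$Date)
# Value columns present in both data frames
shared <- setdiff(intersect(names(DF1), names(DF3)), "Date")

for (col in shared) {
  # Overwrite only where DF1 has a value and the date exists in DF3
  keep <- !is.na(DF1[[col]]) & !is.na(rows)
  DF3[[col]][rows[keep]] <- DF1[[col]][keep]
}
DF3
```

Because the rows are looked up by Date rather than by position, nothing has to be hard-coded per currency, and cells where DF1 is missing keep their DF2 value.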

Answer 1: (score: 2)

You could also force a row bind of the two data frames with plyr::ldply, and then use group_by and summarise_each by Date (from the dplyr package):

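library(plyr); library(dplyr)  # plyr first, so dplyr's verbs are not masked
df <- ldply(list(DF1, DF2))  # row-bind; columns missing from DF1 become NA
# Note: where both frames have a row for a Date, the values are summed (DF1 + DF2)
sums <- function(x) sum(x, na.rm = TRUE)
df %>% group_by(Date) %>% summarise_each(funs(sums))  # summarise_each()/funs() are deprecated in later dplyr releases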

# A tibble: 12 × 4
    Date   Currency1  Currency3   Currency2
   <dbl>       <dbl>      <dbl>       <dbl>
1      1 -0.62124058 -0.3942900  0.61982575
2      2 -2.21469989 -0.0593134 -0.05612874
3      3  1.12493092  1.1000254 -0.15579551
4      4 -0.67138742  1.2506048 -1.47075238
5      5  0.16745306  0.5738011 -0.47815006
6      6  0.10820760  0.3224197  0.41794156
7      7  2.41650200  0.3915750  1.35867955
8      8  0.92340909  2.0684444 -0.10278773
9      9  0.09850899 -0.2989125  0.38767161
10    10  0.78213630 -0.7074952 -0.05380504
11    11  0.07456498  0.3645820 -1.37705956
12    12 -1.98935170  0.7685329 -0.41499456
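
If DF1 should take precedence in the overlapping cells instead of being added to DF2, one possible variation of the bind-then-aggregate idea is to stack DF1 on top of DF2 and keep the first non-missing value per Date. The sketch below uses base R only; the data is made up, and first_non_na, DF1_full and stacked are hypothetical names introduced for illustration.

```r
# Illustrative data with the same layout as the question (values are made up)
set.seed(42)
DF1 <- data.frame(Date = 4:9, Currency1 = rnorm(6), Currency3 = rnorm(6))
DF2 <- data.frame(Date = 1:12, Currency1 = rnorm(12),
                  Currency2 = rnorm(12), Currency3 = rnorm(12))

# Hypothetical helper: first non-missing value, or NA if there is none
first_non_na <- function(x) x[!is.na(x)][1]

# Give DF1 the columns it lacks so the two frames can be stacked
DF1_full <- DF1
DF1_full[setdiff(names(DF2), names(DF1))] <- NA_real_

# DF1 rows come first, so its values win wherever they are present
stacked <- rbind(DF1_full[names(DF2)], DF2)

# na.pass keeps rows containing NA; the default (na.omit) would drop them
DF3 <- aggregate(. ~ Date, data = stacked, FUN = first_non_na,
                 na.action = na.pass)
DF3
```

The order of the stacked rows is what decides which data frame wins, so swapping the arguments to rbind() would make DF2 override DF1 instead.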