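I am currently working with 48 currencies in R and want to merge, or rather overlay, two datasets.
For all currencies I have data from 1980 to 2017 in DF1. For 16 of them I also have data from 1970/1971 to 2017 in DF2. What I want to do is put the 1970-1980 part of DF2 on top of DF1. I think I would get the same result if I managed to merge DF1 into DF2 so that DF1 overrides DF2 in every cell where both have a value. However, the start dates are not exactly the same for all currencies, so I cannot simply hard-code it.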
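Here is an example to show what it looks like. Date is the time variable (monthly data). DF1 corresponds to the 1980-2017 data and DF2 to the 1970-2017 data. My goal is to overwrite the values of DF2 with those of DF1, matching on column ID and row ID. DF3 is the desired output.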
DF1=data.frame(matrix(data=c(c(4:9),rnorm(12)),6,3))
DF2=data.frame(matrix(data=c(c(1:12),rnorm(36)),12,4))
names(DF1)=c("Date","Currency1","Currency3")
names(DF2)=c("Date","Currency1","Currency2","Currency3")
> DF1
Date Currency1 Currency3
1 4 -0.8356286 0.5757814
2 5 1.5952808 -0.3053884
3 6 0.3295078 1.5117812
4 7 -0.8204684 0.3898432
5 8 0.4874291 -0.6212406
6 9 0.7383247 -2.2146999
> DF2
Date Currency1 Currency2 Currency3
1 1 1.12493092 -0.15579551 1.1000254
2 2 -0.04493361 -1.47075238 0.7631757
3 3 -0.01619026 -0.47815006 -0.1645236
4 4 0.94383621 0.41794156 -0.2533617
5 5 0.82122120 1.35867955 0.6969634
6 6 0.59390132 -0.10278773 0.5566632
7 7 0.91897737 0.38767161 -0.6887557
8 8 0.78213630 -0.05380504 -0.7074952
9 9 0.07456498 -1.37705956 0.3645820
10 10 -1.98935170 -0.41499456 0.7685329
11 11 0.61982575 -0.39428995 -0.1123462
12 12 -0.05612874 -0.05931340 0.8811077
> DF3
Date Currency1 Currency2 Currency3
1 1 1.12493092 -0.15579551 1.1000254
2 2 -0.04493361 -1.47075238 0.7631757
3 3 -0.01619026 -0.47815006 -0.1645236
4 4 -0.83562861 0.41794156 0.5757814
5 5 1.59528080 1.35867955 -0.3053884
6 6 0.32950777 -0.10278773 1.5117812
7 7 -0.82046838 0.38767161 0.3898432
8 8 0.48742905 -0.05380504 -0.6212406
9 9 0.73832471 -1.37705956 -2.2146999
10 10 -1.98935170 -0.41499456 0.7685329
11 11 0.61982575 -0.39428995 -0.1123462
12 12 -0.05612874 -0.05931340 0.8811077
Answer 0 (score: 2)
We can use a join with data.table:
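library(data.table)
# Work on a copy so that DF2 itself is left unchanged
DF3 <- copy(DF2)
# Value columns of DF1 (everything except Date)
nm1 <- names(DF1)[-1]
# Update join on Date: overwrite these columns in DF3 with DF1's values (the i.* columns)
setDT(DF3)[DF1, (nm1) := mget(paste0("i.", nm1)), on = .(Date)]
DF3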
# Date Currency1 Currency2 Currency3
# 1: 1 1.12493092 -0.15579551 1.1000254
# 2: 2 -0.04493361 -1.47075238 0.7631757
# 3: 3 -0.01619026 -0.47815006 -0.1645236
# 4: 4 -0.83562860 0.41794156 0.5757814
# 5: 5 1.59528080 1.35867955 -0.3053884
# 6: 6 0.32950780 -0.10278773 1.5117812
# 7: 7 -0.82046840 0.38767161 0.3898432
# 8: 8 0.48742910 -0.05380504 -0.6212406
# 9: 9 0.73832470 -1.37705956 -2.2146999
#10: 10 -1.98935170 -0.41499456 0.7685329
#11: 11 0.61982575 -0.39428995 -0.1123462
#12: 12 -0.05612874 -0.05931340 0.8811077
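For readers without data.table, the same idea can be sketched in base R. This is only an illustration under assumptions: the data frames are rebuilt here with named columns and a fixed seed, and the two NA values in DF1 are a hypothetical stand-in for a currency whose series starts later. The loop overwrites a cell of DF2 only where DF1 actually has a value for that Date and currency, so differing start dates do not need to be hard-coded.

```r
# Illustrative data (assumed layout: one Date column plus one column per currency)
set.seed(1)
DF1 <- data.frame(Date = 4:9, Currency1 = rnorm(6), Currency3 = rnorm(6))
DF2 <- data.frame(Date = 1:12, Currency1 = rnorm(12),
                  Currency2 = rnorm(12), Currency3 = rnorm(12))

# Hypothetical: Currency3 starts two periods later than Currency1 in DF1
DF1$Currency3[1:2] <- NA

DF3 <- DF2

# For each row of DF1, the matching row of DF3 (NA if the Date is absent)
rows <- match(DF1$Date, DF3$Date)

for (col in setdiff(names(DF1), "Date")) {
  # Only overwrite where the Date exists in DF3 and DF1 has a non-missing value
  ok <- !is.na(rows) & !is.na(DF1[[col]])
  DF3[[col]][rows[ok]] <- DF1[[col]][ok]
}

DF3
```

Cells where DF1 is NA, and columns that exist only in DF2 (Currency2 here), keep their DF2 values.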
Answer 1 (score: 2)
You can also use plyr::ldply to force a row bind of the two data frames, then group_by and summarise_each by Date (both from the dplyr package):
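library(dplyr)
# Row-bind the two data frames; columns missing from DF1 are filled with NA
df <- plyr::ldply(list(DF1, DF2))
# Sum that ignores NA values
sums <- function(x) sum(x, na.rm = TRUE)
df %>% group_by(Date) %>% summarise_each(funs(sums))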
# A tibble: 12 × 4
Date Currency1 Currency3 Currency2
<dbl> <dbl> <dbl> <dbl>
1 1 -0.62124058 -0.3942900 0.61982575
2 2 -2.21469989 -0.0593134 -0.05612874
3 3 1.12493092 1.1000254 -0.15579551
4 4 -0.67138742 1.2506048 -1.47075238
5 5 0.16745306 0.5738011 -0.47815006
6 6 0.10820760 0.3224197 0.41794156
7 7 2.41650200 0.3915750 1.35867955
8 8 0.92340909 2.0684444 -0.10278773
9 9 0.09850899 -0.2989125 0.38767161
10 10 0.78213630 -0.7074952 -0.05380504
11 11 0.07456498 0.3645820 -1.37705956
12 12 -1.98935170 0.7685329 -0.41499456
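One caveat with summing: where DF1 and DF2 both have a value for the same Date and currency, the result is the sum of the two rather than DF1's value. In addition, summarise_each and funs are deprecated in recent dplyr releases. A possible variant of the same bind-then-summarise idea, assuming dplyr 1.0.0 or later, keeps the first non-missing value per Date instead. Because DF1 is bound first, its values take precedence. The helper first_non_na and the rebuilt example data are assumptions made for this sketch, not part of the original answer.

```r
library(dplyr)

# Illustrative data (assumed layout: one Date column plus one column per currency)
set.seed(42)
DF1 <- data.frame(Date = 4:9, Currency1 = rnorm(6), Currency3 = rnorm(6))
DF2 <- data.frame(Date = 1:12, Currency1 = rnorm(12),
                  Currency2 = rnorm(12), Currency3 = rnorm(12))

# Hypothetical helper: first non-missing value of x (NA if all are missing)
first_non_na <- function(x) x[!is.na(x)][1]

DF3 <- bind_rows(DF1, DF2) %>%            # DF1 rows come first, so they win
  group_by(Date) %>%
  summarise(across(everything(), first_non_na)) %>%
  select(all_of(names(DF2))) %>%          # restore DF2's column order
  as.data.frame()

DF3
```

If DF1 holds NA for a currency before its own start date, the DF2 value is kept for that cell, which matches the requirement that DF1 only overrides where both have data.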