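I want to merge two data frames; however, I only want to keep one date column. df1 contains the months from 2013-01-01 to 2016-10-01. df2 contains how often an event occurred. If no event occurred in a given month, df2 has no row for that month.
df1 <- data.frame(date = seq(as.Date("2013-01-01"), as.Date("2016-10-01"), by = "month"))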
df1
date Freq
1 2013-01-01 0
2 2013-02-01 0
3 2013-03-01 0
4 2013-04-01 0
5 2013-05-01 0
...
df2
date Freq
1 2013-03-01 1
2 2013-08-01 2
3 2014-04-01 5
4 2014-05-01 2
5 2014-06-01 5
...
I would like the new data frame to look like the following.
date Freq
1 2013-01-01 0
2 2013-02-01 0
3 2013-03-01 1
4 2013-04-01 0
5 2013-05-01 0
6 2013-06-01 0
7 2013-07-01 0
8 2013-08-01 2
9 2013-09-01 0
...
Answer 0 (score: 0)
You can use merge with all.x=TRUE, then set the NA values produced by the merge to zero:
out <- merge(df1,df2,all.x=TRUE)
out[is.na(out)] <- 0
head(out,10)
## date Freq
##1 2013-01-01 0
##2 2013-02-01 0
##3 2013-03-01 1
##4 2013-04-01 0
##5 2013-05-01 0
##6 2013-06-01 0
##7 2013-07-01 0
##8 2013-08-01 2
##9 2013-09-01 0
##10 2013-10-01 0
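A minimal self-contained sketch of the same merge-and-fill step, assuming df1 holds only a date column and df2 holds date and Freq. It names the join column explicitly with by = "date" and fills NA only in Freq, so any other columns would be left untouched. The five df2 rows are the sample rows shown in the question.

```r
# Monthly calendar: one row per month from 2013-01 through 2016-10
df1 <- data.frame(date = seq(as.Date("2013-01-01"), as.Date("2016-10-01"), by = "month"))

# Event counts for the months that had events (sample rows from the question)
df2 <- data.frame(
  date = as.Date(c("2013-03-01", "2013-08-01", "2014-04-01", "2014-05-01", "2014-06-01")),
  Freq = c(1L, 2L, 5L, 2L, 5L)
)

# Left join: keep every month in df1, attach Freq where df2 has a match
out <- merge(df1, df2, by = "date", all.x = TRUE)

# Months without events come back as NA; replace them with zero in Freq only
out$Freq[is.na(out$Freq)] <- 0L

head(out, 10)
```

Restricting the replacement to out$Freq is a design choice: out[is.na(out)] <- 0 is shorter, but it would also overwrite NA values in any other column the merged frame might have.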
Data: df1 created as in the OP:
df1 <- data.frame(date=seq(as.Date("2013-01-01"), as.Date("2016-10-01"), by="month"))
df1 <- structure(list(date = structure(c(15706, 15737, 15765, 15796,
15826, 15857, 15887, 15918, 15949, 15979, 16010, 16040, 16071,
16102, 16130, 16161, 16191, 16222, 16252, 16283, 16314, 16344,
16375, 16405, 16436, 16467, 16495, 16526, 16556, 16587, 16617,
16648, 16679, 16709, 16740, 16770, 16801, 16832, 16861, 16892,
16922, 16953, 16983, 17014, 17045, 17075), class = "Date")), .Names = "date", row.names = c(NA,
-46L), class = "data.frame")
## date
##1 2013-01-01
##2 2013-02-01
##3 2013-03-01
##4 2013-04-01
##5 2013-05-01
## ...
##42 2016-06-01
##43 2016-07-01
##44 2016-08-01
##45 2016-09-01
##46 2016-10-01
df2 <- structure(list(date = structure(c(15765, 15918, 16161, 16191,
16222), class = "Date"), Freq = c(1L, 2L, 5L, 2L, 5L)), .Names = c("date",
"Freq"), row.names = c(NA, -5L), class = "data.frame")
## date Freq
##1 2013-03-01 1
##2 2013-08-01 2
##3 2014-04-01 5
##4 2014-05-01 2
##5 2014-06-01 5
Answer 1 (score: 0)
Using dplyr to do the join,
library(dplyr)
full_join(df1, df2) %>%
group_by(date) %>%
summarise(Freq = sum(Freq))
## # A tibble: 9 × 2
## date Freq
## <date> <int>
## 1 2013-01-01 0
## 2 2013-02-01 0
## 3 2013-03-01 1
## 4 2013-04-01 0
## 5 2013-05-01 0
## 6 2013-08-01 2
## 7 2014-04-01 5
## 8 2014-05-01 2
## 9 2014-06-01 5
Or the base R equivalent,
aggregate(Freq ~ date, merge(df1, df2, all = TRUE), sum)
## date Freq
## 1 2013-01-01 0
## 2 2013-02-01 0
## 3 2013-03-01 1
## 4 2013-04-01 0
## 5 2013-05-01 0
## 6 2013-08-01 2
## 7 2014-04-01 5
## 8 2014-05-01 2
## 9 2014-06-01 5
Sort the result afterwards if you like.
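Both snippets in this answer rely on df1 already carrying a Freq column of zeros, as printed in the question. With a date-only df1, the unmatched months would come out as NA from the sum, or be dropped by aggregate's default NA handling. A hedged alternative for that case is a plain lookup with match, sketched below. It assumes df2 has at most one row per date; the names idx and result are illustrative only.

```r
# Date-only calendar, as created in the first answer
df1 <- data.frame(date = seq(as.Date("2013-01-01"), as.Date("2016-10-01"), by = "month"))

# Sample event counts from the question
df2 <- data.frame(
  date = as.Date(c("2013-03-01", "2013-08-01", "2014-04-01", "2014-05-01", "2014-06-01")),
  Freq = c(1L, 2L, 5L, 2L, 5L)
)

# Position of each df1 month within df2 (NA when the month has no events)
idx <- match(df1$date, df2$date)

# Look up the count, falling back to 0 for months missing from df2
result <- df1
result$Freq <- ifelse(is.na(idx), 0L, df2$Freq[idx])

head(result, 10)
```

Because the lookup fills df1 row by row, the result keeps df1's row order and row count, so no sorting step is needed afterwards.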
Answer 2 (score: 0)
There is also a data.table way:
library(data.table)
#Create the data
set.seed(1234)
df1 <- data.table(date=seq(as.Date("2013-01-01"), as.Date("2016-10-01"), by="month"))
df2 <- data.table(date=sample(df1$date, size=10), freq=sample(1:10, 10, replace=TRUE))
#Set keys
setkey(df1, date)
setkey(df2, date)
#data.table magic
df1[df2, freq := freq ]
df1[!df2, freq := 0 ]
df1
Result:
date freq
1: 2013-01-01 3
2: 2013-02-01 0
3: 2013-03-01 0
4: 2013-04-01 0
5: 2013-05-01 0
6: 2013-06-01 7
7: 2013-07-01 0
...