Look up an approximate date match in R

Time: 2017-10-09 10:56:29

Tags: r date dataframe

I have two R data frames

Factor = data.frame(date = c("2015-10-01", "2016-01-01", "2016-04-01", 
"2016-07-01", "2016-10-01", "2017-01-01"), factor = c(0.07606455, 
0.07170356, 0.07127930, 0.06807735, 0.06764824, 0.06709560))

Factor =
date       factor
2015-10-01 0.07606455
2016-01-01 0.07170356
2016-04-01 0.07127930
2016-07-01 0.06807735
2016-10-01 0.06764824
2017-01-01 0.06709560

Dates = data.frame(date = c("2016-01-01", "2016-01-28", "2016-01-29", 
"2016-03-01", "2016-06-02", "2016-07-03", "2016-10-04", "2016-10-05"))

Dates = 
date       
2016-01-01 
2016-01-28 
2016-01-29 
2016-03-01 
2016-06-02 
2016-07-03 
2016-10-04
2016-10-05

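I am looking for an Excel-style VLOOKUP that does an approximate match. Because the dates do not match exactly, I cannot use R's merge function. I also cannot match on the index as in the link, or use the minimum date difference as below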

apply(Dates, 1, function(x) min(which(abs(x - Factor$date) == min(abs(x - Factor$date))))) 

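because I need the factor from the Factor data frame whose date is less than or equal to the date in the Dates data frame. My desired output is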

Output = 
date       factor  
2016-01-01 0.07170356
2016-01-28 0.07170356
2016-01-29 0.07170356
2016-03-01 0.07170356
2016-06-02 0.07127930
2016-07-03 0.06807735
2016-10-04 0.06764824
2016-10-05 0.06764824

Is there an efficient way to achieve this result other than a loop?

2 answers:

Answer 0: (score: 2)

How about a data.table approach:

library(data.table)
setDT(Dates)[, date := as.IDate(date)]
setDT(Factor)[, date := as.IDate(date)]
Factor[Dates, on = "date", roll = Inf]
#          date     factor
# 1: 2016-01-01 0.07170356
# 2: 2016-01-28 0.07170356
# 3: 2016-01-29 0.07170356
# 4: 2016-03-01 0.07170356
# 5: 2016-06-02 0.07127930
# 6: 2016-07-03 0.06807735
# 7: 2016-10-04 0.06764824
# 8: 2016-10-05 0.06764824

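For each date in Dates, this matches the closest date in Factor that is earlier than or equal to it (a rolling join, via roll = Inf) and takes its factor.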
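For comparison, the same "last date less than or equal to" lookup can be sketched in base R with findInterval(). This is not the method from the answer above, only an alternative under stated assumptions: the dates are parsed with as.Date, Factor is sorted by date, and the name idx is purely illustrative.

```r
# Same sample data as in the question, with the dates parsed as Date
Factor <- data.frame(
  date = as.Date(c("2015-10-01", "2016-01-01", "2016-04-01",
                   "2016-07-01", "2016-10-01", "2017-01-01")),
  factor = c(0.07606455, 0.07170356, 0.07127930,
             0.06807735, 0.06764824, 0.06709560)
)
Dates <- data.frame(
  date = as.Date(c("2016-01-01", "2016-01-28", "2016-01-29", "2016-03-01",
                   "2016-06-02", "2016-07-03", "2016-10-04", "2016-10-05"))
)

# findInterval() needs the lookup dates in non-decreasing order
Factor <- Factor[order(Factor$date), ]

# Index of the last Factor date that is <= each lookup date (0 if there is none)
idx <- findInterval(as.numeric(Dates$date), as.numeric(Factor$date))
idx[idx == 0] <- NA  # lookup dates earlier than every Factor date get NA

Output <- data.frame(date = Dates$date, factor = Factor$factor[idx])
```

Like the rolling join, this avoids an explicit loop; a date that falls before the first entry in Factor ends up with an NA factor.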

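Answer 1: (score: 1)

Maybe you could create a data frame containing all the keys, join in the values ("factor") for the keys where you have them, and then use a single loop over all the keys (rather than one loop per row)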

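t1 <- data.frame(a = c(1, 3, 6), b = c(1, 1, 2))     # lookup table: key a, value b
t2 <- data.frame(a = c(1, 2, 4, 5, 7))               # keys to look up
tsum <- data.frame(a = sort(unique(c(t1$a, t2$a))))  # all keys, sorted
tmerge <- merge(tsum, t1, all.x = TRUE)              # b is NA where the key is not in t1
# Carry the last known value forward; start at row 2 so that i - 1 is a valid index
for (i in seq_len(nrow(tmerge))[-1]) {
  if (is.na(tmerge$b[i])) tmerge$b[i] <- tmerge$b[i - 1]
}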
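Applied to the data from the question, the same idea might look like the sketch below. It assumes the dates are converted with as.Date so that they sort chronologically; all_keys and filled are illustrative names, and the final merge back onto Dates is an added step that the snippet above leaves out.

```r
# Same sample data as in the question, with the dates parsed as Date
Factor <- data.frame(
  date = as.Date(c("2015-10-01", "2016-01-01", "2016-04-01",
                   "2016-07-01", "2016-10-01", "2017-01-01")),
  factor = c(0.07606455, 0.07170356, 0.07127930,
             0.06807735, 0.06764824, 0.06709560)
)
Dates <- data.frame(
  date = as.Date(c("2016-01-01", "2016-01-28", "2016-01-29", "2016-03-01",
                   "2016-06-02", "2016-07-03", "2016-10-04", "2016-10-05"))
)

# Every date that appears in either table, sorted
all_keys <- data.frame(date = sort(unique(c(Factor$date, Dates$date))))

# Attach the known factors; dates that are only in Dates get NA
filled <- merge(all_keys, Factor, all.x = TRUE)

# Carry the last known factor forward (one pass over all keys)
for (i in seq_len(nrow(filled))[-1]) {
  if (is.na(filled$factor[i])) filled$factor[i] <- filled$factor[i - 1]
}

# Keep only the dates that were asked for
Output <- merge(Dates, filled, by = "date")
```

This still uses a loop, but it is a single pass over the combined keys rather than a search per lookup row.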