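Vectorise a function that finds the closest date

Asked: 2012-12-19 06:20:02

Tags: r date

I want to pass in a vector of dates and, for each one, return the closest date from a second (partially matching) vector of dates.

The function below does what I need for a single date, but I cannot work out how to generalise it to the case where searchDate is a vector of dates.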

closestDate <- function(searchDate, dateList, roundDown=FALSE){
  if (roundDown) {
    dist2date <- as.Date(dateList) - as.Date(searchDate)
    closest <- which(max(dist2date[dist2date<=0]) == dist2date)
  } else {
    dist2date <- as.Date(dateList) - as.Date(searchDate)
    closest <- which(min(dist2date[dist2date>=0]) == dist2date)
  }
  return(dateList[closest])
}

dateSeq <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by='day')
oddDates <- dateSeq[as.logical(1:length(dateSeq) %%2)]

closestDate('2012-12-14', oddDates)
[1] "2012-12-15"

miscDatesLong <- rep(c('2012-12-14', '2012-12-16', '2012-12-18'), 100 )
closestDate(miscDatesLong, oddDates)
[1] "2012-12-15" "2012-12-17" "2012-12-19"
Warning message:
In unclass(time1) - unclass(time2) :
  longer object length is not a multiple of shorter object length

Can anyone help?

6 answers:

Answer 0 (score: 5)

The findInterval function can do this quickly:

dateSeq <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by='day')
oddDates <- dateSeq[as.logical(1:length(dateSeq) %%2)]

oddDates[ findInterval(as.Date('2012-12-14'), oddDates)+1 ]

miscDatesLong <- rep(c('2012-12-14', '2012-12-16', '2012-12-18'), 100)

oddDates[ findInterval(as.Date(miscDatesLong), oddDates) + 1 ]

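To round down instead of up, drop the +1. Note that with the +1 a search date that is itself in oddDates maps to the next date in the list, and a search date on or after the last date gives NA. If you really want the closest date, rather than the date before or after, create a new list of dates at the midpoints of the intervals (as.Date(rowMeans(embed(as.numeric(oddDates),2)), '1970-01-01')) and use findInterval on that. See the arguments to findInterval for other options.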
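
The midpoint approach described above can be sketched as follows. This is a minimal sketch rather than the answerer's code: the helper name nearestByMidpoint is hypothetical, and it assumes dateList holds at least two distinct dates. A search date exactly halfway between two candidates resolves to the later one.

```r
# Hypothetical helper (name not from the answer): nearest date via
# findInterval() applied to the midpoints between consecutive candidates.
nearestByMidpoint <- function(searchDate, dateList) {
  dates <- sort(unique(as.Date(dateList)))
  stopifnot(length(dates) >= 2)           # assumption: at least two distinct dates
  d <- as.numeric(dates)                  # days since 1970-01-01
  mids <- (d[-1] + d[-length(d)]) / 2     # midpoint of each consecutive pair
  # findInterval() counts how many midpoints are <= each search date (0..n-1),
  # so adding 1 gives the index of the nearest candidate.
  dates[findInterval(as.numeric(as.Date(searchDate)), mids) + 1]
}

candidates <- as.Date(c("2012-04-01", "2012-10-31", "2012-12-25"))
nearestByMidpoint(c("2012-02-14", "2012-06-23", "2012-08-27", "2012-12-01"),
                  candidates)
# [1] "2012-04-01" "2012-04-01" "2012-10-31" "2012-12-25"
```

Because the candidates are sorted once and findInterval does a binary search, this should stay fast for long search vectors.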

Answer 1 (score: 4)

?Vectorize

> closestDateV = Vectorize(closestDate,"searchDate")
> closestDateV(c('2012-12-15','2012-12-14'), oddDates)
2012-12-15 2012-12-14 
     15689      15689 

The returned values have lost their Date class, so add it back:

> as.Date(closestDateV(c('2012-12-15','2012-12-14'), oddDates),origin="1970-01-01")
  2012-12-15   2012-12-14 
"2012-12-15" "2012-12-15" 

You may want to wrap all of this in a new function.

Functional programming is fun!
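
One way the wrapping might look. This is a sketch only: closestDateVec is a hypothetical name, closestDate is restated from the question (lightly condensed) so the block runs on its own, and dateList is assumed to contain no duplicate dates, since otherwise a single search date can return several matches and the result no longer simplifies to a vector.

```r
# closestDate as posted in the question, lightly condensed
closestDate <- function(searchDate, dateList, roundDown = FALSE) {
  dist2date <- as.Date(dateList) - as.Date(searchDate)
  if (roundDown) {
    closest <- which(max(dist2date[dist2date <= 0]) == dist2date)
  } else {
    closest <- which(min(dist2date[dist2date >= 0]) == dist2date)
  }
  dateList[closest]
}

# Hypothetical wrapper: vectorise over searchDate, then restore the Date class
closestDateVec <- function(searchDate, dateList, roundDown = FALSE) {
  f <- Vectorize(closestDate, vectorize.args = "searchDate", USE.NAMES = FALSE)
  as.Date(f(searchDate, dateList, roundDown), origin = "1970-01-01")
}

dateSeq <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by = "day")
oddDates <- dateSeq[as.logical(seq_along(dateSeq) %% 2)]
closestDateVec(c("2012-12-15", "2012-12-14"), oddDates)
# [1] "2012-12-15" "2012-12-15"
```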

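Answer 2 (score: 3)

Here is an example that, for each target date being checked, works only on the subset of dates at or before it (when rounding down) or at or after it (otherwise).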

closestDt <- function(searchDate, dateList, roundDown=FALSE) 
     as.Date( sapply( searchDate , function (x) if( roundDown ){ 
                max( dateList[ dateList <= x ] ) } else {
                min( dateList[ dateList >= x])  } 
           ), "1970-01-01")
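
A short usage sketch for the function above, restated here (reformatted, same logic) so the block is self-contained. It assumes every search date has at least one candidate on the requested side; if not, min() or max() is called on an empty subset and emits a warning.

```r
# closestDt from this answer, reformatted only
closestDt <- function(searchDate, dateList, roundDown = FALSE)
  as.Date(sapply(searchDate, function(x) {
    if (roundDown) max(dateList[dateList <= x]) else min(dateList[dateList >= x])
  }), "1970-01-01")

dateSeq <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by = "day")
oddDates <- dateSeq[as.logical(seq_along(dateSeq) %% 2)]

# sapply() names its result after character input, so unname() tidies the print
up <- unname(closestDt(c("2012-12-14", "2012-12-16"), oddDates))
down <- unname(closestDt(c("2012-12-14", "2012-12-16"), oddDates, roundDown = TRUE))
up    # "2012-12-15" "2012-12-17"
down  # "2012-12-13" "2012-12-15"
```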

Answer 3 (score: 2)

# initiate a tie-breaking function
tie.breaker <-
    function( x , y , la = FALSE ){

        # if look.after is TRUE, eliminate all values below x
        # otherwise, eliminate all values above x
        if ( la ) y[ y < x ] <- NA else y[ y > x ] <- NA

        # then among the remaining values, figure out the date the shortest distance away
        z <- which.min( abs( x - y ) )[1]
        # use [1] to just take the first result, in case y contains duplicate dates

        # return z
        return( z )
    }

# initiate your main function
closestDate <- 
    function( searchDate , dateList , look.after = FALSE ){

        # apply a which.min( abs( ) ) command to each of the dates given, 
        # across every date in the larger list
        dist2date <- 
            sapply( 

                # on every element of searchDate..
                as.Date( searchDate ) ,

                # ..run the tie.breaker() function
                tie.breaker , 

                # and each time, pass in the dateList
                as.Date( dateList ) ,

                # and also the look.after TRUE/FALSE flag
                look.after
            )

        # return the matching dates in the same order as passed in
        dateList[ dist2date ]
    }

# try with two input dates
searchDate <- c( '2012-12-14' , '2012-11-18' )

# create a few dates to test against..
someDates <- c( '2012-11-12' ,  '2012-11-17' , '2012-12-15' , '2012-12-13' , '2012-12-15' , '2012-11-17' , '2012-11-20' )

# return the two dates closest to the input dates

# the first result gives 12/13, because look.after = FALSE
closestDate( searchDate , someDates )

# the first result gives 12/15, because look.after = TRUE
closestDate( searchDate , someDates , look.after = TRUE )

# reverse the order to prove it still works
someDates <- c( '2012-11-12' , '2012-11-17' , '2012-12-13' , '2012-12-15' , '2012-12-13' , '2012-12-15' , '2012-11-17' )

# the first result gives 12/13, because look.after = FALSE
closestDate( searchDate , someDates )

# the first result gives 12/15, because look.after = TRUE
closestDate( searchDate , someDates , look.after = TRUE )

Answer 4 (score: 2)

You can use cut:

nearestDate <- function(dates,datesToMatch)
{
        dtm <- sort(datesToMatch)
        dtmMid <- dtm[-length(dtm)]+diff(dtm)/2
        as.Date(cut(dates,
        breaks=c(as.Date("1970-01-01"),
        dtmMid,as.Date("2100-01-01")),labels=dtm))
}

dates1 <- as.Date(c("2012-02-14","2012-06-23","2012-08-27","2012-12-01"))
dates2 <- as.Date(c("2012-04-01","2012-10-31","2012-12-25"))
nearestDate(dates1,dates2)
[1] "2012-04-01" "2012-04-01" "2012-10-31" "2012-12-25"

Note that I had to pick some magic dates for the end points in the cut function, because it does not accept +/- Inf. Modify them to suit your needs.
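
One possible way to avoid the magic end points is to derive them from the data. This is a sketch under assumptions, not part of the answer: nearestDateAuto is a hypothetical name, duplicates in datesToMatch are dropped because cut needs distinct breaks, and the labels are passed as character strings.

```r
# Hypothetical variant of the cut() approach: outer breaks come from the data
nearestDateAuto <- function(dates, datesToMatch) {
  dates <- as.Date(dates)
  dtm <- sort(unique(as.Date(datesToMatch)))   # distinct, sorted candidates
  dtmMid <- dtm[-length(dtm)] + diff(dtm) / 2  # midpoints between candidates
  lower <- min(dates, dtm, na.rm = TRUE) - 1   # strictly below every date
  upper <- max(dates, dtm, na.rm = TRUE) + 1   # strictly above every date
  bins <- cut(dates, breaks = c(lower, dtmMid, upper), labels = format(dtm))
  as.Date(as.character(bins))
}

dates1 <- as.Date(c("2012-02-14", "2012-06-23", "2012-08-27", "2012-12-01"))
dates2 <- as.Date(c("2012-04-01", "2012-10-31", "2012-12-25"))
nearestDateAuto(dates1, dates2)
```

With the dates from the answer this should give the same result as nearestDate, and it also covers search dates outside the 1970 to 2100 range.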

Answer 5 (score: 2)

I think this is what you want:

closestDate <- function(searchDate, dateList, roundDown=FALSE) {
  as.Date(sapply(as.Date(searchDate), function(x){
    dist <- abs(x - as.Date(dateList))
    closest <- dateList[which(min(dist) == dist)]
    return(ifelse(roundDown, min(closest), max(closest)))
  }), origin="1970-1-1")
}
sapply is your friend. You just need to make sure a date is returned rather than an integer.
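
A small illustration of why the conversion back to a date is needed. The variable names are illustrative only. sapply simplifies its results with unlist, which drops the Date class and leaves plain day counts.

```r
d <- as.Date(c("2012-12-14", "2012-12-16"))

# sapply() simplifies via unlist(), which drops the Date class
raw <- sapply(d, function(x) x + 1)
class(raw)                               # "numeric"

# Option 1: convert the day counts back, as the function above does
restored <- as.Date(raw, origin = "1970-01-01")

# Option 2: keep Date objects throughout by combining lapply() results with c()
kept <- do.call(c, lapply(d, function(x) x + 1))
```

Both restored and kept hold the dates 2012-12-15 and 2012-12-17 as Date objects.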