I would like to know how to vectorize this code.
dates = list(as.Date(c("2000-02-08", "2000-02-11")),
             as.Date(c("2000-03-02", "2000-03-07")),
             as.Date(c("2000-03-02", "2000-03-07")),
             as.Date(c("2000-03-03", "2000-03-07")),
             as.Date(c("2000-03-16", "2000-03-30")),
             as.Date(c("2000-03-16")))
i = 2
while (i <= length(dates))
{
  if (dates[[i]][1] < dates[[i-1]][2])
  {
    dates[[i]] = NULL
    i = i-1
  }
  i = i+1
}
I want to end up with only the date ranges that do not overlap.
Date1 = as.Date(c("2000-03-02", "2000-03-07"))
Date2 = as.Date(c("2000-03-03", "2000-03-07"))
For example, if Date2 falls within the range of Date1, then Date2 is removed.
Answer 0: (score: 1)
With foverlaps from the data.table package:
library(data.table)

dates = list(as.Date(c("2000-02-08", "2000-02-11")),
             as.Date(c("2000-03-02", "2000-03-07")),
             as.Date(c("2000-03-02", "2000-03-05")),
             as.Date(c("2000-03-09", "2000-03-15")),
             as.Date(c("2000-03-16", "2000-03-30")),
             as.Date(c("2000-03-16")))
dt <- as.data.table(do.call(rbind, dates))
setkey(dt)
# Get the ids of the ranges lying within others: one xid entry per
# (range, containing range) pair, and every range lies within itself
tmp <- foverlaps(dt, dt, which = TRUE, type = "within")[, xid]
# Summarize this
t <- table(tmp)
# Filter for ranges appearing only once, hence not included in another one.
res <- dt[as.integer(names(t[t == 1])), ]
# Not absolutely necessary, but it restores the Date objects, which were converted to numbers by the rbind call.
res[, `:=`(V1 = as.Date(V1, origin = "1970-01-01"), V2 = as.Date(V2, origin = "1970-01-01"))][]
Output (I added slightly different cases):
           V1         V2
1: 2000-02-08 2000-02-11
2: 2000-03-02 2000-03-07
3: 2000-03-09 2000-03-15
4: 2000-03-16 2000-03-30
If you wish to exclude any intersection, set type="any" in the foverlaps call.
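To make the difference between "within" and "any" more concrete, here is a rough base-R sketch of the two rules on the same six ranges. It does not use data.table and is not how foverlaps works internally. The names start, end, within, overlaps, keep_within and keep_any are made up for illustration, and treating the lone date as a one-day range is an assumption.

```r
# Ranges from the answer above, as numeric start/end vectors.
# Assumption: the single date "2000-03-16" is a one-day range.
start <- as.numeric(as.Date(c("2000-02-08", "2000-03-02", "2000-03-02",
                              "2000-03-09", "2000-03-16", "2000-03-16")))
end   <- as.numeric(as.Date(c("2000-02-11", "2000-03-07", "2000-03-05",
                              "2000-03-15", "2000-03-30", "2000-03-16")))

# within[i, j] is TRUE when range i lies inside range j (closed intervals)
within <- outer(start, start, ">=") & outer(end, end, "<=")

# overlaps[i, j] is TRUE when range i shares at least one day with range j
overlaps <- outer(start, end, "<=") & outer(end, start, ">=")

# Every range matches itself, so a row sum of 1 means "no other match"
keep_within <- rowSums(within) == 1    # not contained in another range
keep_any    <- rowSums(overlaps) == 1  # does not touch any other range

which(keep_within)
which(keep_any)
```

Both matrices are n by n, so this is only reasonable for a modest number of ranges. Note also that two identical ranges contain each other, so under either rule both copies would be dropped.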
Answer 1: (score: 0)
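It depends on which direction you are looking in. My example checks, for each row, whether its start date falls inside the range of any later row (I only look at the start date, but you can extend this).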
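dates = list(as.Date(c("2000-02-08", "2000-02-11")),
             as.Date(c("2000-03-02", "2000-03-07")),
             as.Date(c("2000-03-02", "2000-03-07")),
             as.Date(c("2000-03-03", "2000-03-07")),
             as.Date(c("2000-03-16", "2000-03-30")),
             as.Date(c("2000-03-16")))
# rbind drops the Date class, so m is a numeric matrix (start, end);
# the single date in the last element is recycled into both columns
m <- do.call(rbind, dates)
# rem[x] is TRUE when the start of row x lies in [start, end) of a later row
rem <- sapply(seq_along(m[, 1]), function(x) {
  any(which(m[x, 1] < m[, 2] & m[x, 1] >= m[, 1]) > x)
})
m[!rem, ]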
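If the goal is the rule stated in the question (drop a range that is contained in another one), a loop-free base-R sketch could look like the following. It relies on sorting plus cummax rather than on either approach above. The names m, start, end, o, prev_max_end, keep and res are made up for illustration, and padding a lone date into a one-day range is an assumption.

```r
dates <- list(as.Date(c("2000-02-08", "2000-02-11")),
              as.Date(c("2000-03-02", "2000-03-07")),
              as.Date(c("2000-03-02", "2000-03-07")),
              as.Date(c("2000-03-03", "2000-03-07")),
              as.Date(c("2000-03-16", "2000-03-30")),
              as.Date(c("2000-03-16")))

# One numeric (start, end) row per range.
# Assumption: a single date is recycled into a one-day range.
m <- do.call(rbind, lapply(dates, function(d) rep_len(as.numeric(d), 2)))

# Sort by start ascending, then end descending, so that a containing
# range always comes before the ranges it contains
o     <- order(m[, 1], -m[, 2])
start <- m[o, 1]
end   <- m[o, 2]

# Largest end among all earlier ranges (-Inf for the first one)
prev_max_end <- c(-Inf, head(cummax(end), -1))

# A range is contained in an earlier one exactly when its end does not
# exceed that running maximum; of identical ranges, the first is kept
keep <- end > prev_max_end

res <- data.frame(start = as.Date(start[keep], origin = "1970-01-01"),
                  end   = as.Date(end[keep],   origin = "1970-01-01"))
res
```

On the question's data this keeps the same three ranges as the while loop. The two are not equivalent in general, though: the loop drops a range as soon as it starts before the previously kept range ends, whereas this sketch only drops ranges that are fully contained in another one.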