R: Suppose
length(v) = n, length(a) = m = length(b),
n and m are large;
v, a, b may contain NA or NaN's;
a not necessarily smaller than b.
How can I find the index pairs (i,j) such that
a[j] < v[i] < b[j]
How can I find the number of pairs (i,j) such that
a[j] < v[i] < b[j] or a[j] > v[i] > b[j]
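This seems too slow:
# Count pairs (i, j) with ma[j, 1] < v[i] < ma[j, 2]
sumrange <- function(v, ma)
{
  s <- 0
  for (i in seq_along(v))
  {
    # comparisons that evaluate to NA are dropped by na.rm = TRUE
    s <- s + sum(v[i] > ma[, 1] & ma[, 2] > v[i], na.rm = TRUE)
  }
  s
}
result <- sumrange(v, cbind(a, b))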
Edit: @DatamineR
a<-c(1,6,4,2,NA)
b<-c(5,4,0,7,0)
v<-c(3,5)
Possible pairs for question 1:
1<3<5 (1,1)
2<3<7 (1,4)
2<5<7 (2,4)
Result = 3
Possible pairs for question 2: all of the above, plus
4 > 3 > 0 (1,3)
6 > 5 > 4 (2,2)
Result = 3 + 2 = 5
Edit: Actually, it is better to drop the NAs first:
vc<-na.omit(v)
ma<-na.omit(cbind(a,b))
result<-sumrange(vc,ma)
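For large n and m, the count can also be obtained without visiting every pair. The following is only a sketch, not code from the original post: it sorts v once and uses base R's findInterval() to count, for each interval, how many values fall strictly inside it. The helper name count_in_open_intervals is made up for illustration, and the left.open argument assumes R 3.3.0 or later.

```r
# Hypothetical helper (not from the original post): counts pairs (i, j)
# with lo[j] < v[i] < hi[j] by sorting v once instead of looping over pairs.
count_in_open_intervals <- function(v, lo, hi) {
  vs <- sort(v)                    # sort() drops NA and NaN by default
  ok <- !is.na(lo) & !is.na(hi)    # skip intervals with a missing endpoint
  lo <- lo[ok]
  hi <- hi[ok]
  below_hi <- findInterval(hi, vs, left.open = TRUE)  # how many v are <  hi[j]
  upto_lo  <- findInterval(lo, vs)                    # how many v are <= lo[j]
  # reversed or empty intervals give a non-positive difference, counted as 0
  sum(as.numeric(pmax(below_hi - upto_lo, 0L)))
}

a <- c(1, 6, 4, 2, NA)
b <- c(5, 4, 0, 7, 0)
v <- c(3, 5)

# Question 1: a[j] < v[i] < b[j]
count1 <- count_in_open_intervals(v, a, b)
# Question 2: either orientation, i.e. min(a[j], b[j]) < v[i] < max(a[j], b[j]);
# on this data the reversed test matches 6 > 5 > 4 and also 4 > 3 > 0
count2 <- count_in_open_intervals(v, pmin(a, b), pmax(a, b))
```

This only returns counts. Listing the index pairs themselves still needs work proportional to the number of matching pairs.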
Answer 0 (score: 0)
Maybe something like this?
# some data:
set.seed(123)
a <- sample(1:15, 10)
b <- sample(1:15, 11)
c <- sample(1:15, 10)
a;b;c
[1] 5 12 6 11 14 1 15 8 4 3
[1] 15 7 9 14 2 13 3 1 10 6 5
[1] 11 9 13 8 12 6 10 3 2 14
res <- sapply(b, function(x) apply(cbind(a,c), 1, function(y) (y[1] < x) & (x < y[2])))
which(res, arr.ind = TRUE)
row col
[1,] 1 2
[2,] 3 2
[3,] 10 2
[4,] 1 3
[5,] 3 3
[6,] 10 3
[7,] 6 5
[8,] 10 6
[9,] 6 7
[10,] 1 9
[11,] 3 9
[12,] 10 9
[13,] 1 10
[14,] 10 10
[15,] 6 11
[16,] 10 11
Here, the first column is j and the second column is i.
Including both conditions:
res2 <- sapply(b, function(x) apply(cbind(a,c), 1, function(y) ((y[1] < x) & (x < y[2])) | ((y[1] > x) & (x > y[2])) ))
which(res2, arr.ind = TRUE)
row col
[1,] 1 2
[2,] 3 2
[3,] 8 2
[4,] 10 2
[5,] 1 3
[6,] 3 3
[7,] 4 3
[8,] 10 3
[9,] 7 4
[10,] 6 5
[11,] 5 6
[12,] 7 6
[13,] 10 6
[14,] 6 7
[15,] 9 7
[16,] 1 9
[17,] 2 9
[18,] 3 9
[19,] 4 9
[20,] 10 9
[21,] 1 10
[22,] 8 10
[23,] 10 10
[24,] 6 11
[25,] 8 11
[26,] 10 11
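The sapply()/apply() combination above calls an R function once per pair. As a hedged alternative, not part of the original answer, the same logical matrices can be built with base R's outer(), which compares all pairs in vectorized form. The vectors are typed in from the printed output above so that the sketch does not depend on the random number generator; the names fwd, bwd, n1 and n2 are made up for illustration.

```r
# Data copied from the printed output above
a <- c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3)
b <- c(15, 7, 9, 14, 2, 13, 3, 1, 10, 6, 5)
c <- c(11, 9, 13, 8, 12, 6, 10, 3, 2, 14)

# Rows index the intervals (j), columns index the values in b (i)
fwd <- outer(a, b, "<") & outer(c, b, ">")   # a[j] < b[i] < c[j]
bwd <- outer(a, b, ">") & outer(c, b, "<")   # a[j] > b[i] > c[j]

res  <- fwd          # first condition only
res2 <- fwd | bwd    # either condition

pairs1 <- which(res, arr.ind = TRUE)   # index pairs (j, i), first condition
n1 <- sum(res,  na.rm = TRUE)          # number of pairs, first condition
n2 <- sum(res2, na.rm = TRUE)          # number of pairs, either condition
```

Each matrix holds length(a) * length(b) logical values, so this trades memory for speed and may not fit when both inputs are very large.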
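Answer 1 (score: 0)
I found an approach using shingles (from the lattice package) to be slightly faster. It works best if the NAs are removed beforehand.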
require(lattice)
vc<-na.omit(v)
ma<-na.omit(cbind(a,b))
sh<-shingle(vc,ma)
res<-sapply(levels(sh), function(x) sum(x[1] < vc & vc <= x[2]))
result<-sum(res)
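With m = 1000 (reduced to 912 by na.omit) and n = 2000, this approach took 0.12, compared with 0.28 for the for loop (the sumrange function) and 0.38 for the for loop when the data are not cleaned beforehand.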
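However, I still don't know how to use shingles when there are multiple criteria: suppose v is an n-by-2 matrix, a and b are m-by-2 matrices, and we want to count how many pairs (i,j) satisfy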
(a[j,1]<v[i,1]<b[j,1]) & (a[j,2]<v[i,2]<b[j,2])
that is, when a (multidimensional) point lies inside a (multidimensional) rectangle.
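For the multi-criteria case, here is a minimal sketch that does not use shingles at all: it builds one logical matrix per coordinate with outer() and combines them with a logical AND. The helper name count_points_in_boxes and the small example data are made up for illustration, and the approach assumes an n-by-m logical matrix fits in memory.

```r
# Hypothetical helper (not from the original answer): counts pairs (i, j)
# where row i of v lies strictly inside the box with corners a[j, ] and b[j, ].
count_points_in_boxes <- function(v, a, b) {
  per_dim <- lapply(seq_len(ncol(v)), function(k) {
    # n-by-m logical matrix for coordinate k: a[j, k] < v[i, k] < b[j, k]
    outer(v[, k], a[, k], ">") & outer(v[, k], b[, k], "<")
  })
  inside <- Reduce("&", per_dim)   # TRUE only if every coordinate is inside
  sum(inside, na.rm = TRUE)        # pairs whose test is NA are not counted
}

# Small made-up example: 3 points, 2 rectangles
v <- rbind(c(1, 1), c(2, 5), c(3, 3))
a <- rbind(c(0, 0), c(2, 2))
b <- rbind(c(2, 2), c(4, 4))
n_inside <- count_points_in_boxes(v, a, b)
```

Because the function loops over columns rather than assuming two of them, the same sketch should extend to more than two criteria.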