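I want to find the positions in a vector where the value differs from an earlier point in the vector by more than some threshold. The first changepoint should be measured relative to the first value in the vector. Each subsequent changepoint should be measured relative to the previous changepoint.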
I can do this with a for loop, but I wonder whether there is a more idiomatic and faster vectorized solution.
Minimal example:
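The loop described above can be sketched as follows. This is a minimal illustration rather than the asker's original code: the toy vector, the threshold, and the names toy_x, toy_diff, and toy_cp are assumptions chosen so that the result can be checked by hand. The loop body mirrors the base R loop used in the benchmark further down.

```r
# Hypothetical toy input (an assumption, not the asker's data).
toy_x <- c(0, 1, 6, 7, 12, 3, 2, 9)
toy_diff <- 5

start <- toy_x[1]     # reference value: the first element, then each changepoint
toy_cp <- integer()   # positions (1-based) of the changepoints found so far
for (i in seq_along(toy_x)) {
  if (abs(toy_x[i] - start) > toy_diff) {
    toy_cp <- c(toy_cp, i)  # record the position
    start <- toy_x[i]       # later values are compared against this one
  }
}
toy_cp
# [1] 3 5 6 8
```

Position 3 is a changepoint because 6 differs from the first value 0 by more than 5; position 5 because 12 differs from 6 by more than 5, and so on.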
Answer 0 (score: 3)
Implementing the same code in Rcpp can improve the speed.
library(Rcpp)
cppFunction(
"IntegerVector foo(NumericVector vect, double difference){
int start = 0;
IntegerVector changepoints;
for (int i = 0; i < vect.size(); i++){
if((vect[i] - vect[start]) > difference || (vect[start] - vect[i]) > difference){
changepoints.push_back (i+1);
start = i;
}
}
return(changepoints);
}"
)
foo(vect = x, difference = mindiff)
# [1] 17 25 56 98 108 144 288 297 307 312 403 470 487
identical(foo(vect = x, difference = mindiff), changepoints)
#[1] TRUE
Benchmark
#DATA
set.seed(123)
x = cumsum(rnorm(1e5))
mindiff = 5.0
library(microbenchmark)
microbenchmark(baseR = {start = x[1]
changepoints = integer()
for (i in 1:length(x)) {
if (abs(x[i] - start) > mindiff) {
changepoints = c(changepoints, i)
start = x[i]
}
}}, Rcpp = foo(vect = x, difference = mindiff))
#Unit: milliseconds
# expr min lq mean median uq max neval cld
# baseR 117.194668 123.07353 125.98741 125.56882 127.78463 139.5318 100 b
# Rcpp 7.907011 11.93539 14.47328 12.16848 12.38791 263.2796 100 a
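Answer 1 (score: 3)
Here is a solution that uses only base R's Reduce. With the argument accumulate = TRUE, Reduce returns the result of every successive call of the function. In our case, these are the values that start takes in the for loop solution. Once you have this vector, you only need to find the indices where the value changes: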
# Find the changepoints
r <- Reduce(function(a, e) {
  if (abs(e - a) > mindiff)
    e
  else
    a
}, x, accumulate = TRUE)
# Get the indexes using diff
# changepoints <- head(cumsum(c(1, rle(r)$lengths)), -1)
changepoints <- which(diff(r) != 0) + 1
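To make the intermediate vector concrete, here is a small sketch of the same Reduce idea on a toy input. The vector, the threshold, and the names toy_x, toy_diff, toy_r, and toy_cp are assumptions for illustration only.

```r
# Hypothetical toy input (an assumption, not the asker's data).
toy_x <- c(0, 1, 6, 7, 12, 3, 2, 9)
toy_diff <- 5

# Running reference value: keep the old one unless the new element
# is more than toy_diff away from it.
toy_r <- Reduce(function(a, e) if (abs(e - a) > toy_diff) e else a,
                toy_x, accumulate = TRUE)
toy_r
# [1]  0  0  6  6 12  3  3  9

# A changepoint is any position where the reference value changed.
toy_cp <- which(diff(toy_r) != 0) + 1
toy_cp
# [1] 3 5 6 8
```

Each element of toy_r is the reference value in force at that position, so a nonzero difference between neighbours marks the position where a new reference value was adopted.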
Edit: I updated the answer following @Eric Watt's comment.
Answer 2 (score: 0)
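For completeness: using recursion, we can get an answer that uses only vectorized R functions. However, this does not work when the result vector is large. For example, with the OP's data and length(x) == 1e5, we get an "evaluation nested too deeply" error.
N = length(x)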
f.recurs = function(x, mindiff, i=1) {
next.i = i + which(abs(x[i:N]-x[i]) > mindiff)[1] - 1L
if (!is.na(next.i)) c(next.i, f.recurs(x, mindiff, next.i))
else NULL
}
f.recurs(x, 5.0)
# [1] 17 25 56 98 108 144 288 297 307 312 403 470 487
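One way around the recursion depth limit is to keep the same vectorized search for the next changepoint but drive it with a loop instead of recursive calls. The sketch below is not part of the original answer; f_iter is a hypothetical name, and the function assumes a non-empty numeric vector.

```r
# Hypothetical helper: same search as f.recurs, but iterative,
# so the call stack stays flat no matter how many changepoints there are.
f_iter <- function(x, mindiff) {
  n <- length(x)
  out <- integer()
  i <- 1L
  repeat {
    # Offset within x[i:n] of the first value far enough from x[i]; NA if none.
    hit <- which(abs(x[i:n] - x[i]) > mindiff)[1]
    if (is.na(hit)) break
    i <- i + hit - 1L   # convert the offset back to a position in x
    out <- c(out, i)
  }
  out
}

f_iter(c(0, 1, 6, 7, 12, 3, 2, 9), 5)
# [1] 3 5 6 8
```

Each step still scans the remainder of the vector, so this avoids the nesting error but is not necessarily faster than the plain for loop.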