我有一个名为features的数据表,其列为nightNo,HR,运动和角度。我想获取之前HR的600个点的滚动方差,每晚的运动和角度我想出了以下功能来做到这一点:
features <- data.table(nightNo=c(1,1,1,1,1,1,1,2,2,2,2,2,2,2),
HR=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14),
motion=c(14,13,12,11,10,9,8,7,6,5,4,3,2,1),
angle=c(2,4,6,8,10,12,14,16,18,20,22,24,26,28))
# For the example I'll use a window of 6 instead of 600
window = 6
features[, c("HR_Variance", "motion_Variance", "angle_Variance") :=
list(rollapply(HR, window, var, partial=TRUE, align = "right"),
rollapply(motion, window, var, partial=TRUE, align = "right"),
rollapply(angle, window, var, partial=TRUE, align = "right")), by=nightNo ]
# nightNo HR motion angle HR_Variance motion_Variance angle_Variance
# 1: 1 1 14 2 NA NA NA
# 2: 1 2 13 4 0.500000 0.500000 2.000000
# 3: 1 3 12 6 1.000000 1.000000 4.000000
# 4: 1 4 11 8 1.666667 1.666667 6.666667
# 5: 1 5 10 10 2.500000 2.500000 10.000000
# 6: 1 6 9 12 3.500000 3.500000 14.000000
# 7: 1 7 8 14 3.500000 3.500000 14.000000
# 8: 2 8 7 16 NA NA NA
# 9: 2 9 6 18 0.500000 0.500000 2.000000
# 10: 2 10 5 20 1.000000 1.000000 4.000000
# 11: 2 11 4 22 1.666667 1.666667 6.666667
# 12: 2 12 3 24 2.500000 2.500000 10.000000
# 13: 2 13 2 26 3.500000 3.500000 14.000000
# 14: 2 14 1 28 3.500000 3.500000 14.000000
结果是正确的,但是由于我有一个很大的数据集,它可以永远运行。我还做了其他类似的功能,它们在每晚相同的600个窗口上使用runmeans和sapplys否,它们在合理的时间内运行,这使我认为rollapply或方差函数非常慢。有没有办法通过更改var或rollapply函数来使此代码更高效?
答案 0 :(得分:0)
我不知道rollaplly在做什么,但是我使用并行的tidyverse在给定的样本数据上生成此输出,
library(cumstats)
library(tidyverse)
library(furrr)
plan(multiprocess)
window <- 6
features %>%
nest(-nightNo) %>%
mutate(data=future_map(data,~mutate_at(.,vars(HR, motion,angle),
funs(var=cumvar(.)[c(1:window,rep(window,length(.)-length(1:window)))])))) %>%
unnest()
# A tibble: 14 x 7
nightNo HR motion angle HR_var motion_var angle_var
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 14 2 NA NA NA
2 1 2 13 4 0.5 0.5 2
3 1 3 12 6 1 1 4
4 1 4 11 8 1.67 1.67 6.67
5 1 5 10 10 2.5 2.5 10
6 1 6 9 12 3.5 3.5 14
7 1 7 8 14 3.5 3.5 14
8 2 8 7 16 NA NA NA
9 2 9 6 18 0.5 0.5 2
10 2 10 5 20 1 1 4
11 2 11 4 22 1.67 1.67 6.67
12 2 12 3 24 2.5 2.5 10
13 2 13 2 26 3.5 3.5 14
14 2 14 1 28 3.5 3.5 14