Time series, measuring density

Asked: 2011-12-16 19:41:14

Tags: r

I measured the daily duration (in minutes) of a recurring event (E) over the course of 364 days.

ev1<-c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 2.7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3.27, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 370.33, 1375.4, 
1394.03, 1423.8, 1360, 1269.77, 1378.8, 1350.37, 1425.97, 1423.6, 
1363.4, 1369.87, 1365.5, 1294.97, 1362.27, 1117.67, 1026.97, 
1077.4, 1356.83, 565.23, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 356.83, 
973.5, 0, 240.43, 1232.07, 1440, 1329.67, 1096.87, 1331.37, 1305.03, 
1328.03, 1246.03, 1182.3, 1054.53, 723.03, 1171.53, 1263.17, 
1200.37, 1054.8, 971.4, 936.4, 968.57, 897.93, 1099.87, 876.43, 
1095.47, 1132, 774.4, 1075.13, 982.57, 947.33, 1096.97, 929.83, 
1246.9, 1398.2, 1063.83, 1223.73, 1174.37, 1248.5, 1171.63, 1280.57, 
1183.33, 1016.23, 1082.1, 795.37, 900.83, 1159.2, 992.5, 967.3, 
1440, 804.13, 418.17, 559.57, 563.87, 562.97, 1113.1, 954.87, 
883.8, 1207.1, 1046.83, 995.77, 803.93, 1036.63, 946.9, 887.33, 
727.97, 733.93, 979.2, 1176.8, 1241.3, 1435.6)

ev2<-c(0, 369.3, 158.2, 347.7, 312.5, 265.47, 334.73, 420.83, 816.9, 
925.6, 926.33, 925.4, 917.57, 675.27, 0, 426.03, 860.03, 1041.43, 
947.8, 1076.83, 709.5, 1014.17, 660.3, 428.2, 718.03, 920.8, 
810, 528.53, 103.83, 300.37, 822.03, 662.13, 393.83, 622.47, 
994.13, 1034.07, 893.8, 643.37, 605.07, 360.97, 158.13, 0, 0, 
678.33, 347.67, 384.87, 495.9, 231.37, 443.23, 638.1, 559.53, 
354, 220.13, 210.4, 425.77, 159.5, 260.13, 1132.9, 77.67, 263.83, 
276.23, 63.6, 1.97, 0, 765.2, 403.03, 214.4, 550.63, 752.47, 
58.7, 475.1, 776.4, 53.87, 106.07, 63.23, 425.5, 461.4, 172.73, 
764.8, 53.27, 20.7, 322.8, 228, 36.07, 27.23, 0, 66.3, 389.77, 
705.23, 9.9, 739.3, 883.73, 0, 0, 347.9, 831.43, 0, 28.2, 4.37, 
596.67, 973.7, 26.33, 0.03, 5.93, 777, 918.43, 0, 54.57, 888.13, 
92.83, 98.13, 808.17, 310.5, 263.57, 248.13, 133.37, 138.37, 
14.73, 55.27, 7.17, 242.6, 206.5, 62.97, 8.67, 670.03, 215.77, 
101, 14.07, 440.33, 603.6, 28.27, 257.07, 64.4, 36.4, 506.17, 
333.3, 121.83, 566, 4.33, 192.83, 77.83, 101.3, 261.67, 15.03, 
298.67, 0.3, 616.4, 90.9, 250.87, 323.17, 36.5, 205.2, 205.3, 
110.67, 33.43, 613.43, 95.27, 3.9, 558.7, 650.83, 0, 179.7, 40.6, 
217.13, 48.23, 423.67, 33.9, 176.3, 139.93, 31.63, 0, 162.77, 
311.47, 22.2, 128.3, 0, 304.9, 281.4, 140.73, 131.8, 393.5, 48.63, 
18.17, 232.7, 294.87, 207.6, 317.13, 51.87, 262.57, 70.73, 9.57, 
480.57, 491.37, 27.03, 625.37, 364.4, 0, 79.93, 723.3, 231.57, 
56.93, 836.43, 713.57, 16.8, 2.23, 56.67, 307.87, 466.77, 270.1, 
143.63, 686.23, 703.77, 0, 167.87, 152.6, 237.97, 278.03, 190.7, 
554.03, 37.5, 177.2, 69.2, 119.13, 225.4, 471.23, 7.43, 273.5, 
75.57, 226.73, 141.17, 40.83, 217.33, 238.2, 15.1, 281.27, 244.03, 
0.83, 186.8, 165.53, 142.1, 121.53, 138.83, 103.5, 42.03, 64.27, 
132.07, 26.73, 150.97, 0, 239.9, 100.47, 95.9, 78.23, 90.73, 
172.7, 9.17, 79.77, 67.67, 2.87, 136.73, 362.1, 78.23, 409.37, 
38.9, 62.73, 459.1, 352.6, 17.43, 241.27, 193.1, 278.4, 124.73, 
256.53, 152.6, 247.03, 229.3, 16.5, 73.9, 0, 545.47, 157.5, 182.2, 
276.57, 76.8, 284.43, 2.83, 1.17, 272.57, 314.77, 98.8, 219.93, 
115.23, 121.77, 453.23, 261.73, 101.83, 381, 118.33, 328.23, 
344, 179.5, 16.7, 99.13, 202.97, 57.57, 83.13, 206.87, 425.27, 
130.97, 113.17, 12.07, 207.4, 77.5, 104.7, 59.77, 59.1, 166.6, 
121.2, 139.77, 96.4, 44.23, 262.6, 61.97, 173.2, 281.03, 27.77, 
91.33, 331.23, 142.73, 103.17, 155.7, 80.47, 52.7, 28.6, 56.67, 
257.23, 90.43, 19.43, 69.43, 358.6, 77.9, 15.07, 592.9, 597.27, 
16.83, 225.53, 176.67, 211.47, 159.83, 211, 187.27, 269.73, 27.1, 
421, 83.1, 11.1, 11.67, 253.1, 326.33, 74.33, 153.93, 12.03, 
70.9, 84.47)

The total event duration is roughly the same for the two individuals (ev1, ev2), but the time is more "spread out" in ev2 and more "concentrated" in ev1.

plot(1:364, ev1, type="l", xlab="Days", ylab="Daily Event duration", main="ev1")
plot(1:364, ev2, type="l", xlab="Days", ylab="Daily Event duration", main="ev2")

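I would like to describe or quantify the temporal spread or aggregation of the daily event durations. Is there a standardized way to do this?

I was thinking of something like this: what is the minimum number of days that accounts for x% of the total event duration? For the example above, the minimum number of days would be larger for ev2 than for ev1. Is there a way to compute this?

Any ideas or references would be helpful.

Many thanks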

2 answers:

Answer 0: (score: 3)

If you want "time" to be the index, it may be easier to use a representation in which it is explicitly identified:

dfrm <- data.frame(tm = 1:364, ev1 = ev1, ev2 = ev2)  # name the time column with `=`, not `<-`

Since you are really interested in the "density" of the index ("tm") values, use the weights argument to density:

 ev1dens <-  density(dfrm$tm, weights=dfrm$ev1/sum(dfrm$ev1), from=0, to=364, n=364)
 plot( ev1dens, lwd=5)
 which.max(ev1dens$y)
#[1] 326
 abline(v=326)  #

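Now (fortunately, the density is unimodal) it is a matter of sorting the normalized density values in decreasing order and finding the index at which the cumsum exceeds the target proportion: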

 which(cumsum(ev1dens$y[ order(ev1dens$y, decreasing =TRUE) ])/sum(ev1dens$y) > 0.9)[1]
#[1] 124
 ev1dens$x[order(ev1dens$y, decreasing =TRUE) ][124]
#[1] 240.6612

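I struggled to decide where to set the cut point for 90% inclusion, and after looking at Tommy's answer to your follow-on question, my concerns about the accuracy of this approach were amplified. 124 would be the index of the cut point that captures 90%, and 240 would be the corresponding x value. Look at the plot of the descending ev1dens$y values traversed during the cumsum process, drawn as a red dotted line, together with the green vertical line marking where the 90% level is finally accumulated: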

 ev1dens <-  density(dfrm$tm, weights=dfrm$ev1/sum(dfrm$ev1), from=0, to=364)
 which(cumsum(ev1dens$y[ order(ev1dens$y, decreasing =TRUE) ])/sum(ev1dens$y) > 0.9)[1]
# [1] 175
 ev1dens$x[order(ev1dens$y, decreasing =TRUE) ][175]
# [1] 240.0548
 idx <- order(ev1dens$y, decreasing =TRUE)
  lines(ev1dens$x[idx], ev1dens$y[idx], lty=3, lwd=2.5, col="red")
  abline(v=240, col="green", lwd=3)


You can also examine the joint distribution of tm and each of the two vectors.

  require(hexbin)
 hexev1 <- with(dfrm,  hexbin(tm, ev1))
 plot(hexev1)
 hexev2 <- with(dfrm,  hexbin(tm, ev2))
 plot(hexev2)
 plot(hexev1)

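To get the index at which x% of the total has been reached (which I think is quite different from the clustering above):

# x is the target percentage (0-100) and must be set before running these lines
min(which(cumsum(ev1) >= sum(ev1)*(x/100) ) )
#[1] 317
min(which(cumsum(ev2) >= sum(ev2)*(x/100) ) )
#[1] 112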
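The "minimum number of days" idea from the question is different from the chronological index above: it ignores the order of the days and takes the largest durations first. A minimal sketch of that calculation follows. The helper name `min_days_for_share` and the toy vectors are assumptions made for illustration and do not come from the thread; the same function could be applied to ev1 and ev2.

```r
# Smallest number of days whose durations together reach `prop` of the total.
# `min_days_for_share` is a hypothetical helper name used only for this sketch.
min_days_for_share <- function(ev, prop = 0.9) {
  stopifnot(prop > 0, prop <= 1, sum(ev) > 0)
  sorted <- sort(ev, decreasing = TRUE)   # largest daily durations first
  share  <- cumsum(sorted) / sum(sorted)  # cumulative share of the total
  which(share >= prop)[1]                 # first count that reaches the target
}

# Toy data (assumed): same total duration, different spread over 10 days
concentrated <- c(0, 0, 0, 50, 50, 0, 0, 0, 0, 0)
spread_out   <- rep(10, 10)

min_days_for_share(concentrated, 0.85)  # 2 days
min_days_for_share(spread_out, 0.85)    # 9 days
```

A larger result means the same share of the total is spread over more days, which is the property the question wants to quantify.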

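Answer 1: (score: 0)

It sounds like you want a standard deviation in time. You could compute the observation-weighted mean time (or the median, using the code from DWin's answer with x = 50) and then take the square root of the observation-weighted mean squared deviation. (This uses DWin's data frame and code.)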

t.med <- min(which(cumsum(ev1) >= sum(ev1)*(.5)))
t.sd <- with(dfrm, sqrt(mean(((tm - t.med) * ev1)^2)))

This gives 174031 for ev1 and 41416 for ev2.
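For comparison, a more conventional weighted standard deviation of time can be sketched as follows. This is not the exact formula used above: it treats the durations as weights that sum to one and centers on the weighted mean rather than on the median index. The helper name `weighted_time_sd` and the toy vectors are assumptions made for illustration.

```r
# Weighted standard deviation of the time index, using durations as weights.
# `weighted_time_sd` is a hypothetical helper name used only for this sketch.
weighted_time_sd <- function(tm, w) {
  stopifnot(length(tm) == length(w), sum(w) > 0)
  p  <- w / sum(w)             # normalize durations to weights summing to 1
  mu <- sum(p * tm)            # duration-weighted mean time
  sqrt(sum(p * (tm - mu)^2))   # square root of the weighted mean squared deviation
}

# Toy data (assumed): same total duration, different spread over 10 days
tm           <- 1:10
concentrated <- c(0, 0, 0, 50, 50, 0, 0, 0, 0, 0)
spread_out   <- rep(10, 10)

weighted_time_sd(tm, concentrated)  # 0.5
weighted_time_sd(tm, spread_out)    # about 2.87
```

The result is in units of days, so the values for two individuals can be compared directly.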

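As alternative statistics, the proportion of non-zero measurements and the length of the longest consecutive run of non-zero measurements divided by the total length are both very simple, easy to understand, and seem to get at your point.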
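Both of these statistics can be computed with base R's `rle`. The sketch below is one possible way to do it; the helper name `run_stats` and the toy vector are assumptions made for illustration.

```r
# Proportion of non-zero days and longest non-zero run as a fraction of the series.
# `run_stats` is a hypothetical helper name used only for this sketch.
run_stats <- function(ev) {
  nonzero <- ev > 0
  runs    <- rle(nonzero)  # run lengths of consecutive TRUE/FALSE values
  longest <- if (any(runs$values)) max(runs$lengths[runs$values]) else 0
  c(prop_nonzero     = mean(nonzero),
    longest_run_frac = longest / length(ev))
}

# Toy data (assumed): 5 non-zero days out of 10, longest non-zero run of 3 days
x <- c(0, 0, 3, 4, 0, 5, 6, 7, 0, 0)
run_stats(x)  # prop_nonzero = 0.5, longest_run_frac = 0.3
```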