How to calculate a rolling average correlation

Time: 2014-04-09 10:48:26

Tags: r statistics

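Suppose you have N time series (xts class).

Can you suggest a way (for example, an existing function) to compute a rolling average correlation (rolling = moving window)?

So you have, for example, 10 time series. The first step is to compute the 60-day correlation between the first and second series, the first and third, the first and fourth, and so on. The second step is to compute the mean of those correlation values.

That completes the first period.

Then move forward one day and repeat the whole process (steps one and two).

The result is a time series of average correlation values.

Can anyone help me find an efficient way to do this?

Thanks!

Here is my data structure: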

structure(c(0.00693323784940425, 0.00119688823384623, 0.00413204756685159, 
0.00794053366741787, -0.00885729207412611, -0.0103255273426481, 
0.00526375949374813, 0.00367934409948933, -0.000445260763187072, 
0.00533008868350748, -0.0184988649053324, -0.00141119382173205, 
0.00912118322531175, 0.00260087310961143, -0.00517324445601819, 
0.000811187375852285, -0.0116665921404522, -0.00343004480926279, 
0.0120294054221377, -0.00215680014590536, 0.0168071183163816, 
0.00708735182246834, -0.0059733229016512, -0.0158720766901048, 
0.00624903406443433, 0.0027648628898902, -0.0017585734967982, 
0.0101320524039767, -0.0135954228883302, 0.000315347989116255, 
0.00954752335550202, -0.00916386710679085, 0.0133360711487689, 
0.00791710073166163, -0.00867438967357037, -0.0137301928119018, 
0.0139960297273252, 0.0117445218692636, 0.000686577438573366, 
0.0095629144062328, -0.0095629144062328, -0.0110422101956824, 
0.00400802139753909, 0.00319489089651892, 0.00238948739738154, 
0.00396983451091115, -0.010354532975581, -0.000800961196204764, 
0.00640343703520729, 0.00530505222969291, 0.000528960604769591, 
0.00211304885806918, -0.00901145774831846, -0.00266595732411989, 
0.016839280413353, 0.010194537979594, -0.00489550968298724, -0.00340329313170784, 
-0.0102799306197494, 0.0208301415149017, -0.000578168926731237, 
-0.000355597704215782, 0.000237079185558819, 0.000829334802584292, 
5.92118897646543e-05, 0, 0.0061306105275456, 0.0018738394950697, 
0.0011129482176262, 0.00604309135743897, -0.0124473568664056, 
-0.00649453341986561, 0.0103155526698018, 0.00357355949086502, 
-0.00357355949086502, 0.00666034812253669, -0.0138460077834108, 
-0.0155041865359653, 0.00548408883420493, 0.00733525242247035, 
0.00125208697492907, -0.0128031972436093, -0.0146826767924852, 
0, 0.00593340671593001, 0.00356546338719443, 0.00643017736636065, 
-0.00365347763152091, -0.0168898372113038, 0, 0.0070456351632, 
0.00699634129248716, 0.00150630794815321, -0.0115433205305631, 
-0.014377703821594, 0, 0.0117600151966468, 0.000543625998710162, 
-0.00490330592852084, -0.0193002958123656, -0.00782564083139015, 
0, -0.00162696142802687, 0.00116238534533863, 0.001161035774218, 
-0.00325430319748232, 0.000930882077925688, 0, -0.00701927122582013, 
-0.00145843487202635, 0.00315725823897228, 0.0053204478588742, 
-0.00168980124699214, 0, 0.00622099240950913, 0.00449248477550324, 
-0.00220133862496308, -0.0167525285370109, -0.0100485946017672, 
0, 0.0138102547827188, 0.006682892429688, -0.00485585172657022, 
-0.0167194630182061, -0.0196819849217924, 0, 0.00199401860686432, 
0.00567538413259872, -0.000566091155790538, -0.00198384647748195, 
-0.00826097847094331, 0.00342661671664768), .indexCLASS = c("POSIXct", 
"POSIXt"), tclass = c("POSIXct", "POSIXt"), .indexTZ = "GMT", tzone = "GMT", class = c("xts", 
"zoo"), index = structure(c(1396310400, 1396396800, 1396483200, 
1396569600, 1396828800, 1396915200), tzone = "GMT", tclass = c("POSIXct", 
"POSIXt")), .Dim = c(6L, 22L), .Dimnames = list(NULL, c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20", "21", "22")))

1 answer:

Answer 0: (score: 1)

Assuming all your series are in a data frame called X, as its first ten columns, then:

sapply(1:(NROW(X)-59), function(U) mean(cor(X[U:(U+59), 1:10 ])))

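If you don't have them in a data frame, then I think the easiest way is to make a data frame first :) provided your time series all have the same length.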

X <- data.frame(X1=ts1, X2=ts2, .... etc)
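As a small illustration of that step, here is a sketch that assumes three hypothetical equal-length numeric series (`ts1`, `ts2`, `ts3`, simulated here and not taken from the question). It builds the data frame and shows what `cor()` returns for it: a square matrix with one row and column per series.

```r
# Hypothetical equal-length series, simulated purely for illustration.
set.seed(1)
ts1 <- rnorm(100)
ts2 <- rnorm(100)
ts3 <- rnorm(100)

# data.frame() needs equal lengths, so check before combining.
stopifnot(length(ts1) == length(ts2), length(ts2) == length(ts3))
X <- data.frame(X1 = ts1, X2 = ts2, X3 = ts3)

# cor() on the whole frame gives a 3 x 3 symmetric matrix.
C <- cor(X)
dim(C)   # 3 3
diag(C)  # every series correlates perfectly with itself, so all 1s
```

Because the diagonal is always 1 and every pair appears twice, taking `mean()` of the full matrix is not the same as averaging the distinct pairwise correlations.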

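(Edit)

To exclude the 1s on the diagonal of the correlation matrix, you could first define a function that computes the mean of all values below the diagonal (or above the diagonal; it makes no difference because the matrix is symmetric):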

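# Mean of the entries strictly below the diagonal (each pair counted once)
meanLT <- function(x) mean(x[lower.tri(x)])
# Apply it to the correlation matrix of every 60-row window of the first ten columns
sapply(1:(NROW(X)-59), function(U) meanLT(cor(X[U:(U+59), 1:10])))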

(Untested, but I think it should work.)
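For anyone who wants to try the idea end to end, the following is a minimal base-R sketch on simulated data. The helper name `rolling_mean_cor`, its `width` argument, and the random input are assumptions made for illustration; they do not come from the question or from any package.

```r
# Mean of the off-diagonal correlations, each pair counted once.
meanLT <- function(x) mean(x[lower.tri(x)])

# Rolling mean pairwise correlation over windows of `width` rows.
# `rolling_mean_cor` is a hypothetical helper, not a library function.
rolling_mean_cor <- function(X, width = 60) {
  X <- as.matrix(X)
  # Guard against too little data; 1:(n - width + 1) would count backwards.
  stopifnot(nrow(X) >= width, ncol(X) >= 2)
  starts <- seq_len(nrow(X) - width + 1)
  vapply(
    starts,
    function(U) meanLT(cor(X[U:(U + width - 1), , drop = FALSE])),
    numeric(1)
  )
}

# Simulated data: 100 observations of 10 series (an assumption, not real data).
set.seed(42)
X <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)

res <- rolling_mean_cor(X, width = 60)
length(res)  # 41 windows: 100 - 60 + 1
```

Each element of `res` belongs to the window ending at row `U + width - 1`, so to get a time series back you would pair `res` with the dates from row `width` onwards. The six-row sample in the question is too short for a 60-day window, so a smaller `width` would be needed to test against it. If the zoo package is available, `rollapply()` with `by.column = FALSE` is another way to run a function over whole-row windows.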