I have some time series data called dat, and I would like to split it into rolling training and test sets.
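Say we have 100 days in total. I want to train a model on the first 20 days and test it on the following 10 days (so 30 days are used for training and testing). Then shift the window by one day: train on days 2 to 21 (still a 20-day training window) and test on the following 10 days (22 to 31). Then do the same again, training on days 3 to 22 and testing on the next 10 observations, up to day 32. Continue until the final model trains on days 71 to 90 and is tested on the last 10 observations.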
I am trying to make this work for a varying number of days (i.e. the total could be 1000, 1250, 87, etc.).
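The window arithmetic described above can be sketched in base R as follows. This is only an illustration: `rolling_splits` and its arguments `n`, `n_train` and `n_test` are hypothetical names, and the function returns row positions only, without fitting any model.

```r
# Hypothetical helper: rolling train/test row indices for n observations.
rolling_splits <- function(n, n_train = 20, n_test = 10) {
  # Number of windows that fit when the start moves forward one day at a time.
  n_windows <- n - n_train - n_test + 1
  if (n_windows < 1) stop("not enough observations for a single window")
  lapply(seq_len(n_windows), function(start) {
    list(
      train = start:(start + n_train - 1),
      test  = (start + n_train):(start + n_train + n_test - 1)
    )
  })
}

splits <- rolling_splits(100)
length(splits)             # 71 windows for 100 days
range(splits[[1]]$train)   # 1 20
range(splits[[71]]$test)   # 91 100
```

Because the number of windows is derived from `n`, the same call should work for any series length that holds at least one full window, for example `rolling_splits(87)` or `rolling_splits(1250)`.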
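I have a function that trains a logistic model on some of the data, but the training data expands as the number of days increases, which is not quite what I want.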
If I could create the different training and test splits, then using the rollapply function might give me the result I am after.
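One way the rollapply idea might look is sketched below, assuming the zoo package is installed. The data here (`toy`) is simulated rather than the real dat, `fit_window` is a hypothetical helper, and `family = binomial` is an assumption about the intended logistic model. Each window of `n_train + n_test` rows is passed to the function as a whole via `by.column = FALSE`, and the split into training and test rows happens inside it.

```r
library(zoo)

set.seed(1)
# Simulated stand-in for dat: a binary response and three numeric predictors.
toy <- cbind(y = rbinom(60, 1, 0.5), x1 = rnorm(60), x2 = rnorm(60), x3 = rnorm(60))

n_train <- 20
n_test  <- 1

# Receives one window of n_train + n_test rows; trains on the first n_train rows
# and returns the predicted probability for the single test row.
fit_window <- function(w) {
  w     <- as.data.frame(w)
  train <- w[seq_len(n_train), ]
  test  <- w[n_train + seq_len(n_test), , drop = FALSE]
  fit   <- glm(y ~ x1 + x2 + x3, data = train, family = binomial)
  unname(as.numeric(predict(fit, newdata = test, type = "response")))
}

preds <- rollapply(toy, width = n_train + n_test, FUN = fit_window,
                   by.column = FALSE, align = "right")
length(preds)  # 60 - 21 + 1 = 40 windows
```

With small windows glm may warn about fitted probabilities of 0 or 1; those are warnings rather than errors, but they suggest the training window is short for three predictors.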
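Edit: I am not sure whether it would be better, or more interesting, to train on the first 20 days and then test on the next 1 day instead of the next 10.
Code: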
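myfun <- function(model_len, dat, ...){
  dat <- data.frame(dat)
  names(dat) <- c("y", "x1", "x2", "x3")
  # The original left `formula` as a placeholder; y ~ x1 + x2 + x3 is assumed here.
  # Fits on rows 1..model_len, so the training set expands as model_len grows.
  fit <- glm(y ~ x1 + x2 + x3, data = dat[1:model_len, ])
  # Predict the single observation right after the training rows.
  predict(fit, dat[model_len + 1, ])
}
sapply(1:50, myfun, dat = dat)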
Data:
dat <- structure(c(0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1,
1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0,
1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1,
0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0,
0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1,
1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0,
1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1,
1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1,
0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1,
0, 1, 1, 1, 1, 1, 1157.4779907, 1161.2739868, 1165.064978, 1162.5039794,
1152.5029784, 1143.5659789, 1131.9999755, 1115.114978, 1101.3089843,
1088.9449828, 1077.7859863, 1067.7619873, 1059.9439942, 1058.2339967,
1062.8999879, 1065.9739869, 1071.7789918, 1084.3059937, 1094.9029908,
1101.5380006, 1106.801001, 1106.7830079, 1105.7230103, 1105.3360108,
1104.5960206, 1104.4260255, 1106.363025, 1109.688025, 1111.763025,
1113.7510255, 1118.2270265, 1126.2330201, 1131.9140137, 1132.8030029,
1133.0679931, 1131.1919921, 1123.4999877, 1109.6529845, 1098.5239806,
1085.2169738, 1070.7239746, 1058.9449829, 1046.018982, 1037.3779847,
1030.1209901, 1023.8139955, 1019.6099977, 1018.9979982, 1016.8410036,
1018.3280031, 1021.1230043, 1020.8710024, 1024.0220033, 1030.0970094,
1034.7910035, 1040.7799927, 1047.371991, 1052.5719849, 1051.4059814,
1051.5269836, 1052.2799865, 1052.3579894, 1050.2929931, 1046.6079956,
1041.8380005, 1035.4400025, 1032.9650025, 1031.6990113, 1035.0920167,
1041.2500184, 1047.0030091, 1053.8240052, 1062.1109986, 1066.3029907,
1072.0419922, 1077.5289917, 1079.3439941, 1081.8229858, 1083.4049804,
1083.0979735, 1081.2649779, 1079.0049803, 1075.0169798, 1073.8739867,
1074.1959837, 1078.2869871, 1085.5799925, 1091.5880003, 1098.3030028,
1102.7200072, 1106.8830077, 1112.3160033, 1120.2160033, 1126.9150023,
1133.6280028, 1136.9040038, 1140.320996, 1143.1609985, 1146.4569946,
1149.8369995, 1153.297998, 1152.7800049, 1150.6940064, 1147.6130005,
1143.8229981, 1140.1619995, 1135.5619995, 1129.0449951, 1124.4880005,
1122.7390015, 1122.5960084, 1125.3989991, 1128.9430054, 1136.8930054,
1144.3530029, 1151.173999, 1158.3080078, 1167.6070068, 1173.8760009,
1178.3499999, 1183.494995, 1193.018994, 1203.9989867, 1212.4839843,
1217.4519897, 1221.0399902, 1222.8859863, 1225.2989868, 1229.2179931,
1233.0979858, 1235.0249878, 1234.4389893, 1232.6299927, 1230.7069947,
1230.6179932, 1232.1449952, 1234.6289918, 1234.0659913, 1232.0999879,
1229.8249879, 1228.1249879, 1224.0649903, 1220.2369874, 1215.8649903,
1214.1689942, 1214.8499878, 1213.7549926, 1217.246997, 1220.5099975,
1222.2329955, 1221.1559935, 1219.641992, 1216.0529905, 1211.9979856,
1206.3969847, 1199.9509886, 1193.1179808, 1185.7209715, 1179.0619749,
1172.8479857, 1169.2699828, 1167.7309814, 1169.2739868, 1169.3999878,
1170.2729858, 1171.0019897, 1172.7689941, 1174.7, 1176.7939942,
1180.7199952, 1184.6089966, 1187.7949951, 1185.9269897, 1185.0529907,
1182.6129883, 1178.0299805, 1168.1029786, 1156.5709717, 1148.2319702,
1137.9259643, 1130.0429687, 1121.3169677, 1113.2949707, 1107.2059692,
1102.4249755, 1098.911975, 1095.860974, 1097.485974, 1093.6249755,
1086.4079772, 1077.9009704, 1074.0089783, 1072.2119812, 1068.344989,
1062.2379822, 1057.449994, 1061.7179994, 1060.4010072, 1059.8690125,
1061.7240113, 1061.7080201, 1058.3970215, 1057.8680176, 1058.2380127,
1056.2290161, 1053.2240112, 1047.6460082, 1041.7940063, 1040.0410034,
1040.6190063, 1045.6369994, 1050.1010009, 1128.81199335, 1132.72894074524,
1136.05951315045, 1133.75860942184, 1126.33398461976, 1121.97836475121,
1114.98804010824, 1104.18156200269, 1097.85760647863, 1093.48449548066,
1089.54311267298, 1087.65328775174, 1087.83107177539, 1088.49478389202,
1089.82480075944, 1091.87386411569, 1093.27921086657, 1096.47071830785,
1100.97350704044, 1102.6227005604, 1102.82339384036, 1099.6516439508,
1097.67720586025, 1097.0346199688, 1096.8465665432, 1098.06499020575,
1100.72546732901, 1106.37447415482, 1111.91023852103, 1114.41117237617,
1117.75201214987, 1120.7832448975, 1122.20674347869, 1120.07466752834,
1117.94469547802, 1115.36710590868, 1109.05404401262, 1100.7222309638,
1096.19725287201, 1087.52132174134, 1079.62024328978, 1075.06498573838,
1068.53212719186, 1063.28239822121, 1059.64979029538, 1056.61743493392,
1051.89577236878, 1048.42474757175, 1046.82620161254, 1044.26846536373,
1043.14861247194, 1041.82684176033, 1041.46047397363, 1044.57471778567,
1047.19426428227, 1051.05194873158, 1053.13842609047, 1054.50142846281,
1051.21367146635, 1048.35332113622, 1047.56157998039, 1045.89381512512,
1043.17345339892, 1042.61503488473, 1040.8783653719, 1039.24423257458,
1040.09811147224, 1041.49734266536, 1042.67950374485, 1046.49669481677,
1051.36081397707, 1055.8274040745, 1060.05336092454, 1061.8797055984,
1063.77402125569, 1065.18506361229, 1065.29696088731, 1066.65724613614,
1066.94988745651, 1068.16322588922, 1069.21815580453, 1069.83166801363,
1068.92578972661, 1068.81857632408, 1070.35871095988, 1075.03883372561,
1081.15799613269, 1086.72961878672, 1091.50584604513, 1094.58719261226,
1097.09031664919, 1100.22361887307, 1103.94707859945, 1106.8845033995,
1111.19264545669, 1115.10382303224, 1120.66155045774, 1125.17569412844,
1129.42943430668, 1132.1180628489, 1134.34300733948, 1133.43510749763,
1132.00890306928, 1129.33948182459, 1127.89952841272, 1126.73290894484,
1126.80215199772, 1124.52480561698, 1124.50054032013, 1125.99287400392,
1128.66498590831, 1130.96736496466, 1133.15142772993, 1137.94462318423,
1142.78989202382, 1146.70132945013, 1151.6631122644, 1155.87424490588,
1158.8347892958, 1161.3181459343, 1165.5259415596, 1173.38822864916,
1181.98934506353, 1190.21226039081, 1194.81109273454, 1197.18527342649,
1199.09715310016, 1201.08885375729, 1203.47563187564, 1205.40271083986,
1207.24721647416, 1210.57795500043, 1213.91433880992, 1217.26535187564,
1219.20293598272, 1220.70837160341, 1222.74566726023, 1221.94893752116,
1220.47665680486, 1218.61792387106, 1217.58479016906, 1216.06433348629,
1215.23248801141, 1214.29415629603, 1214.89947702975, 1217.46333121739,
1218.76682576811, 1221.6747517902, 1223.33620352446, 1222.84608328404,
1220.3845515427, 1217.15554472911, 1212.80167770729, 1208.2329423066,
1204.08123494406, 1201.53635399701, 1197.84907704491, 1195.70439885016,
1193.49731600729, 1189.93090962564, 1187.19653451844, 1185.66257561192,
1185.77756793459, 1183.90255822654, 1182.89945696687, 1183.06617763669,
1182.8208264332, 1183.94646343956, 1184.8534641596, 1185.84933033488,
1187.20748792203, 1188.70677011993, 1186.75278639422, 1183.95251873763,
1180.62084752452, 1176.63980928409, 1167.55220563799, 1159.14913329151,
1154.47587831137, 1148.54960418648, 1145.95250178776, 1143.07035314131,
1137.82269769928, 1133.88338944221, 1130.76687940009, 1128.18812336199,
1120.80925075608, 1118.40550744598, 1113.93545635589, 1104.9968430839,
1098.44571145686, 1096.38135988954, 1093.86884942387, 1090.43277224064,
1085.63821926534, 1082.79744209722, 1083.80625856415, 1083.6723314628,
1082.00354027587, 1077.87272739245, 1073.8896151646, 1071.01060743464,
1070.41054586943, 1069.56096911996, 1064.84087682282, 1061.11888950636,
1058.87994622004, 1055.5466184848, 1054.88694005768, 1053.88913948076,
1056.96921953021, 1059.95310805114, 77.1228859956622, 81.0362538530292,
78.8404654349793, 46.4728298378735, 33.7103494024937, 38.1634534707235,
33.5520386736078, 26.2429467891094, 30.5979953728327, 30.5979953728327,
31.2223518673486, 33.7665461425831, 36.6962580582319, 37.7398082531122,
40.5860776927095, 41.0627097257687, 40.7556533339627, 52.526559398101,
67.2093345204357, 57.3558861837519, 61.809628052695, 65.0522479908148,
60.3356537763659, 59.9025026642582, 60.6951031882524, 60.0950548232381,
59.3846485649388, 64.6199416069941, 64.1051430716001, 55.6515339908006,
58.7835089189351, 55.0890845598537, 48.1838706704649, 46.0064642542491,
48.4030879681908, 55.5793562399467, 43.3339041496164, 35.5089178322478,
42.157901440901, 32.5975281088021, 28.6602735068277, 26.9110067493817,
23.5372731683978, 27.6575715257538, 27.7636741048428, 28.4241344813052,
27.7437779358905, 33.8748748481366, 38.0173561927228, 37.3614293051309,
46.7027642395441, 51.6960358269122, 46.2684476430283, 67.9712504992444,
67.4307596718059, 65.3539239654913, 69.3859268680975, 65.8884694613497,
48.7463489665683, 48.3776103610145, 58.1513743683333, 53.5784372311078,
46.4319595892114, 54.1515204375632, 48.0571628692748, 48.6571396623733,
52.2995925118996, 44.9774509790143, 45.2591195805464, 48.7943143049565,
56.0044804919092, 57.6982718090011, 75.947686211121, 66.6475291255686,
63.2031704734223, 66.0494138822722, 66.2641524590373, 64.6800962380417,
66.0941051628946, 68.6330617447997, 62.298871330898, 58.4734193157287,
52.329016147723, 43.5650542408412, 44.6973713488007, 56.9666746925596,
61.477502601121, 70.1850582389349, 68.3785649248245, 64.1672444920065,
68.1060250901431, 67.2130080618559, 73.8468747118516, 69.6113702464934,
73.1570958144156, 74.8830412236628, 85.4049570826199, 81.7882678868151,
79.8159292966814, 65.9053697697576, 57.9091367119927, 44.4025529377091,
43.2388424796772, 42.7803356293289, 47.7057738515549, 44.7755737074884,
45.7557906780512, 40.016244653124, 41.4992896665767, 46.6336286507843,
44.3657650232027, 45.4718259236287, 45.2372613787558, 56.9881807801438,
58.8717301068573, 68.2039283244873, 73.5215112680329, 78.8594307629251,
73.0335410836162, 71.845824268758, 73.323376014074, 89.1748677280385,
88.8275948061702, 88.079358554904, 72.9197089804835, 66.5774741060939,
65.5905607795046, 60.3560855296636, 60.5351059532554, 61.4085229097936,
58.076745639994, 63.2173375817626, 67.2733875032827, 68.7459719049055,
59.9037653356146, 44.6491666372171, 40.4929666577831, 30.2655738215587,
36.0522832244009, 40.7505784647263, 45.517250253278, 41.5835266382263,
41.3526668380199, 41.539756712543, 48.3189167794286, 49.8415866657383,
44.5858982397584, 50.0675010891207, 50.5139938354098, 44.9097955003298,
37.4247186375495, 41.3952548987526, 39.6467050713014, 39.3953595896288,
36.8289128008105, 42.8772642627352, 37.5760511024063, 42.0791664435174,
36.4236440580649, 25.1434697637668, 29.0666072154372, 25.3668839063101,
34.1040319281821, 34.1351918720353, 42.138526061446, 49.3942545777117,
53.2282422165058, 60.0907410718325, 59.6946479180297, 56.5126081396889,
64.5584522103826, 61.6638469740838, 48.5567687748239, 50.4491176695018,
45.8595330253583, 39.1134283844586, 22.2017732449298, 24.6509068125481,
33.7409449463083, 27.0354908046699, 36.9033514343542, 31.849732552439,
28.384694400023, 30.2843907497844, 30.2566110685775, 30.1702095862,
28.1229085893699, 39.7891005017724, 37.8236546439287, 33.4844836408483,
42.9231744072258, 49.6425369989148, 43.9761986844232, 44.7318583977582,
37.1424843378588, 40.8120228103859, 50.807226927847, 47.9214803669887,
44.995279725301, 41.3197867616665, 47.7401787161256, 40.9599257198947,
48.8101085201251, 58.7773921954413, 46.8976151314924, 38.7370234461344,
43.0052200556536, 42.7247275761847, 51.7764243779359, 47.5063348907638,
48.4623219235214, 51.3175593621287), class = c("xts", "zoo"), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC", src = "yahoo", updated = structure(1544977543.47594, class = c("POSIXct",
"POSIXt")), index = structure(c(1517356800, 1517443200, 1517529600,
1517788800, 1517875200, 1517961600, 1518048000, 1518134400, 1518393600,
1518480000, 1518566400, 1518652800, 1518739200, 1519084800, 1519171200,
1519257600, 1519344000, 1519603200, 1519689600, 1519776000, 1519862400,
1519948800, 1520208000, 1520294400, 1520380800, 1520467200, 1520553600,
1520812800, 1520899200, 1520985600, 1521072000, 1521158400, 1521417600,
1521504000, 1521590400, 1521676800, 1521763200, 1522022400, 1522108800,
1522195200, 1522281600, 1522627200, 1522713600, 1522800000, 1522886400,
1522972800, 1523232000, 1523318400, 1523404800, 1523491200, 1523577600,
1523836800, 1523923200, 1524009600, 1524096000, 1524182400, 1524441600,
1524528000, 1524614400, 1524700800, 1524787200, 1525046400, 1525132800,
1525219200, 1525305600, 1525392000, 1525651200, 1525737600, 1525824000,
1525910400, 1525996800, 1526256000, 1526342400, 1526428800, 1526515200,
1526601600, 1526860800, 1526947200, 1527033600, 1527120000, 1527206400,
1527552000, 1527638400, 1527724800, 1527811200, 1528070400, 1528156800,
1528243200, 1528329600, 1528416000, 1528675200, 1528761600, 1528848000,
1528934400, 1529020800, 1529280000, 1529366400, 1529452800, 1529539200,
1529625600, 1529884800, 1529971200, 1530057600, 1530144000, 1530230400,
1530489600, 1530576000, 1530748800, 1530835200, 1531094400, 1531180800,
1531267200, 1531353600, 1531440000, 1531699200, 1531785600, 1531872000,
1531958400, 1532044800, 1532304000, 1532390400, 1532476800, 1532563200,
1532649600, 1532908800, 1532995200, 1533081600, 1533168000, 1533254400,
1533513600, 1533600000, 1533686400, 1533772800, 1533859200, 1534118400,
1534204800, 1534291200, 1534377600, 1534464000, 1534723200, 1534809600,
1534896000, 1534982400, 1535068800, 1535328000, 1535414400, 1535500800,
1535587200, 1535673600, 1536019200, 1536105600, 1536192000, 1536278400,
1536537600, 1536624000, 1536710400, 1536796800, 1536883200, 1537142400,
1537228800, 1537315200, 1537401600, 1537488000, 1537747200, 1537833600,
1537920000, 1538006400, 1538092800, 1538352000, 1538438400, 1538524800,
1538611200, 1538697600, 1538956800, 1539043200, 1539129600, 1539216000,
1539302400, 1539561600, 1539648000, 1539734400, 1539820800, 1539907200,
1540166400, 1540252800, 1540339200, 1540425600, 1540512000, 1540771200,
1540857600, 1540944000, 1541030400, 1541116800, 1541376000, 1541462400,
1541548800, 1541635200, 1541721600, 1541980800, 1542067200, 1542153600,
1542240000, 1542326400, 1542585600, 1542672000, 1542758400, 1542931200,
1543190400, 1543276800, 1543363200, 1543449600, 1543536000), tzone = "UTC", tclass = "Date"), .Dim = c(212L,
4L), .Dimnames = list(NULL, c("y", "x1", "x2", "x3")))
Edit: There is something I do not understand about the function's output.
I set n_train = 5 and n_test = 1 and get the following as the last 3 outputs:
[[203]]
2018-11-16 2018-11-19 2018-11-20 2018-11-21 2018-11-23 2018-11-26
1.00045650 0.08862828 0.61874897 1.00620776 0.67800147 0.60795702
[[204]]
2018-11-19 2018-11-20 2018-11-21 2018-11-23 2018-11-26 2018-11-27
0.05759443 0.69372082 0.93025186 0.72564291 0.60694731 0.98584268
[[205]]
2018-11-20 2018-11-21 2018-11-23 2018-11-26 2018-11-27 2018-11-28
0.8507988 0.8028078 0.7412901 0.6416496 0.9538837 1.0095700
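Are these the predicted probabilities of the event occurring? How can 1.0095700 be one of the probabilities?
Secondly, since n_train = 5 and n_test = 1, does the last output tell me that the first 5 results are predicted probabilities on the training data and the 6th result is the predicted probability on the test data, i.e. 1.0095700 for 2018-11-28? And likewise for result 204, 0.98584268 for 2018-11-27?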
Answer 0 (score: 1)
I am not sure how you intend to use such a function, but you could wrap some of your code in an additional function that computes the training and test indices. For example, like this:
[R code block not preserved in this copy]
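A minimal sketch of such a wrapper is shown below. It is not the answer's original code: `rolling_predict` is a hypothetical name, the formula `y ~ x1 + x2 + x3` and `family = binomial` are assumptions based on the column names and the stated goal of a logistic model, and the usage example runs on simulated data rather than dat.

```r
# Hypothetical wrapper: fixed-width rolling training window followed by a test block.
rolling_predict <- function(dat, n_train = 20, n_test = 10) {
  dat <- as.data.frame(dat)
  names(dat) <- c("y", "x1", "x2", "x3")
  n_windows <- nrow(dat) - n_train - n_test + 1
  lapply(seq_len(n_windows), function(start) {
    # Training rows move forward one step per window; test rows follow directly.
    train_idx <- start:(start + n_train - 1)
    test_idx  <- (start + n_train):(start + n_train + n_test - 1)
    fit <- glm(y ~ x1 + x2 + x3, data = dat[train_idx, ], family = binomial)
    predict(fit, newdata = dat[test_idx, , drop = FALSE], type = "response")
  })
}

set.seed(42)
toy <- data.frame(y = rbinom(50, 1, 0.5), x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
res <- rolling_predict(toy, n_train = 20, n_test = 10)
length(res)       # 50 - 30 + 1 = 21 windows
length(res[[1]])  # 10 test predictions per window
```

Unlike the output quoted in the question, which appears to list predictions for both the training and the test rows of each window, this version returns predictions for the test rows only.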
You can optimize this code further.
Whether to use 1 or 10 time periods for testing depends heavily on the application.
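On the value 1.0095700 raised in the question's edit: one plausible explanation, assuming the model is fit with glm() and no family argument, is that glm() defaults to family = gaussian. That is an ordinary linear model, so its predictions are not restricted to the interval from 0 to 1. The sketch below compares that default with family = binomial and type = "response" on simulated data; `toy`, `lin_fit` and `log_fit` are illustrative names.

```r
set.seed(123)
n   <- 200
x1  <- rnorm(n)
y   <- rbinom(n, 1, plogis(2 * x1))   # binary outcome driven by x1
toy <- data.frame(y = y, x1 = x1)

lin_fit <- glm(y ~ x1, data = toy)                     # default family = gaussian
log_fit <- glm(y ~ x1, data = toy, family = binomial)  # logistic regression

new_x <- data.frame(x1 = c(-3, 0, 3))
predict(lin_fit, newdata = new_x)                     # linear predictions, can fall outside [0, 1]
predict(log_fit, newdata = new_x, type = "response")  # probabilities, always within [0, 1]
```

If the aim is a predicted probability, the second form is the usual choice. Without type = "response", predict() on a binomial fit returns values on the log-odds scale instead.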
HTH