caret data splitting error

Time: 2015-12-07 13:47:58

Tags: r r-caret data-management

I keep running into problems when trying to split my data with the caret package.

MDRab.na=structure(list(DEV.rabbit.Bi = c(0L, 1L, 0L, 0L, 0L, 0L, 1L,1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L,1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L,0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L,1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L,0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L,1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L,0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L,1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L,1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L,0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L,1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L,0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L,0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L,0L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L,0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 1L,0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 1L,1L), cytoP = c(0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, -2.648429, 0, 0, -2.1260048,0, 0, 0, 0, 0, 0, 3.96295424, 0, 0, 0, NA, 2.83428136, 1.960805564,0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, -3.126005, 0, 0, 0, 0, 7.0318728,0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0, 0, 0, NA, 3.024976,0, 0, 0, 0, -2.33067135, 0, 0, 0, -3.2528685, NA, 0, 0, 0, 0,-3.3048862, 3.2453672, 0, NA, 0, -3.9118235, NA, 0, 0, 0, 0,0, -3.3074869, 0, 0, 0, 0, 0, -2.12893162, NA, 0, NA, 0, 0, -3.64705195,0, 0, 0, 0, -2.6801575, 0, -2.32687549, 0, -2.135834975, 0, -3.38015,0, 0, 0, NA, -3.47037585, 0, -2.4122793, 0, 0, 0, NA, 0, 0, 0,3.7564152, 0, 0, -2.434712735, 2.86341288, 0, 0, 0, 0, 0, 0,0, -2.53032038, 0, 0, 0, -3.73306513, 0, 0, -2.4050405, 0, 0,0, -2.38829672, 0, 0, 0, -0.823873667, 0, 0, 0, -2.24985834,0, 0, 0, 0, 0, 0, 0, -2.2202064, 0, -2.34696895, NA, NA, 0, -2.15253385,-2.1856675, 
-2.2366473, 2.017460955, -2.96851445, 0, 0, 0, -3.0842214,0, -3.50124325, -5.794065, 0, 0, NA, 0, 0, -3.1539793, -2.5736979,0, 0, -3.3703135, -2.3865695, 0, 0, -2.710736745, 0, -0.743292433,4.383613097, 0, 2.373366367, 0, 0, 0, -2.75693455, -2.3156743,0, NA, 0, NA, -3.144766, NA, -2.61448215, NA, 0, 0, 0, 0, -2.2124975,0, 0, 0, 0, 0, 0, 0, 0, -3.053354, NA, 5.428529647, -2.9443965,-3.8878643, -2.2083998, 0, 0, 0, NA, 0, NA, -2.13583495, 0, 0,0), GIP = c(0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3.5384006,0, 0, 0, 0, 0, 0, 0, 0, 0, 5.820918, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA,0, 0, 0, 0, 3.516466, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 3.73598124,0, 0, 0.88683115, 4.588133, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 2.0566821, 0, 0, 0, 0, 0, 0, 0, 4.31335206,NA, 0, 0, 0, 0, 0, 0, 0, 0, 8.6651012, 0, 2.55087375, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, 0, 3.068526045, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 6.3948068,0, 0, 0, 0, 0, 0, 0, 0, 0, 3.3290915, 3.205779325, 0, 0, 0, 0,0, 0, 0, 0, 2.8614185, 0, 0, 2.045684, 0, 1.01417725, 0, 2.959137,0, 1.35015685, 0, 0, 0, 0, 0, 0, NA, 4.900614, 1.290875, 0, NA,0, NA, 0, 1.4537355, 0, 0, 0, 3.1133642, 0, 0, 0, 6.168443, 0,6.26968469, 3.872625, 0, 3.890076867, 0, 3.1133642, 2.250768067,0, 0.97301535, 4.8966569, 0, 8.487644, 0, 3.798781, 3.253654875,4.960366, 0, 2.3501405), neuroP = c(0, NA, NA, 0, 0, 0, 0, 0,0, 2.0428646, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11.03703,NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5.165785, 0, 0, 0, 0,0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 3.583922,0, 0, 0, 0, 0, 0, 0, 0, 2.0009107, 0, 0, 0, 0, NA, NA, 0, 0,0, NA, 0, 0, 0, 0, 2.55936099, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0,0, 0, 0, NA, 2.5078381, 0, NA, 0, 3.872625, 0, 0, 0, 0, 0, 0,3.97424399, 0, 0, 0, NA, 0, NA, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0,0, 0, 0, 0, 0, 2.5064081, 0, NA, 0, 0, 0, 0, 0, 0, NA, 0, 0,0, 0, 0, 0, 0, 0, 2.16196, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 2.317407, 0, 0, 0, 0, 1.929947, 0, 2.000911, 0, 0, 0, 0, 0,0, 0, 0, 0, 2.247053, 0, 3.191807, 0, 0, 0, NA, NA, 0, 0, 0,1.9766362, 2.126448, 0, 0, 0, 0, 4.130221, 0, 0, NA, 0, 0, NA,0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.599616, 0, 0, 0, 0, 0, 0,0, 0, 0, 4.6628686, NA, 0, NA, 0, NA, 0, NA, 0, 0, 0, 0, 0, 3.0913634,0, 0, 4.6432279, 4.586727, 0, 1.58651903, 0, 2.6652475, NA, 0,0, 0, 3.5208109, 4.2195317, 0, 0, NA, 10.5157265, NA, 0, 0, 2.8920614,7.039145), ProlifP = c(0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 5.263517193, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0,-1.945246, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0, 5.307669467, -11.05227, NA,0, 0, 0, 0, 0, 3.562687467, 0, 0, 0, 0, NA, 0, 0, 2.81177124,0, 0, -2.02585, 3.887923007, NA, 0, 0, NA, 0, 0, 0, 0, 0, 3.7865502,0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,-2.12833253, 0, 4.947180667, 0, 0, 0, NA, 4.128478133, 0, 0,0, 0, -2.04286463, NA, 0, -2.0343177, 4.217210867, -2.068998,0, 0, 5.591507567, 0, -2.0868461, 0, 0, 0, 0, 0, 0, 0, 0, 0,5.151728643, 4.936735813, 0, 0, 0, 0, 0, -2.562395, 0, -2.009148,-7.564251, 0, 0, 0, 0, 0, 0, 0, 0, 0, -2.346905, 3.207918667,0, 0, 0, 0, -2.9254072, NA, NA, 0, 0, -2.5948795, 0, -2.060203,0, -4.14739583, -2.8027302, -4.487039, 0, 0, 0, NA, -2.8964375,0, 5.003374, -2.263317, 0, 0, 3.609647733, -2.6806902, 0, 3.491692,3.505242133, 0, -3.000911, 3.120921753, -3.445611, 0, -4.397199,0, 5.147579867, 0, 0, 2.005820067, 0, -0.680954867, -3.0411488,NA, -1.885536, NA, -2.060203, NA, -3.2384957, NA, 0, 0, 0, 0,-2.798781, -1.6022584, 0, 0, 0, 0, 0, 0, 0, 4.713909533, 4.4782686,-5.831885, 5.6344196, -6.8794451, 4.888960867, -3.1387679, -5.5994579,0, NA, 0, NA, 0, 0, -4.6923589, -4.767982), reproP = c(0, 0.58783785,0.1486486, 0.018018017, 0, 
0.018018017, 0, 0.063063067, 0.418918933,0.040540533, 0, 0.4864865, 0.018018017, 0.0855856, 0.018018017,0.3963964, 0, 0, 0.454954967, 0.0990991, 0.333333333, 0.036036033,0, 0, 0, 0.36036036, 0, 0, 0, 0.076576567, 0, 0.081081067, 0.049549533,0.3873874, 0, 0, 0.049549533, 0, 0.15, 0.0810811, 0.06756755,0.022522523, 0.0617284, 0.1756757, 0.3963964, NA, 0.383333333,0.018018017, 0, 0.15, 0.4099099, 0.031531533, 0.3918919, 0.058558567,0.0810811, 0, 0, 0.067567567, 0, 0, 0, 0, 0, 0, 0.06756755, 0,0.516666667, 0.481981967, NA, 0.058558567, 0.1621622, 0.027027025,0.040540533, 0.2567568, 0.018018017, NA, NA, 0.1419753, 0.427927933,0, 0, 0, 0.027027025, 0.054054067, 0.040540533, 0.018018017,0.441441433, 0.031531533, 0.10135135, 0, 0, 0, 0, 0.08783785,0.024691357, 0.1126126, 0.072072067, 0, 0.35802469, 0.0472973,0.040540533, 0.063063067, 0.16216215, 0.083333333, 0.333333333,0.018018017, 0.024691357, 0.0945946, 0.0945946, 0.045045033,0, 0.037037035, 0, 0.081081067, 0.35135135, 0, 0.135135133, 0.5,0.058558567, 0.081081067, 0.031531533, 0, 0.1, 0, 0.013513513,0.063063067, 0.333333333, 0.35802469, 0.1081081, 0.018018017,0.040540533, 0, 0.018018017, 0.081081067, 0, 0.075, 0.076576567,0.045045033, 0.067567567, 0.040540533, 0.031531533, 0.027027027,0.06756755, 0.058558567, 0.031531533, 0.081081067, 0.5, 0.036036033,0.45, 0.018018017, 0.040540533, -0.7265556, 0.031531533, 0.4144144,0.1036036, 0.10185185, 0.3873874, 0.067567567, 0.018018017, 0,0.040540533, 0.018018017, 0.027027025, 0.0990991, 0.1036036,0.4, 0.027027025, 0.054054067, 0.2, 0.018018017, 0, 0, 0.033333333,0, 0.031531533, 0.378378367, 0.130630633, 0.018018017, 0.1, 0,0, 0.1, 0, 0, 0.333333333, 0.054054067, 0.459459467, 0.031531533,0.075, 0.5, 0.364864867, 0.031531533, 0.06756755, 0.081081067,0.6418919, 0.1036036, 0.35135135, 0.054054067, -0.931616333,0.3918919, 0, 0.0855856, 0.1081081, 0.373873867, 0.369369367,NA, 0.333333333, 0.040540533, 0.0990991, -1.345913467, 0.040540533,0.018018017, 0.533333333, 0.081081067, 
0.3963964, 0, 0.018018017,0, 0.0900901, 0, 0.2027027, 0.031531533, 0, 0.3963964, 0.369369367,0.364864867, 0.1081081, 0.036036033, 0.0743243, -1.1009885, 0,0, 0.55, -0.673395133, 0.06756755, 0.2162162, NA, -0.316663167,0.031531533, 0, 0.031531533, 0.3873874, 0.0608108, 0.045045033,0, -1.004574, 0.018018017, 0, 0.4144144, 0.55405405, 0, 0.1036036,-1.646125933, -1.5806603, -0.9572768, -0.818359433, -0.984343,0.2, -4.2037963, 0, -1.2499105, 0.4, 0.0608108, 0)), .Names = c("DEV.rabbit.Bi","cytoP", "GIP", "neuroP", "ProlifP", "reproP"), row.names = c(2L,4L, 6L, 8L, 11L, 12L, 13L, 15L, 23L, 24L, 25L, 26L, 27L, 28L,29L, 30L, 34L, 35L, 38L, 39L, 40L, 42L, 43L, 44L, 48L, 54L, 55L,56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 65L, 70L, 71L, 72L, 74L,75L, 76L, 79L, 80L, 81L, 82L, 84L, 85L, 86L, 87L, 89L, 91L, 92L,93L, 94L, 97L, 100L, 101L, 102L, 105L, 112L, 115L, 117L, 118L,119L, 120L, 121L, 122L, 126L, 128L, 129L, 130L, 131L, 132L, 135L,136L, 141L, 144L, 147L, 148L, 149L, 151L, 152L, 154L, 155L, 156L,163L, 164L, 165L, 166L, 168L, 169L, 170L, 171L, 174L, 178L, 179L,180L, 181L, 183L, 184L, 186L, 188L, 190L, 191L, 193L, 194L, 198L,199L, 200L, 201L, 202L, 205L, 206L, 210L, 212L, 215L, 216L, 217L,222L, 223L, 224L, 225L, 226L, 228L, 229L, 230L, 231L, 232L, 233L,235L, 236L, 238L, 239L, 241L, 244L, 246L, 248L, 249L, 250L, 252L,253L, 254L, 255L, 257L, 258L, 260L, 262L, 263L, 265L, 268L, 271L,272L, 275L, 276L, 279L, 281L, 282L, 284L, 285L, 286L, 287L, 289L,290L, 291L, 293L, 294L, 301L, 302L, 303L, 304L, 305L, 307L, 309L,310L, 311L, 312L, 314L, 315L, 317L, 319L, 322L, 323L, 324L, 325L,326L, 327L, 329L, 331L, 333L, 334L, 335L, 338L, 339L, 342L, 343L,344L, 346L, 349L, 350L, 352L, 353L, 354L, 356L, 358L, 359L, 360L,361L, 363L, 365L, 366L, 368L, 369L, 370L, 371L, 373L, 374L, 376L,377L, 379L, 380L, 381L, 382L, 384L, 385L, 387L, 390L, 392L, 393L,394L, 395L, 396L, 398L, 399L, 400L, 401L, 402L, 403L, 408L, 409L,414L, 415L, 417L, 418L, 419L, 420L, 421L, 422L, 424L, 425L, 426L,427L, 428L, 429L, 434L, 
437L, 438L, 441L, 442L, 443L, 444L, 448L,451L, 453L), class = "data.frame")


library(caret)
trainIndex= createDataPartition(MDRab.na$DEV.rabbit.Bi, p=0.8, list=FALSE, times=5)

MDRab.train= MDRab.na[trainIndex,]
MDRab.test= MDRab.na[-trainIndex,]

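But when I look at the test data, I find that it contains no rows. What am I doing wrong? Ultimately, I would like to get the specificity and sensitivity for the 5-fold cross-validation sets. If someone could walk me through the steps of that process, that would be great, but a simple answer to this question would already be very helpful. Below is the rest of the code for the CV (as I picture it).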

train.lda=lda(MDRab.train[,1]~MDRab.train[,2]+MDRab.train[,3]+MDRab.train[,4]+MDRab.train[,5]+MDRab.train[,6])
pred= predict(train.lda, newdata=MDRab.test)$class
sensitivity(pred, MDRab.test[,c(1)])

Any help would be greatly appreciated!

1 answer:

Answer 0 (score: 1)

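The problem is in how you create the data partition. With times=5, createDataPartition returns a 212-by-5 matrix of row indices. If you plug that whole matrix into MDRab.na[trainIndex,], you get a training set of 1060 records (5 x 212) instead of one with 212 records. On top of that, every record appears somewhere in the trainIndex matrix, so MDRab.na[-trainIndex,] drops all rows and the test set ends up with 0 records.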
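The flattening effect can be reproduced with base R alone. The sketch below uses a made-up 10-row data frame and a hand-written index matrix (both hypothetical, chosen only so that the columns together cover every row), shaped like the output of createDataPartition with list=FALSE and times=3.

```r
# Toy data frame with 10 rows (hypothetical stand-in for MDRab.na)
df <- data.frame(id = 1:10, y = rep(c(0, 1), 5))

# Hand-written index matrix: 8 rows x 3 columns, one column per resample.
# The three columns together cover all 10 row numbers.
idx <- cbind(1:8, 3:10, c(1:4, 7:10))

n_all_train <- nrow(df[idx, ])        # matrix is flattened: 8 * 3 = 24 rows
n_all_test  <- nrow(df[-idx, ])       # every row is excluded: 0 rows
n_one_train <- nrow(df[idx[, 1], ])   # first column only: 8 rows
n_one_test  <- nrow(df[-idx[, 1], ])  # complement of first column: 2 rows

print(c(n_all_train, n_all_test, n_one_train, n_one_test))
```

Indexing with a single column gives the intended train/test pair; indexing with the whole matrix does not.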

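If you want to split the data this way, you need to create 5 training sets, one per column of trainIndex.

See the code below for the first training set.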

MDRab.train1 <- MDRab.na[trainIndex[, 1],]
MDRab.test1 <- MDRab.na[-trainIndex[, 1],]

nrow(MDRab.train1)
nrow(MDRab.test1)
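
To get sensitivity and specificity for all five resamples, the same column-wise indexing can be wrapped in a loop. The following is only a sketch, not the original answer's code: the helper name resample_metrics is hypothetical, rows with NA are dropped up front as a simplifying assumption, and the class "1" is assumed to be the positive class.

```r
library(caret)  # createDataPartition(), sensitivity(), specificity()
library(MASS)   # lda()

# Hypothetical helper (not part of caret): fit an LDA model on each resample
# and report sensitivity and specificity on the held-out rows.
resample_metrics <- function(data, outcome, positive = "1", p = 0.8, times = 5) {
  data <- data[complete.cases(data), ]        # assumption: drop rows with NA
  data[[outcome]] <- factor(data[[outcome]])  # class metrics need factors
  negative <- setdiff(levels(data[[outcome]]), positive)
  idx <- createDataPartition(data[[outcome]], p = p, list = FALSE, times = times)
  form <- as.formula(paste(outcome, "~ ."))   # outcome against all other columns
  out <- lapply(seq_len(ncol(idx)), function(i) {
    train <- data[idx[, i], ]                 # rows listed in column i
    test  <- data[-idx[, i], ]                # everything else
    fit   <- lda(form, data = train)
    pred  <- predict(fit, newdata = test)$class
    data.frame(
      resample    = i,
      sensitivity = sensitivity(pred, test[[outcome]], positive = positive),
      specificity = specificity(pred, test[[outcome]], negative = negative)
    )
  })
  do.call(rbind, out)
}
```

With the data from the question this would be called as resample_metrics(MDRab.na, "DEV.rabbit.Bi"), and colMeans() over the two metric columns gives the averages. Note that createDataPartition with times=5 produces five repeated random 80/20 splits, not a 5-fold cross-validation; if true folds are wanted, caret's createFolds(y, k=5) returns the held-out indices for each fold instead.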