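This is a follow-up to a question I posted earlier: How to use a list of model names and variables for computing a table with its predictions?
I applied the help I received there (correctly, I believe), and I am very grateful for it. The code provided runs fine on one dataset (traind2, testd2). However, when I use it on a new dataset with a similar format (traind, testd), it stops with an error. I can think of two ways forward:
1. Find a way to skip the errors that occur in some methods, so that R continues with the code for the remaining models. I have read some posts saying that try() or tryCatch() can solve this, but I could not get them to work.
2. Understand the errors that may be caused by my input data, and learn how to handle them for several models.
Ideally I would like to do 2, but if that turns out to be too much, I would also settle for 1. The code and data are below: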
library(caret)
traind <- data.frame(matrix(c(52200,3,104838,393,1062,5,69,0,0,37914,129,313,5,67,0,0,562,4,8,0,0,0,0,
33100,4,92703,404,1033,6,102,0,0,81148,262,676,4,48,0,0,12646,40,86,0,0,0,0,
-96100,1,95979,412,1023,8,103,0,0,73855,235,597,6,143,0,0,21313,56,139,1,19,0,0,
66600,2,88228,273,709,2,29,0,0,75544,310,793,3,39,0,0,10442,38,98,1,11,0,0,
42700,3,99424,313,776,8,149,0,0,70975,199,511,2,23,0,0,11988,32,91,1,12,0,0,
31100,4,107943,413,1018,3,57,0,0,120197,276,728,8,136,0,0,14400,48,122,1,30,0,0,
-15500,1,98740,338,819,3,39,0,0,85364,219,521,2,26,0,0,12714,57,145,1,11,0,0,
25200,2,78580,303,756,3,57,0,0,104044,212,566,4,61,0,0,15207,52,124,0,0,0,0,
15300,3,77782,258,612,0,0,0,0,129207,231,558,3,56,0,0,14278,71,174,1,13,0,0,
14100,4,80985,316,761,5,87,0,0,289500,363,867,4,60,0,0,9473,38,81,0,0,0,0,
33600,1,92278,425,1070,6,89,0,0,140805,307,743,4,56,0,0,12396,25,74,0,0,0,0), nrow=11, ncol=23, byrow=TRUE))
testd <- data.frame(matrix(c(53900, 2,87725,405,937,2,35,0,0,59717,170,438,1,17,0,0,9730,36,80,0,0,0,0), nrow=1))
traind2 <- data.frame(matrix(c(36000,1,81666,1007,2673,13,196,0,0,87088,848,2317,8,111,0,0,10509,197,510,9,265,0,0,
1000,2,62624,669,1968,18,344,0,0,80222,653,1739,12,165,0,0,8784,141,370,3,51,0,0,
36000,3,71276,617,1745,11,179,0,0,87064,465,1339,13,287,0,0,5135,137,387,2,48,0,0,
19000,4,71045,801,2217,17,296,0,0,100631,815,2076,12,253,0,0,4924,68,154,0,0,0,0,
-2000,1,60774,577,1602,22,415,0,0,86901,611,1628,14,253,0,0,5583,97,216,0,0,0,0,
-6000,2,55308,587,1606,11,211,0,0,80076,615,1626,11,205,1,140,4660,104,273,1,22,0,0,
12000,3,68161,830,2145,13,261,0,0,93617,318,822,4,62,0,0,3753,77,205,0,0,0,0,
-15000,4,56420,768,2273,34,744,0,0,114273,806,1961,7,88,0,0,5780,41,95,1,11,0,0,
-20000,1,52692,715,1889,15,405,0,0,133528,1332,3177,4,57,0,0,3564,40,97,0,0,0,0,
-10000,2,55013,847,2165,3,48,0,0,110475,794,2130,12,227,1,104,120,1,2,0,0,0,0,
-14000,3,49109,289,784,4,76,0,0,123755,595,1529,7,124,0,0,1489,22,65,1,16,0,0,
-18000,4,53575,320,833,6,99,0,0,113915,621,1604,6,134,0,0,2469,26,61,0,0,0,0,
-6000,1,47263,249,725,4,84,0,0,95972,583,1503,7,94,0,0,30710,152,327,1,18,0,0), nrow=13, ncol=23, byrow=TRUE))
testd2 <- data.frame(matrix(c(-1000, 2,43202,326,894,7,181,0,0,100264,740,2125,40,655,0,0,13086,135,334,0,0,0,0), nrow=1))
Modelnames <- c('lm', 'earth', 'bagEarth', 'cubist', 'leapBackward','leapForward',
'icr', 'lars', 'lars2', 'glm','penalized', 'pcr', 'ppr', 'rqlasso',
'rvmRadial', 'foba', 'bagEarthGCV', 'bayesglm', 'BstLm', 'gaussprLinear',
'gaussprPoly', 'glmStepAIC', 'glmnet', 'monmlp', 'svmRadialSigma') # just trying out a lot of things out of curiosity
formula1 <- X1 ~ factor(X2) + X3 + X10 + X17 + X5 + X7 + X12 + X14 + X19 + X21
train_control <- trainControl(method = "LOOCV")
modelList <- lapply(Modelnames, function(mod) train(formula1, data=traind,
method=mod,preProcess=c('scale', 'center'), trControl=train_control))
predictions <- sapply(modelList, predict, newdata = testd)
So this code runs on traind2 and testd2, but not on traind and testd. Note: the error I get is
Error in pred - obs : non-numeric argument to binary operator
How should I proceed?
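For option 1, the pattern being asked about can be sketched as follows. This is only a minimal illustration in base R, not a confirmed fix for the error above: safe_fit and toy_fit are hypothetical names, and toy_fit is a stand-in for the train() call that fails on purpose for one model name so that the skipping behaviour is visible without caret installed.

```r
# Hypothetical helper: run one fit and return NULL (with a message)
# instead of stopping the whole lapply() when that fit throws an error.
safe_fit <- function(mod, fit_fun) {
  tryCatch(
    fit_fun(mod),
    error = function(e) {
      message("Model '", mod, "' failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for train(): deliberately fails for one model name.
toy_fit <- function(mod) {
  if (mod == "bad") stop("non-numeric argument to binary operator")
  paste("fitted", mod)
}

toy_names <- c("lm", "bad", "glm")

# Fit everything; failed models come back as NULL list elements.
fits <- lapply(toy_names, safe_fit, fit_fun = toy_fit)
names(fits) <- toy_names

# Keep only the models that fitted successfully.
ok <- !vapply(fits, is.null, logical(1))
fits <- fits[ok]

print(names(fits))
```

With caret, toy_fit would be replaced by a function wrapping the original call, for example function(mod) train(formula1, data = traind, method = mod, preProcess = c('scale', 'center'), trControl = train_control), and predict() would then be applied only to the fits that survived. The key point is that tryCatch() has to wrap each individual train() call inside the lapply(), not the lapply() as a whole; otherwise the first error still discards every model.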