Error in createFolds / cut.default when running train() in caret with multiple binomial target variables

Asked: 2015-04-29 01:16:35

Tags: r r-caret

This question has the same symptoms as "train() in caret package returns an error about names & gsub", but as far as I can see, the solution described there does not apply here.

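I have 8 binomial target variables and 12 predictor variables (in reality 577 predictors, but I have included only 12 in this minimal example). All of the target variables have the same number of positive cases: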

> require(caret)
> head(eg.data)
  bottle cat chair face house scissors scrambledpix shoe      X1
1      0   0     0    0     0    1        0    0 1.282427535
2      0   0     0    0     0    1        0    0 2.580423598
3      0   0     0    0     0    1        0    0 2.757994797
4      0   0     0    0     0    1        0    0 2.027544189
5      0   0     0    0     0    1        0    0 2.011910591
6      0   0     0    0     0    1        0    0 1.381372427
          X2         X3         X4      X5      X6
1  0.56927127535 -0.41445500589  0.05883449623 1.325428161 3.009461590
2  0.99142631615 -0.29943837061  0.07639494922 1.523704820 2.827368769
3  2.03652352150 -0.17050305555 -0.31151493933 1.573253408 2.678044808
4  1.25721256063 -0.13619253754  0.51253133255 2.577229617 1.928547094
5 -0.08773097125  0.06366970261  0.39996831088 1.887088568 1.946206958
6 -0.25631254599 -0.02384295467  0.46782728851 1.200404398 1.325037590
         X7        X8        X9        X10
1 0.06590922936  0.6734459904 -0.5028127515 -0.88796906295
2 1.74129314357 -0.7760203940  0.2435879550  0.96297913339
3 2.33400909898  0.0439339562  1.0221119115  0.07875704254
4 2.65188422088 -0.1230319426  1.6562415384  0.18348716525
5 1.69440143996  0.6049393761  1.0446174220  0.87828319489
6 1.43499026729 -0.2976883919  0.7316561774  0.43665437272
        X11       X12
1 -1.7844737347 -2.1649063167
2  0.2034972031 -1.7478010604
3  0.9186460991 -0.3217861157
4  1.3983604989 -1.4887151593
5  1.0934001840 -1.8538057112
6  0.8168093363 -0.6653136097

> # every column takes more than one distinct value
> apply(eg.data,2,function(col){return(length(unique(col)))})
      bottle          cat        chair         face        house 
       2            2            2            2            2 
    scissors scrambledpix         shoe           X1           X2 
       2            2            2          864          864 
      X3           X4           X5           X6           X7 
     864          863          864          864          864 
      X8           X9          X10          X11          X12 
     864          864          863          864          863 
> apply(eg.data[,1:8],2,table)
  bottle cat chair face house scissors scrambledpix shoe
0    756 756   756  756   756      756          756  756
1    108 108   108  108   108      108          108  108
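
For anyone who wants to experiment without the original data, here is a rough sketch that builds a stand-in with the same shape: 864 rows, 8 indicator targets with 108 positives each, and 12 numeric predictors. The name `sim.data`, the seed, the random predictor values, and the assumption that every row is positive for exactly one target are all illustrative and not taken from the real data set.

```r
# Hypothetical stand-in for eg.data; none of these values come from the real data.
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

targets <- c("bottle", "cat", "chair", "face", "house",
             "scissors", "scrambledpix", "shoe")
n_per_class <- 108

# Assumption: each row belongs to exactly one of the 8 classes.
label <- rep(targets, each = n_per_class)

# One 0/1 indicator column per target.
y_mat <- sapply(targets, function(cls) as.integer(label == cls))

# Random predictors named X1..X12.
x_mat <- matrix(rnorm(length(label) * 12), ncol = 12,
                dimnames = list(NULL, paste0("X", 1:12)))

sim.data <- data.frame(y_mat, x_mat)

# Class balance per target: 756 zeros and 108 ones, as in the table above.
apply(sim.data[, 1:8], 2, table)
```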

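I then try to run train() from caret. Ultimately I want to use the neuralnet method, which (I think) requires the target to be formatted as a set of binomial variables, but for now I am just trying svmLinear, which works fine on a test case.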

> res <- train(train.formula,
+              eg.data,
+              method = "svmLinear",
+              trControl = trainControl(method="cv", number=10))
Error in cut.default(y, unique(quantile(y, probs = seq(0, 1, length = cuts))),  : 
  invalid number of intervals    
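
The message itself comes from base R's cut(), which treats a single-number `breaks` argument as an interval count and rejects anything below 2. The sketch below shows one way that situation can arise, under an assumption the post does not confirm: that the outcome vector reaching createFolds() is constant. That could happen, for instance, if `train.formula` joins the eight indicator columns with `+` on the left-hand side and every row sums to 1. The value of `cuts` is an arbitrary illustrative choice.

```r
# Assumption: y is constant, as it would be if one-hot indicator columns were summed.
y <- rep(1, 864)
cuts <- 5  # illustrative only; the value used inside createFolds() is not shown in the post

# All quantiles of a constant vector coincide, so unique() leaves a single break.
breaks <- unique(quantile(y, probs = seq(0, 1, length.out = cuts)))
length(breaks)

# cut() reads a single number as "how many intervals" and needs at least 2.
msg <- tryCatch(
  {
    cut(y, breaks, include.lowest = TRUE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
```

If this is what is happening, inspecting the response that the formula actually produces, for example with `model.response(model.frame(train.formula, eg.data))`, should show whether it has more than one distinct value.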

Running traceback:

> traceback()
8: stop("invalid number of intervals")
7: cut.default(y, unique(quantile(y, probs = seq(0, 1, length = cuts))), 
       include.lowest = TRUE)
6: cut(y, unique(quantile(y, probs = seq(0, 1, length = cuts))), 
       include.lowest = TRUE)
5: createFolds(y, trControl$number, returnTrain = TRUE)
4: train.default(x, y, weights = w, ...)
3: train(x, y, weights = w, ...)
2: train.formula(train.formula, eg.data, method = "svmLinear", trControl = trainControl(method = "cv", 
       number = 10))
1: train(train.formula, eg.data, method = "svmLinear", trControl = trainControl(method = "cv", 
       number = 10))

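As you can see, this looks similar to the previously reported problem, but there are definitely no NA or NaN values here, so I don't know what the solution would be.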
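
A check along these lines can confirm the absence of NA or NaN values. This is only a sketch: `count_bad` is a hypothetical helper name, and `toy` is a made-up data frame standing in for eg.data.

```r
# Hypothetical helper: per column, count NA values (is.na() is also TRUE for NaN)
# and non-finite values (NA, NaN, Inf, -Inf).
count_bad <- function(df) {
  sapply(df, function(col) c(na = sum(is.na(col)),
                             nonfinite = sum(!is.finite(col))))
}

# Toy data frame for illustration; column b contains one NaN and one Inf.
toy <- data.frame(a = c(0, 1, 0), b = c(1.5, NaN, Inf))
res <- count_bad(toy)
res
```

Calling `count_bad(eg.data)` on the real data and getting all zeros would rule out missing or non-finite values as the cause.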

0 answers:

No answers