R: svmRadial does not work properly in caret

Asked: 2017-11-21 11:52:21

Tags: r svm r-caret

I am trying to run the code from the book "Applied Predictive Modeling", from the section on training an SVM with a radial kernel through caret's "train" function.

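I copied the code without adding anything. It runs without errors, but the results do not match those in the book: the predicted probabilities are almost identical for every sample, and all observations are assigned to one class. Here is the code: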

library(caret)
data("GermanCredit")
GermanCredit <- GermanCredit[, -nearZeroVar(GermanCredit)]
# remove some other columns that do not add useful information
GermanCredit$CheckingAccountStatus.lt.0 <- NULL
GermanCredit$SavingsAccountBonds.lt.100 <- NULL
GermanCredit$EmploymentDuration.lt.1 <- NULL
GermanCredit$EmploymentDuration.Unemployed <- NULL
GermanCredit$Personal.Male.Married.Widowed <- NULL
GermanCredit$Property.Unknown <- NULL
GermanCredit$Housing.ForFree <- NULL

#Split the data into training (80%) and test sets (20%)
set.seed(100)
inTrain <- createDataPartition(GermanCredit$Class, p = .8)[[1]]
GermanCreditTrain <- GermanCredit[ inTrain, ]
GermanCreditTest  <- GermanCredit[-inTrain, ]

set.seed(1056)
svmFit <- train(Class ~ .,
           data = GermanCreditTrain,
           method = "svmRadial",
           preProcess = c("center", "scale"),
           tuneLength = 10,
           trControl = trainControl(method = "repeatedcv",
                                    repeats = 5,
                                    classProbs = TRUE))

The model output is as follows:

> svmFit
Support Vector Machines with Radial Basis Function Kernel 

800 samples
 41 predictor
  2 classes: 'Bad', 'Good' 

Pre-processing: centered (41), scaled (41) 
Resampling: Cross-Validated (10 fold, repeated 5 times) 
Summary of sample sizes: 720, 720, 720, 720, 720, 720, ... 
Resampling results across tuning parameters:

  C       Accuracy  Kappa      
    0.25  0.70025   0.006361713
    0.50  0.70025   0.006372290
    1.00  0.70025   0.006372290
    2.00  0.70075   0.008001058
    4.00  0.70100   0.009101928
    8.00  0.69950   0.004902168
   16.00  0.70050   0.006864093
   32.00  0.70025   0.006361713
   64.00  0.70050   0.007509254
  128.00  0.70050   0.007472237

Tuning parameter 'sigma' was held constant at a value of 0.01390712
Accuracy was used to select the optimal model using  the largest value.
The final values used for the model were sigma = 0.01390712 and C = 4.

So the accuracy barely changes across values of C. I tried different sets of tuning parameters, but the result is the same.
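For reference, this is roughly how a different parameter set can be passed to "train". It is only a sketch: the grid is my own choice, built from the two sigma values that appear in this post (the one from my run and the one from the book's output quoted further down) and the same C range that tuneLength = 10 produced. The names svmGrid and svmFitGrid are made up for the illustration.

```r
# Explicit tuning grid for method = "svmRadial" (tuning parameters: sigma, C).
# sigma: the two values printed in the outputs quoted in this post.
# C: the same 0.25 ... 128 sequence that tuneLength = 10 generated.
svmGrid <- expand.grid(sigma = c(0.008918477, 0.01390712),
                       C = 2^(-2:7))

# Refit only when caret and the training data from the code above exist.
if (requireNamespace("caret", quietly = TRUE) && exists("GermanCreditTrain")) {
  set.seed(1056)
  svmFitGrid <- caret::train(Class ~ .,
                             data = GermanCreditTrain,
                             method = "svmRadial",
                             preProcess = c("center", "scale"),
                             tuneGrid = svmGrid,
                             trControl = caret::trainControl(method = "repeatedcv",
                                                             repeats = 5,
                                                             classProbs = TRUE))
  print(svmFitGrid)
}
```

The grid has 20 rows (2 sigma values times 10 C values), so the refit takes about twice as long as the original call.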

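The predicted probabilities are almost the same for all samples: about 0.304 for the "Bad" class and about 0.695 for "Good" (they differ only in the fourth decimal place).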
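A minimal sketch of how the probabilities can be inspected is shown here. The helper summarizeProbs is a hypothetical name of my own, not part of caret; it assumes a data frame with one probability column per class, which is what predict(..., type = "prob") returns for a caret model.

```r
# Summarise a data frame of class probabilities (one column per class):
# minimum, maximum, and spread (max - min) of each column.
summarizeProbs <- function(probs) {
  data.frame(class  = names(probs),
             min    = vapply(probs, min, numeric(1)),
             max    = vapply(probs, max, numeric(1)),
             spread = vapply(probs, function(p) diff(range(p)), numeric(1)),
             row.names = NULL)
}

# With the fitted model and test set from the code above
# (caret and kernlab must be loaded in the session):
if (exists("svmFit") && exists("GermanCreditTest")) {
  testProbs <- predict(svmFit, newdata = GermanCreditTest, type = "prob")
  print(summarizeProbs(testProbs))
  # Counts of predicted classes; a single non-zero count means
  # every test sample was assigned to the same class.
  print(table(predict(svmFit, newdata = GermanCreditTest)))
}
```

A spread close to zero in every column corresponds to the behaviour described above.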

The results from the book are available here: https://github.com/cran/AppliedPredictiveModeling/blob/master/inst/chapters/04_Over_Fitting.Rout

They got:

> svmFit
Support Vector Machines with Radial Basis Function Kernel 

800 samples
 41 predictors
  2 classes: 'Bad', 'Good' 

Pre-processing: centered, scaled 
Resampling: Cross-Validated (10 fold, repeated 5 times) 

Summary of sample sizes: 720, 720, 720, 720, 720, 720, ... 

Resampling results across tuning parameters:

  C     Accuracy  Kappa  Accuracy SD  Kappa SD
  0.25  0.744     0.362  0.0499       0.113   
  0.5   0.74      0.35   0.0516       0.117   
  1     0.746     0.348  0.0522       0.125   
  2     0.743     0.325  0.0467       0.116   
  4     0.744     0.322  0.0477       0.12    
  8     0.75      0.323  0.0464       0.13    
  16    0.745     0.302  0.0457       0.13    
  32    0.739     0.28   0.0451       0.126   
  64    0.743     0.284  0.0444       0.135   
  128   0.734     0.265  0.0445       0.124   

Tuning parameter 'sigma' was held constant at a value of 0.008918477
Accuracy was used to select the optimal model using  the largest     value.
The final values used for the model were sigma = 0.00892 and C = 8. 

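In addition, everyone in my class got the same results as I did, but our teacher, whose computer has an older version of R, got the correct results. So my question is: is this caused by some change in the newer versions of R, caret, kernlab, etc., or am I doing something else wrong? How can I change this code to get the correct results? My caret version is 6.0-77.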
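Since the difference seems to depend on the machine, here is a small sketch for collecting the versions to compare. It uses only base R and utils; envInfo is a name I made up, and a package that is not installed is reported as NA.

```r
# Collect the R version and the versions of the packages involved,
# so the output can be compared between machines.
pkgs <- c("caret", "kernlab")
versions <- vapply(pkgs, function(p) {
  if (requireNamespace(p, quietly = TRUE)) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_  # package not installed on this machine
  }
}, character(1))

envInfo <- c(R = R.version.string, versions)
print(envInfo)
```

Running sessionInfo() on both machines gives a fuller picture, including attached packages and the platform.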

Thanks in advance.

0 answers:

No answers yet