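I am trying to test whether my data is linearly separable by fitting both a linear SVM and an RBF-kernel SVM to it and using the F-score to see which of the two scores better.
I am using the caret package and have set up two models: fit.SVMKernel, which uses 'rbf', and fit.SVML, which uses 'svmLinear'.
I can run my whole script down to the linear model without any problems, but as soon as I run the kernel model I get a fatal error message and have to restart the session; see the attached picture below.
Can anyone offer advice on why R crashes every time I run the kernel part of the code?
Below are a reproducible version of my data and my code, for anyone who can offer their two cents!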
I loaded the caret package as shown below:
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
locale:
[1] LC_COLLATE=English_Ireland.1252 LC_CTYPE=English_Ireland.1252 LC_MONETARY=English_Ireland.1252 LC_NUMERIC=C
[5] LC_TIME=English_Ireland.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] kernlab_0.9-25 doMC_1.3.4 iterators_1.0.8 foreach_1.4.3 Amelia_1.7.4 Rcpp_0.12.9 dplyr_0.5.0 pROC_1.9.1 klaR_0.6-12
[10] MASS_7.3-45 caret_6.0-73 ggplot2_2.2.1 lattice_0.20-34
loaded via a namespace (and not attached):
[1] compiler_3.3.2 nloptr_1.0.4 plyr_1.8.4 class_7.3-14 tools_3.3.2 digest_0.6.12 lme4_1.1-12 tibble_1.2
[9] nlme_3.1-128 gtable_0.2.0 mgcv_1.8-15 Matrix_1.2-7.1 DBI_0.5-1 SparseM_1.74 e1071_1.6-8 stringr_1.2.0
[17] MatrixModels_0.4-1 stats4_3.3.2 combinat_0.0-8 grid_3.3.2 nnet_7.3-12 R6_2.2.0 foreign_0.8-67 minqa_1.2.4
[25] reshape2_1.4.2 car_2.1-4 magrittr_1.5 scales_0.4.1 codetools_0.2-15 ModelMetrics_1.1.0 splines_3.3.2 assertthat_0.1
[33] pbkrtest_0.4-6 colorspace_1.3-2 labeling_0.3 quantreg_5.29 stringi_1.1.2 lazyeval_0.2.0 munsell_0.4.3
The outcome variable I am trying to classify (training.data.raw$Overshooting) is of type factor:
> str(training.data.raw)
'data.frame': 2846 obs. of 19 variables:
$ Total.Tx.Height : num 31.2 31.2 31.2 31.2 31.2 ...
$ Antenna.Tilt : int 0 0 0 0 0 0 4 4 2 2 ...
$ Antenna.Gain : num 15.9 15.9 15.9 18.2 18.2 18.2 15.9 15.9 18.8 18.8 ...
$ Ant.Vert.Beamwidth: num 10 10 10 4.4 4.4 4.4 9.6 9.6 4.3 4.3 ...
$ RTWP : num -106 -104 -105 -105 -105 ...
$ Voice.Drops : int 1 12 1 0 1 5 1 18 4 3 ...
$ Range : num 11.33 5.14 5.14 11.33 3.88 ...
$ Max.Distance : num 12.43 6.24 6.24 12.43 4.98 ...
$ Environment : Factor w/ 3 levels "Rural","Suburban",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Rural : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
$ Suburban : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Urban : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Overshooting : Factor w/ 2 levels "0","1": 1 1 2 2 2 2 2 2 2 1 ...
$ OShooting : Factor w/ 2 levels "Not.Overshooting",..: 1 1 2 2 2 2 2 2 2 1 ...
$ HSUPA.Throughput : num 164 223 232 241 264 ...
$ Max.HSDPA.Users : int 10 16 9 5 8 7 14 31 8 12 ...
$ HS.DSCH.throughput: num 1975 2346 2995 3696 3894 ...
$ Max.HSUPA.Users : int 13 25 11 5 13 9 15 33 8 13 ...
$ Avg.CQI : num 16.2 18 19.4 19.7 21.8 ...
Below is the code I used for the training control for both models:
metric <- "Accuracy"
# 10-fold cross-validation, repeated 3 times, with class probabilities requested
train_Control <- trainControl(method = "repeatedcv",
                              number = 10,
                              repeats = 3,
                              classProbs = TRUE)
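A side note on classProbs: as far as I know, caret expects the class levels to be valid R variable names when class probabilities are requested, and it complains about levels such as "0" and "1". The sketch below uses only base R to show the check and one way to relabel; the vector y and the labels "neg" and "pos" are made up for illustration and are not taken from the data above.

```r
# Toy outcome with the same kind of levels as Overshooting ("0" and "1").
y <- factor(c(0, 1, 1, 0, 1))

# make.names() turns a string into a syntactically valid R name,
# so a level is already valid when make.names() leaves it unchanged.
levels_ok <- all(levels(y) == make.names(levels(y)))
levels_ok

# Relabel with hypothetical names that are valid R identifiers.
y_named <- factor(y, levels = c("0", "1"), labels = c("neg", "pos"))
named_ok <- all(levels(y_named) == make.names(levels(y_named)))
named_ok
```

If a column with descriptive level names already exists in the data, using it as the outcome would avoid the relabelling step.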
Below are the two models I created, fit.SVML and fit.SVMKernel:
set.seed(123)
fit.SVML <- train(Overshooting ~ Total.Tx.Height +
                    Antenna.Tilt +
                    Antenna.Gain +
                    Ant.Vert.Beamwidth +
                    RTWP +
                    Voice.Drops +
                    Range +
                    Max.Distance +
                    Rural +
                    Suburban +
                    Urban +
                    HSUPA.Throughput +
                    Max.HSDPA.Users +
                    HS.DSCH.throughput +
                    Max.HSUPA.Users +
                    Avg.CQI,
                  data = training.data.raw,
                  method = 'svmLinear',
                  preProcess = c('center','scale'),
                  trControl = train_Control,
                  tuneLength = 5,
                  metric = metric)
set.seed(123)
fit.SVMKernel <- train(Overshooting ~ Total.Tx.Height +
                         Antenna.Tilt +
                         Antenna.Gain +
                         Ant.Vert.Beamwidth +
                         RTWP +
                         Voice.Drops +
                         Range +
                         Max.Distance +
                         Rural +
                         Suburban +
                         Urban +
                         HSUPA.Throughput +
                         Max.HSDPA.Users +
                         HS.DSCH.throughput +
                         Max.HSUPA.Users +
                         Avg.CQI,
                       data = training.data.raw,
                       method = 'rbf',
                       preProcess = c('center','scale'),
                       trControl = train_Control,
                       tuneLength = 5,
                       metric = metric)#,
                       #summaryFunction=twoClassSummary)
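Since both fits are tuned on Accuracy while the stated goal is to compare them by F-score, the F-score can be computed afterwards from predicted and true labels. This is a minimal base-R sketch: f1_score is a hypothetical helper, and the truth and pred vectors are invented toy values rather than output from either model.

```r
# Hypothetical helper: F1 score for the class treated as "positive".
f1_score <- function(truth, pred, positive) {
  tp <- sum(pred == positive & truth == positive)  # true positives
  fp <- sum(pred == positive & truth != positive)  # false positives
  fn <- sum(pred != positive & truth == positive)  # false negatives
  if (tp == 0) return(0)                           # avoid dividing by zero
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  2 * precision * recall / (precision + recall)
}

# Toy labels standing in for held-out truth and model predictions.
truth <- factor(c("1", "1", "1", "0", "0", "0", "1", "0"), levels = c("0", "1"))
pred  <- factor(c("1", "0", "1", "0", "1", "0", "1", "0"), levels = c("0", "1"))

f1 <- f1_score(truth, pred, positive = "1")
f1
```

With real models, pred would come from something like predict() on a held-out set for each fit, and the two resulting F-scores could then be compared directly.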
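Answer 0 (score: 0)
It turned out that the method I was passing to train was not a valid model name. I changed it to one that I knew worked, and I am back in business.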
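For anyone hitting the same problem, here is a rough sketch of how the method name could be checked before training. The answer does not say which name was used in the end; 'svmRadial' (caret's kernlab-backed SVM with an RBF kernel) is my assumption. The toy data frame and its column names are made up so the example runs on its own, and it assumes caret and kernlab are installed.

```r
library(caret)  # assumes caret and kernlab are installed

# modelLookup() lists the tuning parameters of a method registered in caret,
# which is a quick way to confirm that a method name is what you expect.
linear_info <- modelLookup("svmLinear")
radial_info <- modelLookup("svmRadial")
radial_info

# Toy two-class data with a circular (non-linear) boundary; hypothetical,
# standing in for training.data.raw.
set.seed(123)
toy <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
toy$y <- factor(ifelse(toy$x1^2 + toy$x2^2 > 1.5, "pos", "neg"))

ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 2)

set.seed(123)
fit.SVMKernel <- train(y ~ x1 + x2,
                       data = toy,
                       method = "svmRadial",  # assumed replacement for 'rbf'
                       preProcess = c("center", "scale"),
                       trControl = ctrl,
                       tuneLength = 3,
                       metric = "Accuracy")
fit.SVMKernel$method
```

The same call with method = "svmLinear" gives the linear counterpart, so the two fits can be compared on identical resamples when the seed is set before each call.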