Fast correlation-based filter selection

Time: 2018-03-26 19:51:17

Tags: r feature-extraction

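I want to run FCBF (fast correlation-based filter) feature selection on my training set, using the R package Biocomb to select the most relevant and non-redundant features, and then train an SVM on the selected features to check how they affect classifier performance. However, I don't understand the attrs.nominal parameter: do I have to list every nominal variable in my dataset so that they get discretized? Any explanation would be appreciated. I get the following error: Error in [.data.frame(data.validation, , 72): undefined columns selected. A second error reports: cannot open the connection.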
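For illustration, here is a minimal base R sketch of how the column numbers passed as attrs.nominal might be derived instead of being typed by hand. The data frame toy and its column names are hypothetical and are not my real data. My understanding, which should be checked against the Biocomb documentation, is that attrs.nominal lists the columns that are already categorical, while the remaining numeric columns are the ones that get discretized. The final check is there because indexing a data frame with a column number larger than ncol() is exactly what raises "undefined columns selected".

```r
# Hypothetical toy data: two numeric features, one nominal feature,
# and the class label in the last column.
toy <- data.frame(
  len     = c(5.1, 4.9, 6.3, 5.8),
  wid     = c(3.5, 3.0, 3.3, 2.7),
  colour  = factor(c("red", "blue", "red", "blue")),
  Species = factor(c("a", "a", "b", "b"))
)

# Look only at the feature columns (everything except the last column).
class_col  <- ncol(toy)
is_nominal <- vapply(
  toy[-class_col],
  function(x) is.factor(x) || is.character(x),
  logical(1)
)

# Column numbers of the nominal features.
attrs.nominal <- unname(which(is_nominal))

# Every index must refer to an existing column of the data frame.
stopifnot(all(attrs.nominal >= 1), all(attrs.nominal <= ncol(toy)))
print(attrs.nominal)
```

If the dataset has no nominal features at all, which(is_nominal) is simply an empty integer vector.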

#fast correlation based feature selection in r
library(Biocomb)
library(caret)
library(e1071)
# Split data into train and test sets; set the seed first so the split is reproducible
set.seed(10)
trainIndex <- createDataPartition(data$Species, p=0.7, list=FALSE)
data_train <- data[ trainIndex,]
data_test <- data[-trainIndex,]
# Discretization method for the numeric variables
disc <- "MDL"
# A numeric threshold for the correlation of a feature with the class (final subset)
threshold <- 0.2
# A numeric vector containing the column numbers of the nominal features
attrs.nominal <- 72
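# Per the Biocomb documentation, the class labels must be in the last column of the input
out <- select.fast.filter(data_train, disc.method=disc, threshold=threshold, attrs.nominal=attrs.nominal)
# The result is a data frame; its Biomarker column holds the names of the selected features
selected <- as.character(out$Biomarker)
# Train the SVM on the selected features only
svm_model <- svm(Species ~ ., data=data_train[, c(selected, "Species")], cost=0.1, kernel="radial")
# Predict on the test set, using the same selected features
p <- predict(svm_model, data_test[, selected, drop=FALSE])
accuracy <- mean(p == data_test$Species)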
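
The formula step is where a selected-feature vector is easy to misuse, so here is a small base R sketch of building the model formula from a character vector of feature names. It uses the built-in iris data only because it also has a Species column; the vector selected is a hypothetical stand-in for whatever the filter returns, not real output from Biocomb.

```r
# Hypothetical feature names standing in for the filter's output.
selected <- c("Sepal.Length", "Petal.Width")

# Build "Species ~ Sepal.Length + Petal.Width" from the names.
f <- reformulate(selected, response = "Species")
print(f)

# The model frame shows which columns a model fitted with this formula would use.
mf <- model.frame(f, data = iris)
print(names(mf))

# Equivalent alternative: subset the columns and use "Species ~ ." as the formula.
train_subset <- iris[, c(selected, "Species")]
print(names(train_subset))
```

Either form keeps the response column and the selected predictors together, so the same selected vector can then be used to subset the test set before calling predict.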

0 answers:

No answers