Getting the feature importance in a SVM

Date: 2019-01-07 13:49:12

Tags: r svm mlr

I performed a multiclass (3 classes) classification using an SVM with a linear kernel.

For this task, I used the mlr package; the SVM itself comes from the kernlab package.

library(mlr)
library(kernlab)

print(filtered_task)

Supervised task: dtm
Type: classif
Target: target_lable
Observations: 1462
Features:
   numerics     factors     ordered functionals 
        291           0           0           0 
Missings: FALSE
Has weights: FALSE
Has blocking: FALSE
Has coordinates: FALSE
Classes: 3
negative  neutral positive 
     917      309      236 
Positive class: NA

lrn = makeLearner("classif.ksvm", par.vals = list(kernel = "vanilladot"))
mod = mlr::train(lrn, train_task)

Now I want to know which features have the highest weights for each class. Any idea how to get there?
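For a linear kernel I assume the weight vectors can be reconstructed from the support vectors, since w = Σ αᵢyᵢxᵢ. Here is my attempt (a sketch only; note that kernlab fits one-vs-one binary classifiers for multiclass problems, so `coef()` and `xmatrix()` return one element per class pair, not directly one per class):

```r
# extract the underlying ksvm model from the mlr wrapper
ksvm_mod = getLearnerModel(mod)

# coef() gives alpha*y and xmatrix() the support vectors,
# one list element per one-vs-one sub-problem
w_list = lapply(seq_along(coef(ksvm_mod)), function(i) {
  colSums(coef(ksvm_mod)[[i]] * xmatrix(ksvm_mod)[[i]])
})

# top 10 features by absolute weight for the first class pair
sort(abs(w_list[[1]]), decreasing = TRUE)[1:10]
```

But this gives weights per class *pair*, and I am unsure how to aggregate them per class.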

Moreover, it would be nice to get the feature weights for each class for the cross-validation result.

rdesc = makeResampleDesc("CV",
                         iters = 10,
                         stratify = TRUE)
set.seed(3)
r = resample(lrn, filtered_task, rdesc)
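If I read the mlr documentation correctly, `resample()` can keep the fitted model of each fold via `models = TRUE`, so the per-fold models could be accessed afterwards (a sketch):

```r
# keep the fitted model of each CV fold
r = resample(lrn, filtered_task, rdesc, models = TRUE)

# unwrap the underlying ksvm model for every fold
fold_models = lapply(r$models, getLearnerModel)
```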

I know that the feature importance can be calculated as shown below; because of the Monte Carlo iterations, the result is comparable to the cross-validation result.

imp = generateFeatureImportanceData(task = train_task, 
                                    method = "permutation.importance", 
                                    learner = lrn,
                                    nmc = 10)

However, with this method I can't get the feature importance for each class, only the overall importance.
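As a workaround I considered computing the importance per class by turning the problem into one-vs-rest binary tasks (an untested sketch; the task-accessor functions are standard mlr, but the binarization approach is my own idea):

```r
# per-class permutation importance via one-vs-rest binary tasks
classes = c("negative", "neutral", "positive")

per_class_imp = lapply(classes, function(cl) {
  df = getTaskData(filtered_task)
  # recode the target: this class vs. everything else
  df$target = factor(ifelse(getTaskTargets(filtered_task) == cl, cl, "rest"))
  df[[getTaskTargetNames(filtered_task)]] = NULL
  bin_task = makeClassifTask(id = cl, data = df, target = "target")
  generateFeatureImportanceData(task = bin_task,
                                method = "permutation.importance",
                                learner = lrn,
                                nmc = 10)
})
```

But I am not sure this is equivalent to the per-class weights of the multiclass model, so I would prefer a direct solution.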

library(dplyr)
library(reshape2)
library(ggplot2)

imp_data = melt(imp$res[, 2:ncol(imp$res)]) 

imp_data = imp_data %>% 
  arrange(-value)

imp_data[1:10,] %>% 
  ggplot(aes(x = reorder(variable, value), y = value)) + 
  geom_bar(stat = "identity",  fill = "darkred") + 
  labs(x = "Features", y = "Permutation Importance") +
  coord_flip() +
  theme_minimal()


0 Answers:

No answers yet