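I have a VotingClassifier made up of 200 individual SVM classifiers. By default, this classifier uses majority-rule voting. I want to set a custom threshold, so that a classification is made only when 60% or more of the SVM classifiers agree.
If 59% of the SVM classifiers agree on a classification, I do not want the ensemble model to make a classification.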
I don't see a parameter for doing this on the VotingClassifier object, but I assume it must be possible somewhere in scikit-learn. Is there a different ensemble class I should be using?
Answer 0 (score: 1)
Going by the methods listed at the end of the page, the simplest solution is to use the transform method:
def transform(self, X):
    """Return class labels or probabilities for X for each estimator.

    Parameters
    ----------
    X : {array-like, sparse matrix}, shape = [n_samples, n_features]
        Training vectors, where n_samples is the number of samples and
        n_features is the number of features.

    Returns
    -------
    If `voting='soft'` and `flatten_transform=True`:
        array-like = (n_classifiers, n_samples * n_classes)
        otherwise array-like = (n_classifiers, n_samples, n_classes)
        Class probabilities calculated by each classifier.
    If `voting='hard'`:
        array-like = [n_samples, n_classifiers]
        Class labels predicted by each classifier.
    """
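With voting='hard', each row of the result holds the labels predicted by every SVM for one sample. Assuming binary labels 0 and 1, the sum of a row divided by the number of SVMs is the fraction of SVMs that voted for class 1. Write a simple function that computes this ratio, then apply your threshold: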
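def apply_threshold(ratio, threshold=0.6):
    # ratio: fraction of SVMs that voted for class 1 (binary labels 0/1)
    if ratio >= threshold:
        # at least `threshold` of the SVMs voted 1
        return 1
    elif ratio <= 1 - threshold:
        # at least `threshold` of the SVMs voted 0
        return 0
    else:
        # we don't make the prediction
        return -1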
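
To put the pieces together, here is a minimal end-to-end sketch, not a definitive implementation. The helper name `predict_with_agreement`, the `abstain` value of -1, the toy dataset, and the choice of five SVMs instead of 200 are all assumptions made for illustration. It relies on `transform` returning one column of label indices per classifier when `voting='hard'`, counts the votes per sample, and abstains when the winning class falls short of the threshold. Unlike the row-sum approach, counting votes per class should also work with more than two classes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.svm import SVC


def predict_with_agreement(voting_clf, X, threshold=0.6, abstain=-1):
    """Hypothetical helper: predict only when enough estimators agree.

    Assumes a fitted VotingClassifier with voting='hard' and numeric class
    labels that never equal `abstain`.
    """
    # Shape (n_samples, n_classifiers); entries are indices into classes_.
    votes = np.asarray(voting_clf.transform(X)).astype(int)
    n_classes = len(voting_clf.classes_)
    n_estimators = votes.shape[1]

    # Count the votes each class received, one row per sample.
    counts = np.stack([np.bincount(row, minlength=n_classes) for row in votes])
    winners = counts.argmax(axis=1)
    agreement = counts.max(axis=1) / n_estimators

    # Keep the winning label only where agreement reaches the threshold.
    labels = voting_clf.classes_[winners]
    return np.where(agreement >= threshold, labels, abstain)


# Toy data and a small ensemble (five SVMs rather than 200, to keep it fast).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
estimators = [
    (f"svm_{i}", SVC(C=c, kernel="rbf"))
    for i, c in enumerate([0.01, 0.1, 1.0, 10.0, 100.0])
]
clf = VotingClassifier(estimators=estimators, voting="hard").fit(X, y)

preds = predict_with_agreement(clf, X, threshold=0.6)
strict_preds = predict_with_agreement(clf, X, threshold=1.0)
print("abstained at 60%:", int((preds == -1).sum()))
print("abstained at 100%:", int((strict_preds == -1).sum()))
```

Samples marked -1 are the ones on which the ensemble declines to classify; raising `threshold` can only increase their number.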