Decision tree: inversely proportional prediction probabilities in Python

Time: 2018-12-26 19:29:35

Tags: python machine-learning scikit-learn classification decision-tree

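I want to create prediction probabilities that are inversely proportional to the probability a decision tree predicts for each class, similar to the formula in section 4.1, page 9, here. How can I do this based on my code: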

import numpy as np
import pandas as pd
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection instead
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn import tree

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data"
c = pd.read_csv(url, header=None)
X = c.values[:, 1:8]  # columns 1-7 as features
Y = c.values[:, 0]    # column 0 as the class label
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=100)

clf_entropy = DecisionTreeClassifier(criterion="entropy", random_state=100,
                                     max_depth=3, min_samples_leaf=5)
clf_entropy.fit(X_train, y_train)
# One row per test sample, one column per class (ordered as clf_entropy.classes_)
probs = clf_entropy.predict_proba(X_test)
probs

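The goal is to replace zero probabilities with a small non-zero value and normalize the probabilities so that they form a distribution. A label is then selected such that the probability of selection is inversely proportional to the current tree's predictions.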
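One way to write the procedure described above is sketched below. The symbols are notation assumed here, not taken from the linked formula: $p_k$ is the tree's predicted probability for class $k$, $\varepsilon$ is the small replacement value, $\tilde{p}_k$ is the smoothed and normalized probability, and $q_k$ is the probability of selecting label $k$.

```latex
s_k =
\begin{cases}
  p_k         & \text{if } p_k > 0 \\
  \varepsilon & \text{if } p_k = 0
\end{cases}
\qquad
\tilde{p}_k = \frac{s_k}{\sum_j s_j}
\qquad
q_k = \frac{1/\tilde{p}_k}{\sum_j 1/\tilde{p}_j}
    = \frac{1/s_k}{\sum_j 1/s_j}
```

Under this reading the normalizing constant $\sum_j s_j$ cancels in $q_k$, so normalizing before inverting does not change the final selection probabilities.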

1 Answer:

Answer 0: (score: 1)

The formula above can be implemented with the following snippet.

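import numpy as np

def inverse_prob(model_probs, eps=1e-5):
    # Work on a float copy so the caller's array is not modified in place
    probs = np.array(model_probs, dtype=float)
    probs[probs == 0] = eps
    inverse = 1.0 / probs
    # Normalize over the class axis so each sample's probabilities sum to 1
    return inverse / inverse.sum(axis=-1, keepdims=True)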

Any zero in the given probability distribution is replaced with a small value, 1e-5, so that its reciprocal is defined.
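The question also asks for a label to be selected with probability inversely proportional to the tree's prediction. A minimal sketch of that step follows. It restates `inverse_prob` with per-row normalization, which is an assumption about the intended behavior for the 2D output of `predict_proba`. The helper name `sample_inverse_labels` and the toy probabilities are hypothetical, chosen only for illustration.

```python
import numpy as np

def inverse_prob(model_probs, eps=1e-5):
    # Float copy, zeros replaced by eps, then inverted and normalized per row
    probs = np.atleast_2d(np.array(model_probs, dtype=float))
    probs[probs == 0] = eps
    inverse = 1.0 / probs
    return inverse / inverse.sum(axis=1, keepdims=True)

def sample_inverse_labels(model_probs, classes, rng=None):
    # Hypothetical helper: draw one label per sample from the inverted distribution
    rng = np.random.default_rng() if rng is None else rng
    inv = inverse_prob(model_probs)
    idx = [rng.choice(len(classes), p=row) for row in inv]
    return np.asarray(classes)[idx]

# Toy probabilities standing in for clf_entropy.predict_proba(X_test)
probs = np.array([[0.5, 0.5, 0.0],
                  [0.2, 0.3, 0.5]])
classes = np.array(["F", "I", "M"])

inv = inverse_prob(probs)
labels = sample_inverse_labels(probs, classes, rng=np.random.default_rng(0))
print(inv)
print(labels)
```

With the fitted model from the question, the call would presumably be `sample_inverse_labels(clf_entropy.predict_proba(X_test), clf_entropy.classes_)`. Note that a class with zero predicted probability ends up with almost all of the inverted mass, so the choice of `eps` strongly affects how often such classes are drawn.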