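I am looking for a way to convert a decision tree trained with scikit-learn (sklearn) into a decision table.
I would like to know how to parse the decision tree structure to find the decision made at each step.
I would then like ideas on how to construct the table from it.
Do you know of a method, or do you have an idea?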
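Answer 0 (score: 0)
Here is sample code that converts a decision tree into "python" code. You can easily adapt it to produce a table.
All you need to do is create a global variable holding a table whose size is the number of leaves times the number of features (or feature categories), and fill it in recursively.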
from sklearn.tree import _tree

def tree_to_code(tree, feature_names, classes_names):
    tree_ = tree.tree_
    # Name of the feature tested at each node ("undefined!" for leaves)
    feature_name = [
        feature_names[i] if i != _tree.TREE_UNDEFINED else "undefined!"
        for i in tree_.feature
    ]
    print("def tree(" + ", ".join(feature_names) + "):")

    def recurse(node, depth):
        indent = "    " * depth
        if tree_.feature[node] != _tree.TREE_UNDEFINED:
            # Internal node: print the test, then both branches
            name = feature_name[node]
            threshold = tree_.threshold[node]
            print(indent + "if " + name + " <= " + str(threshold) + ":")
            recurse(tree_.children_left[node], depth + 1)
            print(indent + "else:  # if " + name + " > " + str(threshold))
            recurse(tree_.children_right[node], depth + 1)
        else:
            # Leaf: report the impurity and the majority class
            impurity = tree_.impurity[node]
            # The original helper cast_value_to_dico was not shown; this inline
            # version is an assumption about what it did. Note that recent
            # scikit-learn versions may store class proportions, not counts.
            dico = dict(zip(classes_names, tree_.value[node][0]))
            label = max(dico, key=dico.get)
            print(indent + "# impurity=" + str(impurity) + " count_max=" + str(dico[label]))
            print(indent + "return " + str(label))

    recurse(0, 1)
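The leaves-by-features table described above could be filled in roughly as follows. This is only a sketch under some assumptions: `leaf_table` and `demo` are made-up names, and the `demo` arrays are hand-written stand-ins for the `children_left`, `children_right`, `feature`, `threshold` and `value` attributes of a fitted `tree_`, so that the example runs without training anything. With a real model you would presumably pass `clf.tree_` instead.

```python
import math
from types import SimpleNamespace

TREE_LEAF = -1  # sklearn stores -1 in children_left/children_right for leaves


def leaf_table(tree_, feature_names):
    """Return one dict per leaf: a (low, high] interval per feature and the class index."""
    rows = []

    def recurse(node, bounds):
        left = tree_.children_left[node]
        right = tree_.children_right[node]
        if left == TREE_LEAF:
            # Leaf: write out the accumulated interval for every feature
            row = {"leaf": int(node)}
            for name, (low, high) in zip(feature_names, bounds):
                row[name + " >"] = low
                row[name + " <="] = high
            counts = list(tree_.value[node][0])
            row["class_index"] = counts.index(max(counts))
            rows.append(row)
            return
        f = int(tree_.feature[node])
        thr = float(tree_.threshold[node])
        low, high = bounds[f]
        # Left branch means feature <= threshold, so tighten the upper bound
        left_bounds = list(bounds)
        left_bounds[f] = (low, min(high, thr))
        recurse(left, left_bounds)
        # Right branch means feature > threshold, so tighten the lower bound
        right_bounds = list(bounds)
        right_bounds[f] = (max(low, thr), high)
        recurse(right, right_bounds)

    recurse(0, [(-math.inf, math.inf)] * len(feature_names))
    return rows


# Hand-written arrays shaped like a depth-2 tree on four features (illustrative only)
demo = SimpleNamespace(
    children_left=[1, -1, 3, -1, -1],
    children_right=[2, -1, 4, -1, -1],
    feature=[3, -2, 3, -2, -2],
    threshold=[0.8, -2.0, 1.75, -2.0, -2.0],
    value=[[[50, 50, 50]], [[50, 0, 0]], [[0, 50, 50]], [[0, 49, 5]], [[0, 1, 45]]],
)
names = ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]
table = leaf_table(demo, names)
for row in table:
    print(row)
```

Each row carries two columns per feature (exclusive lower bound, inclusive upper bound), so features that never appear on the path simply keep their infinite bounds. The list of dicts can be handed to `pandas.DataFrame(table)` if a DataFrame is preferred.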
Answer 1 (score: 0)
Building on the other answer here. The following traverses the tree in the same way, but produces a pandas DataFrame as output.
import pandas as pd
from sklearn.tree import _tree

def tree_to_df(reg_tree, feature_names):
    tree_ = reg_tree.tree_
    feature_name = [
        feature_names[i] if i != _tree.TREE_UNDEFINED else "undefined!"
        for i in tree_.feature
    ]

    def recurse(node, path, rules, vals):
        if tree_.feature[node] != _tree.TREE_UNDEFINED:
            name = feature_name[node]
            threshold = tree_.threshold[node]
            # Add rule to a copy of the path and search left branch
            left_path = path + [name + " <= " + str(threshold)]
            recurse(tree_.children_left[node], left_path, rules, vals)
            # Add rule to a copy of the path and search right branch
            right_path = path + [name + " > " + str(threshold)]
            recurse(tree_.children_right[node], right_path, rules, vals)
        else:
            # Leaf: store the full root-to-leaf path as one row, plus its output
            label = tree_.value[node]
            rules.append(path)
            vals.append("return " + str(label[0][0]))

    # Initialize
    rules = []
    vals = []
    # Call recursive function with initial values
    recurse(0, [], rules, vals)
    # Convert to table and output
    df = pd.DataFrame(rules)
    df['Return'] = vals
    return df
Answer 2 (score: -1)
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_text
iris = load_iris()
X = iris['data']
y = iris['target']
decision_tree = DecisionTreeClassifier(random_state=0, max_depth=2)
decision_tree = decision_tree.fit(X, y)
r = export_text(decision_tree, feature_names=iris['feature_names'])
print(r)
listt = [r]
print(listt)
#########OUTPUT##########################
|--- petal width (cm) <= 0.80
|   |--- class: 0
|--- petal width (cm) >  0.80
|   |--- petal width (cm) <= 1.75
|   |   |--- class: 1
|   |--- petal width (cm) >  1.75
|   |   |--- class: 2
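The text report above is not yet a table, but it can be parsed into one. The following is a rough sketch rather than an official API: `export_text_to_rows` is a made-up helper, and it assumes the default `spacing=3` layout of `export_text` (each level adds four characters of indentation) and that leaves start with `class:` or `value:`. Truncated-branch lines, which appear when the report is cut off with `max_depth`, are not handled.

```python
def export_text_to_rows(report):
    """Turn export_text output into one (conditions, outcome) pair per leaf."""
    rows = []
    path = []  # conditions from the root down to the current line
    for line in report.splitlines():
        marker = line.find("|---")
        if marker < 0:
            continue  # skip blank or unexpected lines
        depth = marker // 4  # assumes "|   " (4 characters) per level
        content = line[marker + 4:].strip()
        del path[depth:]  # forget conditions from deeper or sibling branches
        if content.startswith(("class:", "value:")):
            rows.append((list(path), content))
        else:
            # Collapse the double space that follows ">" in the report
            path.append(" ".join(content.split()))
    return rows


# Report text copied from the output shown above
sample = """\
|--- petal width (cm) <= 0.80
|   |--- class: 0
|--- petal width (cm) >  0.80
|   |--- petal width (cm) <= 1.75
|   |   |--- class: 1
|   |--- petal width (cm) >  1.75
|   |   |--- class: 2
"""

rows = export_text_to_rows(sample)
for conditions, outcome in rows:
    print(" AND ".join(conditions), "=>", outcome)
```

Each row lists every condition on the path to a leaf, so the result reads like a decision table with one line per rule. Parsing the printed report is more fragile than walking `tree_` directly, so treat this as a convenience only.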