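I am trying to export a decision tree as an image that shows the original labels of all the categorical fields.
My current data looks like this:
I converted the categorical features to numbers: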
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('data.csv')
X = dataset.iloc[:, 0:4]
y = dataset.iloc[:, 4]
from sklearn.preprocessing import LabelEncoder
lb = LabelEncoder()
X['Outlook'] = lb.fit_transform(X['Outlook'])
X['Temp'] = lb.fit_transform(X['Temp'])
X['Humidity'] = lb.fit_transform(X['Humidity'])
X['Windy'] = lb.fit_transform(X['Windy'])
y = lb.fit_transform(y)
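A note on the encoding step above: because the same `lb` object is refit on every column, it only remembers the last mapping it learned (the one for `y`), so the earlier columns can no longer be decoded with it. A minimal sketch of keeping one encoder per column instead is shown below. The toy values and the names `encoders`, `X_encoded`, `outlook_labels`, and `restored` are assumptions for illustration; only the column names come from the code above.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for data.csv; only the column names come from the question.
X = pd.DataFrame({
    "Outlook": ["sunny", "rainy", "overcast", "sunny"],
    "Windy": ["no", "yes", "no", "yes"],
})

# Fit one encoder per column so that every fitted mapping is kept.
encoders = {col: LabelEncoder().fit(X[col]) for col in X.columns}

X_encoded = X.copy()
for col, enc in encoders.items():
    X_encoded[col] = enc.transform(X[col])

# classes_[i] is the original label that was encoded as the integer i.
outlook_labels = list(encoders["Outlook"].classes_)

# Decoding works per column because each encoder still holds its own mapping.
restored = encoders["Outlook"].inverse_transform(X_encoded["Outlook"])
```

With the mappings kept this way, a threshold such as `Outlook <= 0.5` in the tree can at least be read back by hand against `outlook_labels`.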
Then I fit a DecisionTreeClassifier:
from sklearn.tree import DecisionTreeClassifier
dtc = DecisionTreeClassifier(criterion="entropy")
dtc.fit(X, y)
Finally, I need to inspect the tree generated by the model, using the following:
# Import tools needed for visualization
from sklearn.tree import export_graphviz
import pydot
# Export the fitted tree to a dot file
export_graphviz(dtc, out_file = 'tree.dot', feature_names = X.columns, rounded = True, precision = 1)
# Use dot file to create a graph
(graph, ) = pydot.graph_from_dot_file('tree.dot')
# Write graph to a png file
graph.write_png('tree.png')
tree.png: [image of the exported tree]
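But what I really need is to see the original labels of each feature inside the nodes or on each branch, rather than true or false or the numeric representation.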
I tried the following:
y=lb.inverse_transform(y)
I did the same for the features in X, but the generated tree is identical to the one above.
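The tree looks the same because `export_graphviz` draws the fitted model, not the arrays, so decoding `X` or `y` after fitting does not change the picture. One possible approach, sketched below under assumptions, is to one-hot encode the features with `pd.get_dummies` so that each column name carries the category label, and to pass `class_names` so the nodes show the original target labels. The toy data, the `Play` target column, and the variable names are hypothetical. The splits will still appear as thresholds such as `Outlook_sunny <= 0.5`, which may or may not be readable enough for your purpose.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Hypothetical toy data; the values and the "Play" target name are assumed.
data = pd.DataFrame({
    "Outlook": ["sunny", "sunny", "overcast", "rainy", "rainy", "overcast"],
    "Windy": ["no", "yes", "no", "no", "yes", "yes"],
    "Play": ["no", "no", "yes", "yes", "no", "yes"],
})

# One-hot encoding yields columns such as "Outlook_sunny" and "Windy_yes",
# so the category label becomes part of the feature name.
X_onehot = pd.get_dummies(data[["Outlook", "Windy"]])
y = data["Play"]  # string labels can be passed to the classifier directly

dtc = DecisionTreeClassifier(criterion="entropy", random_state=0)
dtc.fit(X_onehot, y)

# out_file=None makes export_graphviz return the dot source as a string.
dot_source = export_graphviz(
    dtc,
    out_file=None,
    feature_names=list(X_onehot.columns),
    class_names=[str(c) for c in dtc.classes_],  # same order as dtc.classes_
    rounded=True,
    precision=1,
)
```

To get the PNG, the same call can write to `'tree.dot'` via `out_file` and then go through the pydot step used earlier.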