Could not run dot: visualizing a decision tree in Python with graphviz

Time: 2018-01-18 19:17:52

Tags: python-3.x pandas scikit-learn graphviz decision-tree

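I want to build a simple decision tree in Python using scikit-learn.

The approach in the following StackOverflow question does not seem to work for me either: Question

When I run the code, I always end up in the except branch.

Can someone help me understand how to display a decision tree in scikit-learn?

My full code: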

import subprocess

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz

data = pd.read_csv(filepath_or_buffer="DecisionEcolev2.csv")

print("\n==df.head()==\n", data.head())
print("\n==df.tail()==\n", data.tail())
print("\n==Value counts==\n", data["Decision"].value_counts())
print("\n==Data types==\n", data["Decision"].unique())


def encode_target(df, target_column):
    df_mod = df.copy()
    T = df_mod[target_column].unique()
    map_to_int = {name: n for n, name in enumerate(T)}
    df_mod["Target"] = df_mod[target_column].replace(map_to_int)
    return df_mod, T


df2, targets = encode_target(data, "Decision")

print("\n==df2.head()==\n", df2[["Target", "Decision"]].head())
print("\n==df2.tail()==\n", df2[["Target", "Decision"]].tail())
print("\n==targets==\n", targets)

features = list(df2.columns[:4])
print("\n==features==\n", features)

# fitting the decision tree with scikit-learn
y = df2["Decision"]
X = df2[features]
dt = DecisionTreeClassifier(min_samples_split=20, random_state=80)
dt.fit(X, y)


def visualize_tree(tree, feature_names):
    with open("dt.dot", 'w') as f:
        export_graphviz(tree, out_file=f, feature_names=feature_names)
    command = ["dot", "-Tpng", "dt.dot", "-o", "ad.png"]
    try:
        subprocess.check_call(command)
    except:
        exit("Could not run dot, ie graphviz, to produce visualization")


visualize_tree(dt, features)
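
The bare `except:` in `visualize_tree` catches every exception and replaces it with one generic message, so the real cause (most often that the `dot` executable is not on the PATH) stays hidden. Below is a minimal sketch of how the failure could be reported more precisely. The helper name `render_dot` and its `dot_cmd` parameter are hypothetical and not part of the original code; only standard-library calls are used, and Python 3.7 or later is assumed.

```python
import shutil
import subprocess


def render_dot(dot_path, png_path, dot_cmd="dot"):
    """Run Graphviz `dot` and return (ok, message) describing what happened.

    Hypothetical helper for illustration; not part of the original code.
    """
    # shutil.which returns None when the executable cannot be found on PATH.
    if shutil.which(dot_cmd) is None:
        return False, f"'{dot_cmd}' not found on PATH; is Graphviz installed?"

    command = [dot_cmd, "-Tpng", dot_path, "-o", png_path]
    try:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as err:
        # dot was started but failed, e.g. because the .dot file is missing.
        return False, f"dot exited with status {err.returncode}: {err.stderr.strip()}"
    except OSError as err:
        # The executable was found but could not be started.
        return False, f"could not start dot: {err}"
    return True, f"wrote {png_path}"


if __name__ == "__main__":
    ok, message = render_dot("dt.dot", "ad.png")
    print(message)
```

If the message reports that `dot` is not on the PATH, the likely fix is installing the Graphviz binaries themselves; the `graphviz` Python package alone does not provide the `dot` executable.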

Output:

==df.head()==
    Devoir  MamanBonneHumeur  FaitBeau  GouterPris Decision
0       1                 0         1           0      oui
1       0                 1         0           1      oui
2       1                 1         1           0      oui
3       1                 0         1           1      oui
4       0                 1         1           1      non

==df.tail()==
    Devoir  MamanBonneHumeur  FaitBeau  GouterPris Decision
3       1                 0         1           1      oui
4       0                 1         1           1      non
5       0                 1         0           0      non
6       1                 0         0           1      non
7       1                 1         0           0      non

==Value counts==
 oui    4
non    4
Name: Decision, dtype: int64

==Data types==
 ['oui' 'non']

==df2.head()==
    Target Decision
0       0      oui
1       0      oui
2       0      oui
3       0      oui
4       1      non

==df2.tail()==
    Target Decision
3       0      oui
4       1      non
5       1      non
6       1      non
7       1      non

==targets==
 ['oui' 'non']

==features==
 ['Devoir', 'MamanBonneHumeur', 'FaitBeau', 'GouterPris']
Could not run dot, ie graphviz, to produce visualization

Process finished with exit code 1
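
Since the run above stops at the `dot` call, one way to look at the fitted tree without Graphviz is `sklearn.tree.export_text`, available in scikit-learn 0.21 and later. The sketch below rebuilds the eight rows shown in the output as plain lists instead of reading the CSV. It also assumes the default `min_samples_split` rather than the value 20 used in the question, because a node holding only 8 samples is never split when 20 are required, which would leave a tree with a single node.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# The eight rows printed by df.head() and df.tail() in the question.
features = ["Devoir", "MamanBonneHumeur", "FaitBeau", "GouterPris"]
X = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
y = ["oui", "oui", "oui", "oui", "non", "non", "non", "non"]

# Assumption: default min_samples_split (2) instead of the question's 20,
# so that the 8-row training set can actually be split.
dt = DecisionTreeClassifier(random_state=80)
dt.fit(X, y)

# export_text returns the tree as an indented, human-readable string.
tree_text = export_text(dt, feature_names=features)
print(tree_text)
```

This prints the split rules as indented text on the console, which is enough to confirm that the tree was fitted even before the Graphviz problem is resolved.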

0 answers:

No answers