What is the difference between the following two decision tree snippets?

Time: 2019-05-20 13:25:27

Tags: decision-tree python-3.7

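The only difference between the two snippets below is that one passes "features" into the show_tree function and the other does not. The two display different decision trees. My question: shouldn't the code that does not pass "features" show the same tree, since the features are already defined through x in dt = c.fit(x_train,y_train), and this dt is what gets passed to show_tree?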

Code 1:

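import io

import matplotlib.pyplot as plt
import pandas as pd
import pydotplus
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

data = pd.read_csv('soundcloud.csv')
print(data)

features = ['danceability','loudness','valence','energy','instrumentalness','acousticness','key','speechiness','duration_ms']

y = data['target']
x = data[features]

# No random_state is set, so the split differs on every run
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size = 0.15)

c = DecisionTreeClassifier(min_samples_split=100)

dt = c.fit(x_train,y_train)

def show_tree(dt,path):
    # Export the fitted tree as DOT text; without feature_names the nodes are labelled by column index
    f = io.StringIO()
    export_graphviz(dt, out_file=f)
    pydotplus.graph_from_dot_data(f.getvalue()).write_png(path)
    # plt.imread stands in for scipy.misc.imread, which has been removed from recent SciPy releases
    img = plt.imread(path)
    plt.rcParams['figure.figsize'] = (20,20)
    plt.imshow(img)

show_tree(dt,'dec_tree_01.png')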

Code 2:

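import io

import matplotlib.pyplot as plt
import pandas as pd
import pydotplus
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

data = pd.read_csv('data.csv')
print(data)

# No random_state is set, so the split differs on every run
train,test = train_test_split(data, test_size = 0.15)

c = DecisionTreeClassifier(min_samples_split=100)

features = ['danceability','loudness','valence','energy','instrumentalness','acousticness','key','speechiness','duration_ms']

x_train = train[features]
y_train = train['target']

x_test = test[features]
y_test = test['target']

dt = c.fit(x_train,y_train)

def show_tree(tree, features, path):
    # Export the fitted tree as DOT text; feature_names labels each split with its column name
    f = io.StringIO()
    export_graphviz(tree, out_file=f, feature_names=features)
    pydotplus.graph_from_dot_data(f.getvalue()).write_png(path)
    # plt.imread stands in for scipy.misc.imread, which has been removed from recent SciPy releases
    img = plt.imread(path)
    plt.rcParams['figure.figsize'] = (20,20)
    plt.imshow(img)

show_tree(dt,features,'dec_tree_01.png')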
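
One way to test the assumption in the question is to export a single fitted tree twice, once with and once without feature_names, and compare the DOT text. The sketch below does that on synthetic data from make_classification, not on the CSV files above; the column names, the seeds, and the helpers fit_tree and plain_to_named are all hypothetical and exist only for this illustration. It also refits with a different split seed, since neither snippet fixes random_state and a different train/test split is a plausible (unconfirmed) reason for the two pictures to differ.

```python
import re

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Synthetic stand-in for the CSV in the question (assumption: illustration only)
X, y = make_classification(n_samples=600, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)
names = ['f_a', 'f_b', 'f_c', 'f_d']  # hypothetical column names


def fit_tree(split_seed):
    # Fix both the split seed and the classifier seed so a run is reproducible
    X_train, _, y_train, _ = train_test_split(
        X, y, test_size=0.15, random_state=split_seed)
    clf = DecisionTreeClassifier(min_samples_split=100, random_state=0)
    return clf.fit(X_train, y_train)


def plain_to_named(dot, names):
    # Swap the generic x[i] placeholders (X[i] in older releases) for column names
    return re.sub(r'[xX]\[(\d+)\]', lambda m: names[int(m.group(1))], dot)


dt = fit_tree(split_seed=0)
dot_plain = export_graphviz(dt, out_file=None)
dot_named = export_graphviz(dt, out_file=None, feature_names=names)

# Same fitted tree exported twice: check whether only the node labels differ
labels_only = plain_to_named(dot_plain, names) == dot_named
print('differs only in labels:', labels_only)

# Refit with the same seeds, then with a different split seed, and compare
same_seed_equal = export_graphviz(fit_tree(0), out_file=None) == dot_plain
other_seed_equal = export_graphviz(fit_tree(1), out_file=None) == dot_plain
print('same split seed, same tree:', same_seed_equal)
print('other split seed, same tree:', other_seed_equal)
```

If the first check prints True, feature_names affects only how nodes are labelled, and the structure comes entirely from the fitted estimator. In that case, passing random_state to both train_test_split and DecisionTreeClassifier in the two snippets would be a reasonable next step before comparing their output.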

0 answers:

No answers