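I believe I have a simple but not trivial question. I have a statistics background and tend to use Stata and R. I'm interested in learning Python. I've been using it for a while now and recently came across scikit-learn.
I'm trying to reproduce a simple example I found through Google.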
#the good old iris data
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target
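# sklearn.model_selection replaces sklearn.cross_validation, which newer scikit-learn releases removed
from sklearn.model_selection import train_test_split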
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = .5)
#let's learn a little
from sklearn import tree
nordan_tree = tree.DecisionTreeClassifier()
clf = nordan_tree.fit(X_train, y_train)
predictions = nordan_tree.predict(X_test)
#accuracy score
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, predictions))
#print tree
from io import StringIO  # sklearn.externals.six is gone from newer scikit-learn releases; StringIO is unused below
with open("iris.dot", 'w') as f:
    f = tree.export_graphviz(clf, out_file=f)
import os
os.unlink('iris.dot')
The code runs without errors, but where can I find my iris.pdf?
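
A note on what the script above actually does: it writes the tree to iris.dot in Graphviz DOT format and then deletes that file with os.unlink. No step in it converts the DOT file to a PDF, so no iris.pdf is ever created. Below is a minimal sketch of that missing conversion step. It assumes the Graphviz `dot` program is installed and on PATH; `render_dot_to_pdf` is a hypothetical helper name, not part of scikit-learn.

```python
import shutil
import subprocess
from pathlib import Path


def render_dot_to_pdf(dot_path, pdf_path):
    """Hypothetical helper: convert a Graphviz .dot file to a PDF.

    Returns the PDF path on success, or None when the `dot`
    executable cannot be found on PATH.
    """
    dot_exe = shutil.which("dot")
    if dot_exe is None:
        # Graphviz is not installed (or not on PATH); nothing to run.
        return None
    # Same as running on the command line: dot -Tpdf iris.dot -o iris.pdf
    subprocess.run(
        [dot_exe, "-Tpdf", str(dot_path), "-o", str(pdf_path)],
        check=True,  # raise CalledProcessError if dot reports a failure
    )
    return Path(pdf_path)
```

In the script above, this would be called as render_dot_to_pdf("iris.dot", "iris.pdf") after the `with` block has closed the file and before os.unlink('iris.dot') removes it.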