I found a tutorial on a decision tree algorithm that uses the PyXLL add-in for Excel and tried to run it. I get this error: KeyError: "['class'] not found in axis".
from pyxll import xl_func
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
import pandas as pd
import os


@xl_func("float, int, int: object")
def ml_get_zoo_tree_2(train_size=0.75, max_depth=5, random_state=245245):
    # Load the zoo data
    dataset = pd.read_csv(os.path.join(os.path.dirname(__file__), "zoo.csv"))

    # Drop the animal names since this is not a good feature to split the data on
    dataset = dataset.drop("animal_name", axis=1)

    # Split the data into a training and a testing set
    features = dataset.drop("class", axis=1)
    targets = dataset["class"]

    train_features, test_features, train_targets, test_targets = \
        train_test_split(features, targets, train_size=train_size, random_state=random_state)

    # Train the model
    tree = DecisionTreeClassifier(criterion="entropy", max_depth=max_depth)
    tree = tree.fit(train_features, train_targets)

    # Add the feature names to the tree for use in predict function
    tree._feature_names = features.columns

    return tree
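If I delete lines 17 and 18 of my file (the code that uses "class"), I get NameError: name 'features' is not defined. If I then also remove the code that uses features, I get the same kind of error saying that targets must be defined.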
Answer 0 (score: 0)
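The tutorial needs the correct dataset. The KeyError means the zoo.csv being loaded has no column named "class", so it is probably not the file the tutorial was written for. You can download the dataset (and the code) from https://github.com/pyxll/pyxll-examples/tree/master/machine-learning.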
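To confirm that the dataset is the problem before changing any code, it may help to look at the CSV header directly. The following is only a minimal sketch using the standard library: the helper `missing_columns` and the sample header (including the `class_type` column) are hypothetical and made up for illustration, and the required names are simply the two columns the function in the question touches.

```python
import csv
import io

# Columns the function in the question expects to find in zoo.csv.
REQUIRED_COLUMNS = ("animal_name", "class")


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    # The first row of the file is treated as the header; an empty file has none.
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical sample standing in for a zoo.csv whose target column
# has a different name than the tutorial expects.
sample = io.StringIO("animal_name,hair,feathers,class_type\naardvark,1,0,1\n")
print(missing_columns(sample))  # ['class']
```

With a real file you would pass an open file object instead, for example `open("zoo.csv", newline="")`. If the result is not empty, either download the tutorial's dataset or rename the column in your own file so that it matches what the code drops.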