Question

I am running Python 3.5.2 on a MacBook with OS X 10.12.1 (Sierra).

While trying to run some code for the Titanic dataset from Kaggle, I keep getting the following error:

NotFittedError                            Traceback (most recent call last)
 in ()
      6
      7 # Make your prediction using the test set and print them.
----> 8 my_prediction = my_tree_one.predict(test_features)
      9 print(my_prediction)
     10

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/tree/tree.py in predict(self, X, check_input)
    429         """
    430
--> 431         X = self._validate_X_predict(X, check_input)
    432         proba = self.tree_.predict(X)
    433         n_samples = X.shape[0]

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/tree/tree.py in _validate_X_predict(self, X, check_input)
    386         """Validate X whenever one tries to predict, apply, predict_proba"""
    387         if self.tree_ is None:
--> 388             raise NotFittedError("Estimator not fitted, "
    389                                  "call `fit` before exploiting the model.")
    390

NotFittedError: Estimator not fitted, call `fit` before exploiting the model.

The offending code appears to be this:

# Impute the missing value with the median
test.loc[152, "Fare"] = test["Fare"].median()

# Extract the features from the test set: Pclass, Sex, Age, and Fare.
test_features = test[["Pclass", "Sex", "Age", "Fare"]].values

# Make your prediction using the test set and print them.
my_prediction = my_tree_one.predict(test_features)
print(my_prediction)

# Create a data frame with two columns: PassengerId & Survived. Survived contains your predictions
PassengerId = np.array(test["PassengerId"]).astype(int)
my_solution = pd.DataFrame(my_prediction, PassengerId, columns = ["Survived"])
print(my_solution)

# Check that your data frame has 418 entries
print(my_solution.shape)

# Write your solution to a csv file with the name my_solution.csv
my_solution.to_csv("my_solution_one.csv", index_label = ["PassengerId"])

Here is a link to the rest of the code.

Since I have already called the 'fit' function, I cannot make sense of this error message. Where am I going wrong? Thanks for your time.

Edit: It turns out the problem is inherited from the previous block of code.

# Fit your first decision tree: my_tree_one
my_tree_one = tree.DecisionTreeClassifier()
my_tree_one = my_tree_one.fit(features_one, target)

# Look at the importance and score of the included features
print(my_tree_one.feature_importances_)
print(my_tree_one.score(features_one, target))

This line: my_tree_one = my_tree_one.fit(features_one, target)

generates the error:

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

Answer 1

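The error is self-explanatory: either the features_one or the target array contains NaNs or infinite values, so the call to fit raises before the estimator is ever fitted. Because my_tree_one is still the unfitted estimator, you cannot use it for prediction later, which is why predict raises NotFittedError.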
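
To see how the two errors connect, here is a minimal sketch on made-up data (the arrays and the name clf are hypothetical, not taken from the Titanic set). It uses an infinite value rather than NaN, on the assumption that your scikit-learn version may be a newer one in which decision trees accept NaN directly; infinity is still rejected when fitting.

```python
import numpy as np
from sklearn import tree
from sklearn.exceptions import NotFittedError

# Hypothetical toy data; one feature value is infinite.
features = np.array([[3.0, 22.0], [1.0, np.inf], [2.0, 35.0], [3.0, 4.0]])
target = np.array([0, 1, 1, 0])

clf = tree.DecisionTreeClassifier()

# Step 1: fit fails during input validation, so no tree is ever built.
try:
    clf.fit(features, target)
    fit_error = None
except ValueError as exc:
    fit_error = exc

# Step 2: clf still refers to the unfitted estimator created above,
# so predict refuses to run.
try:
    clf.predict(np.array([[2.0, 30.0]]))
    predict_error = None
except NotFittedError as exc:
    predict_error = exc

print(type(fit_error).__name__, "raised by fit")
print(type(predict_error).__name__, "raised by predict")
```

In a notebook the same thing happens across cells: the cell containing fit stops with an error, the variable keeps pointing at the unfitted estimator, and a later cell calling predict then fails with NotFittedError.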

Check these arrays and handle the NaN values accordingly before fitting.
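
As a rough sketch of that check, assuming the features come from a pandas DataFrame with the Pclass, Sex, Age and Fare columns used in the question: the small train frame below is invented for illustration, and filling Age with its median is only one possible way to treat the missing values.

```python
import numpy as np
import pandas as pd
from sklearn import tree

# Hypothetical stand-in for the Titanic training data (values are made up).
train = pd.DataFrame({
    "Pclass": [3, 1, 3, 2, 1],
    "Sex": [0, 1, 1, 0, 1],
    "Age": [22.0, np.nan, 26.0, np.nan, 35.0],
    "Fare": [7.25, 71.28, 7.93, 13.0, 53.1],
    "Survived": [0, 1, 1, 0, 1],
})
feature_cols = ["Pclass", "Sex", "Age", "Fare"]

# Count the missing values in each feature column.
nan_counts = train[feature_cols].isnull().sum()
print(nan_counts)

# One option among several: replace missing ages with the median age.
train["Age"] = train["Age"].fillna(train["Age"].median())

features_one = train[feature_cols].values
target = train["Survived"].values

# Stop early with a clear message if anything non-finite is left.
if not np.isfinite(features_one).all():
    raise ValueError("features_one still contains NaN or infinite values")

my_tree_one = tree.DecisionTreeClassifier()
my_tree_one = my_tree_one.fit(features_one, target)
print(my_tree_one.score(features_one, target))
```

Whatever treatment you choose for the training set has to be applied to the test set as well, otherwise predict can fail on the same kind of input.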
