Python label encoding: decision tree classification

Time: 2021-03-08 06:06:06

Tags: python python-3.x forecasting

I am very new to Python and am trying to run a decision tree model with the following code:

from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import numpy as np
import pandas as pd
import sklearn as skl


data_forecast = pd.read_excel("./Forcast_data_Analytics.xlsx")

x = data_forecast[['Name','Power', 'FirstEventID','AlleventIds']]
y = data_forecast[['Possible_fix','Changes_Required']]

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.8)

classifier = DecisionTreeClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

Sample data:

Name       Power      FirstEventID      AlleventIds         Possible_fix        Changes_Required
India      I3000       10130-1           10130-1, 134-00     yes                 Bug Fix

Can decision tree classification work without label encoding, or do I need to encode my data before passing it to the classifier?

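What is the best way to do this? I want to treat every column as a string and encode it. After classification, I also want to decode the predictions back to the original strings.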

I tried the following encoding approach, but it did not work:

from sklearn.preprocessing import LabelEncoder
vals = np.array(data_forecast)
LabelEncoder = LabelEncoder()
integer_encoded = LabelEncoder.fit_transform(vals)

Error:

Exception has occurred: ValueError
y should be a 1d array, got an array of shape (59, 23) instead.
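
The error arises because LabelEncoder is meant for a single 1-D column of labels, while the whole frame, a 2-D array of shape (59, 23), was passed to it. A minimal sketch of encoding column by column, assuming a small made-up frame (the rows below are hypothetical, not the real spreadsheet) and a hypothetical `encoders` dict that keeps one encoder per column for later decoding:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy frame shaped like the sample data; the values are made up.
df = pd.DataFrame({
    "Name": ["India", "India", "Japan"],
    "Possible_fix": ["yes", "no", "yes"],
})

# One encoder per column, because LabelEncoder accepts only 1-D input.
encoders = {}
encoded = pd.DataFrame(index=df.index)
for col in df.columns:
    encoders[col] = LabelEncoder()
    encoded[col] = encoders[col].fit_transform(df[col].astype(str))

# Decoding with the matching encoder restores the original strings.
decoded_fix = encoders["Possible_fix"].inverse_transform(encoded["Possible_fix"])
print(encoded)
print(list(decoded_fix))
```

Keeping each fitted encoder around is what makes decoding possible; an encoder that is refit on another column forgets the earlier mapping.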

What is the correct way to do this? How do I encode and decode my labels and use them with the classifier?
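
One possible shape for the full round trip, as a sketch only: scikit-learn's decision trees need numeric features, so the string columns are encoded with OrdinalEncoder (which accepts a 2-D feature matrix), each target column gets its own LabelEncoder, and the predictions are decoded with `inverse_transform`. The rows below are hypothetical, the tree is fit and evaluated on the same toy rows purely for illustration, and `AlleventIds` is treated as one opaque string rather than being split into separate event IDs.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical rows using the column names from the sample data.
df = pd.DataFrame({
    "Name": ["India", "India", "Japan", "Japan"],
    "Power": ["I3000", "I2000", "I3000", "I2000"],
    "FirstEventID": ["10130-1", "134-00", "10130-1", "134-00"],
    "AlleventIds": ["10130-1, 134-00", "134-00", "10130-1", "134-00"],
    "Possible_fix": ["yes", "no", "yes", "no"],
    "Changes_Required": ["Bug Fix", "No Change", "Bug Fix", "No Change"],
})
feature_cols = ["Name", "Power", "FirstEventID", "AlleventIds"]
target_cols = ["Possible_fix", "Changes_Required"]

# OrdinalEncoder maps every string column of the feature matrix to numeric codes.
feature_encoder = OrdinalEncoder()
X = feature_encoder.fit_transform(df[feature_cols].astype(str))

# One LabelEncoder per target column so each prediction can be decoded later.
target_encoders = {c: LabelEncoder().fit(df[c].astype(str)) for c in target_cols}
y = np.column_stack(
    [target_encoders[c].transform(df[c].astype(str)) for c in target_cols]
)

# DecisionTreeClassifier supports multi-output targets (one column per label).
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(X, y)
y_pred = classifier.predict(X).astype(int)

# Decode each predicted column back to its original strings.
decoded = pd.DataFrame({
    c: target_encoders[c].inverse_transform(y_pred[:, i])
    for i, c in enumerate(target_cols)
})
print(decoded)
```

With a real train/test split, the encoders would be fit on the training rows only, and categories that appear only in the test rows would need separate handling. Ordinal codes also impose an arbitrary order on the categories; one-hot encoding is a common alternative when that matters.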

0 answers:

No answers