I have the following code, which uses scikit-learn's decision tree classifier:
import numpy as np
import pandas as pd
from sklearn import tree
# #---------------------------------------------------------------------------------------------------
with open('data/training.csv', 'r') as f:
    df = pd.read_csv(f, index_col=None)
Subset = df.iloc[:, 32:33] # Just the labels
df['Num_Labels'] = df.Label.map(lambda x: '-1' if x == 's' else '1') # Convert labels to '-1' or '1'.
Z = df.iloc[:, 32:34] # the letter labels & numerical labels
Train_values = df.iloc[:, 1:31].values
Train_labels = df.iloc[:, 33:34].values
with open('data/test.csv', 'r') as f2:
    df2 = pd.read_csv(f2, index_col=None)
Test_values = df2.iloc[:, 1:31].values
# #----------------------------------------------------------------------------------------------
X = Train_values
Y = Train_labels.astype(np.float)
print X.dtype
print Y.dtype
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)
Pred = clf.predict(Test_values)
print Pred.dtype
Out = Pred.astype(np.float)
np.savetxt('Output_Numerical.csv', Out, delimiter=' ')
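So far, the code works as expected. Afterwards, however, I want to convert the labels back to the original character values, 's' and 'h'. I wrote the following: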
Out2 = Pred.astype(str) # Initialize
print "Out2's type is:"
print Out2.dtype
for i in range(0, len(Out)):
    if Out[i] == -1:
        Out2[i] == 's'
    else:
        Out2[i] == 'h'
print Out2
But it does not change the values in Out2.
Answer 0 (score: 3)
It's quite simple, even if the error is not where you think it is:
for i in range(0, len(Out)):
    if Out[i] == -1:
        Out2[i] == 's'
    else:
        Out2[i] == 'h'
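In the last two statements, use a single = instead of ==! What happens now is that the statement Out2[i] == 's' is a comparison that evaluates to False, and nothing uses that result. It is therefore not an illegal construct, and the interpreter has no reason to complain about it.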
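To make the difference concrete, here is a minimal sketch that runs the loop both ways. It uses plain Python lists and a made-up `pred` list as a stand-in for the classifier output (these names and values are assumptions for illustration, not the asker's data). The same reasoning should apply to element assignment on a NumPy string array such as Out2.

```python
# Hypothetical stand-in for the predictions: -1.0 means 's', 1.0 means 'h'.
pred = [-1.0, 1.0, 1.0, -1.0]

# Buggy version: `==` only compares, and the resulting bool is discarded.
buggy = [str(p) for p in pred]
for i in range(len(pred)):
    if pred[i] == -1:
        buggy[i] == 's'   # comparison only, the element is left unchanged
    else:
        buggy[i] == 'h'   # comparison only, the element is left unchanged

# Fixed version: `=` assigns the new value to the element.
fixed = [str(p) for p in pred]
for i in range(len(pred)):
    if pred[i] == -1:
        fixed[i] = 's'
    else:
        fixed[i] = 'h'

print(buggy)  # still the stringified numbers
print(fixed)  # the letter labels
```

With the comparison operator the list keeps its stringified numbers, while the assignment version ends up holding the letter labels.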