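I currently have a list of 35 features that we use to predict an outcome. I understand the nature of the deprecation warning, and my understanding of the fix is to reshape the NumPy array to account for each feature: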
X = X.reshape(len(FEATURES), -1)
However, I'm not sure whether this belongs in the Build_Data_Set function or in the Analysis function, before the for loop.
import numpy as np
import pandas as pd
from sklearn import preprocessing, svm

FEATURES = [...]  # list of 35 columns for the dataframe below


def Build_Data_Set():
    # path and key_stats_csv are not shown here
    data_df = pd.read_csv('{}{}'.format(path, key_stats_csv))
    # data_df[FEATURES] selects the 35 feature columns; .values converts them to a plain array
    X = np.array(data_df[FEATURES].values)
    y = (data_df['status'].replace('underperform', 0).replace('outperform', 1)
         .values.tolist())
    X = preprocessing.scale(X)
    return X, y


def Analysis():
    test_size = 1000
    X, y = Build_Data_Set()
    print(len(X))
    clf = svm.SVC(kernel='linear', C=1.0)
    clf.fit(X[:test_size], y[:test_size])
    correct_count = 0
    for x in range(1, test_size + 1):
        # X[-x] is a single row, i.e. a 1-D array of 35 values
        if clf.predict(X[-x])[0] == y[-x]:
            correct_count += 1
    print('Accuracy: {}'.format((correct_count / test_size) * 100))


Analysis()
Error:
DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17
and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or
X.reshape(1, -1) if it contains a single sample.
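For reference, here is a minimal sketch of what the reshapes in question do to the array shapes. It uses made-up data (4 samples, 35 features) rather than the real dataframe, and the variable names are only illustrative. It assumes the warning comes from the predict call receiving a single row, since scikit-learn estimators expect 2-D input of shape (n_samples, n_features).

```python
import numpy as np

n_features = 35  # stands in for len(FEATURES)

# Made-up stand-in for the scaled feature matrix: 4 samples, 35 features each.
X = np.arange(4 * n_features, dtype=float).reshape(4, n_features)

row = X[-1]                          # 1-D, shape (35,): a single row, as in the loop
sample = row.reshape(1, -1)          # 2-D, shape (1, 35): one sample with 35 features
wrong = row.reshape(n_features, -1)  # 2-D, shape (35, 1): 35 samples with 1 feature each

# Reshaping the whole matrix this way does not fix anything either: it only
# regroups the same values into 35 rows of 4, mixing features across samples.
scrambled = X.reshape(n_features, -1)  # shape (35, 4)

print(row.shape, sample.shape, wrong.shape, scrambled.shape)
```

Under that assumption, the reshape would go at the predict call inside Analysis, for example clf.predict(X[-x].reshape(1, -1)), rather than in Build_Data_Set, where X is already 2-D.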