Question

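I currently have a list of 35 features that we use to predict an outcome. I understand the nature of the deprecation warning, and my understanding is that the fix is to reshape the NumPy array to account for each feature: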

X = X.reshape(len(FEATURES), -1)

However, I am not sure whether this belongs in the Build_Data_Set function or in the Analysis function, before the for loop.

import numpy as np
import pandas as pd
from sklearn import preprocessing, svm

FEATURES = [...#list of 35 columns for the below dataframe]

def Build_Data_Set():
    data_df = pd.read_csv('{}{}'.format(path, key_stats_csv))


    # data_df[FEATURES] selects the 35 feature columns; .values converts them to a plain array of values

    X = np.array(data_df[FEATURES].values)
    y = (data_df['status'].replace('underperform', 0).replace('outperform', 1)
                                                     .values.tolist())

    X = preprocessing.scale(X)
    return X, y


def Analysis():

    test_size = 1000

    X, y = Build_Data_Set()

    print(len(X))

    clf = svm.SVC(kernel='linear', C=1.0)
    clf.fit(X[:test_size], y[:test_size])

    correct_count = 0

    for x in range(1, test_size + 1):
        if clf.predict(X[-x])[0] == y[-x]:
            correct_count += 1

    print('Accuracy: {}'.format((correct_count / test_size) * 100))


Analysis()
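
For reference, here is a minimal sketch of where a reshape could go, assuming the only 1-D array in the code above is the single row X[-x] passed to clf.predict inside the loop. The synthetic data, the sizes, and the label rule are made-up stand-ins for the real CSV, not the original data set.

```python
import numpy as np
from sklearn import preprocessing, svm

# Synthetic stand-in for the real CSV (assumed shapes): 200 samples, 35 features.
rng = np.random.default_rng(0)
n_samples, n_features, test_size = 200, 35, 50
X = preprocessing.scale(rng.normal(size=(n_samples, n_features)))
y = (X[:, 0] > 0).astype(int).tolist()  # hypothetical labels: 0 or 1

clf = svm.SVC(kernel='linear', C=1.0)
clf.fit(X[:test_size], y[:test_size])  # X is already 2-D here, no reshape needed

# Option 1: keep the loop, but hand predict a 2-D array of shape (1, 35).
correct_count = 0
for x in range(1, test_size + 1):
    sample = X[-x].reshape(1, -1)  # one sample, 35 features
    if clf.predict(sample)[0] == y[-x]:
        correct_count += 1

# Option 2: predict the whole held-out slice in one call; the slice is already 2-D.
predictions = clf.predict(X[-test_size:])
correct_vectorized = int(np.sum(predictions == np.array(y[-test_size:])))

print('Accuracy: {}'.format((correct_count / test_size) * 100))
```

Under that assumption, Build_Data_Set would not need to change: the array it returns is already (samples, features), and only the per-row call in Analysis passes 1-D data.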

Error:

DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17     
and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or 
X.reshape(1, -1) if it contains a single sample.
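
To make the two cases in the warning concrete, here is a small NumPy-only sketch of the shapes involved. The 4-sample array is invented for illustration; only the feature count of 35 comes from the question.

```python
import numpy as np

n_features = 35  # matches len(FEATURES) in the question
# Hypothetical data: 4 samples, 35 features each.
X = np.arange(4 * n_features, dtype=float).reshape(4, n_features)

row = X[-1]                              # shape (35,): 1-D, the case the warning is about
one_sample = row.reshape(1, -1)          # shape (1, 35): one sample with 35 features
as_column = row.reshape(-1, 1)           # shape (35, 1): 35 samples with one feature each
by_features = X.reshape(n_features, -1)  # shape (35, 4): values from different samples get mixed

print(row.shape, one_sample.shape, as_column.shape, by_features.shape)
```

Note that X.reshape(len(FEATURES), -1) is not a transpose: it refills the same values row by row into a new shape, so each resulting row no longer corresponds to one feature or one sample.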

How to reshape data for a linear SVC

0 answers: