Question

I am standardizing my dataset:

def standardization(new_df2, labelcol):
    from sklearn.preprocessing import StandardScaler
    labels = new_df2[labelcol]
    del new_df2[labelcol]
    scaled_features = StandardScaler().fit_transform(new_df2.values)
    new_df3 = pd.DataFrame(scaled_features, index = new_df2, columns = 
       new_df2.columns)
    new_df3[labelcol] = labels

    return new_df3

labelcol = new_df2.population     #population is one of the columns in dataframe
new_df3 = standardization(new_df2, labelcol)
print(new_df3)

I get the following error:

KeyError: '[  322.  2401.   496. ...,  1007.   741.  1387.] not in index'

As far as I can tell, 322, 2401, ... are the values in my population column.

Please help me resolve this error. What does it mean?

P.S.: new_df2.shape = (20640, 14) and labelcol.shape = (20640,)

Answer 1

The following code solved my problem:

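import pandas as pd
from sklearn.preprocessing import StandardScaler

def standardization(new_df2, labelcol):
    # labelcol is the column name (e.g. 'population'), not the column's values
    dflabel = new_df2[[labelcol]]
    std_df = new_df2.drop(columns=labelcol)
    scaled_features = StandardScaler().fit_transform(std_df.values)
    # Reuse the original index so the concat below aligns row for row
    new_df3 = pd.DataFrame(scaled_features, index=std_df.index, columns=std_df.columns)
    new_df3 = pd.concat([dflabel, new_df3], axis=1)

    return new_df3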
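
As for what the error means, here is a minimal sketch, assuming a small made-up frame (the `rooms` column and the index values are hypothetical; only `population` comes from the question). When `labelcol` is the column itself rather than its name, `new_df2[labelcol]` treats every value in it as a column label to look up. None of 322., 2401., ... is a column, hence the KeyError. The exact message text varies between pandas versions.

```python
import pandas as pd

# Hypothetical toy frame standing in for new_df2.
df = pd.DataFrame(
    {"population": [322.0, 2401.0, 496.0], "rooms": [880.0, 7099.0, 1467.0]},
    index=[10, 11, 12],
)

# Passing the column itself (a Series): pandas looks up each value
# as if it were a column label, which fails.
label_series = df.population
try:
    df[label_series]
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)

# Passing the column *name* selects the column as intended.
label_name = "population"
labels = df[label_name]
features = df.drop(columns=label_name)
```

So the call should presumably pass the name, as in `standardization(new_df2, 'population')`. The question's code also passes `index = new_df2`, where `new_df2.index` was probably intended.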

Thanks to those who tried to help.
