I am standardizing my dataset:
def standardization(new_df2, labelcol):
    from sklearn.preprocessing import StandardScaler
    labels = new_df2[labelcol]
    del new_df2[labelcol]
    scaled_features = StandardScaler().fit_transform(new_df2.values)
    new_df3 = pd.DataFrame(scaled_features, index = new_df2, columns = new_df2.columns)
    new_df3[labelcol] = labels
    return new_df3

labelcol = new_df2.population  # population is one of the columns in the dataframe
new_df3 = standardization(new_df2, labelcol)
print(new_df3)
I get the following error:
KeyError: '[ 322. 2401. 496. ..., 1007. 741. 1387.] not in index'
As far as I can tell, 322, 2401, ... are the values in the population column.
Please help me resolve this error. What does it mean?
P.S.: new_df2.shape = (20640, 14) and labelcol.shape = (20640,)
Answer 0 (score: 3)
The following code solved my problem:
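import pandas as pd
from sklearn.preprocessing import StandardScaler

def standardization(new_df2, labelcol):
    # labelcol is the column name (a string), not the column's values
    dflabel = new_df2[[labelcol]]
    std_df = new_df2.drop(labelcol, axis=1)
    scaled_features = StandardScaler().fit_transform(std_df.values)
    # reuse the original index so that concat aligns the rows correctly
    new_df3 = pd.DataFrame(scaled_features, index=std_df.index, columns=std_df.columns)
    new_df3 = pd.concat([dflabel, new_df3], axis=1)
    return new_df3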
Thanks to those who tried to help.
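For anyone hitting the same KeyError, here is a minimal sketch of what seems to have gone wrong and how the function might be called. It assumes the label column is passed by name (the string "population") rather than as the Series new_df2.population. Indexing a DataFrame with a Series of floats makes pandas look for columns named 322.0, 2401.0 and so on, which is why those values show up in the error message. The toy frame and its rooms and income columns are hypothetical stand-ins for the real (20640, 14) data.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def standardization(df, labelcol):
    """Standardize every column except `labelcol`, which is a column name."""
    labels = df[[labelcol]]
    features = df.drop(columns=labelcol)
    scaled = StandardScaler().fit_transform(features.values)
    # Keep the original index so the label column lines up row by row.
    scaled_df = pd.DataFrame(scaled, index=features.index, columns=features.columns)
    return pd.concat([labels, scaled_df], axis=1)


# Hypothetical toy data; only "population" comes from the question.
toy = pd.DataFrame(
    {
        "population": [322.0, 2401.0, 496.0, 1007.0],
        "rooms": [1.0, 2.0, 3.0, 4.0],
        "income": [10.0, 20.0, 30.0, 40.0],
    }
)

# Passing the column's values instead of its name reproduces the KeyError:
# pandas treats each value as a column label to look up.
try:
    toy[toy.population]
    raised = False
except KeyError:
    raised = True

# Passing the column name works; "population" is left unscaled.
result = standardization(toy, "population")
print(result)
```

The input frame is left unmodified here because drop returns a new frame, unlike the del statement in the question, which removes the column from the caller's DataFrame in place.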