I have a problem with StandardScaler.
After encoding the labels I get numeric data, but this error is shown:
Traceback (most recent call last):
  File "c:\Users\barte\.vscode\extensions\ms-python.python-2019.9.34911\pythonFiles\ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "c:\Users\barte\.vscode\extensions\ms-python.python-2019.9.34911\pythonFiles\lib\python\ptvsd\__main__.py", line 432, in main
    run()
  File "c:\Users\barte\.vscode\extensions\ms-python.python-2019.9.34911\pythonFiles\lib\python\ptvsd\__main__.py", line 316, in run_file
    runpy.run_path(target, run_name='__main__')
  File "C:\Users\barte\AppData\Local\Programs\Python\Python36\Lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\barte\AppData\Local\Programs\Python\Python36\Lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\barte\AppData\Local\Programs\Python\Python36\Lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "c:\Users\barte\Desktop\Projects\tf\adullt UCI dataset\model.py", line 93, in <module>
    data[label] = StandardScaler().fit_transform(data[label])
  File "C:\Users\barte\Desktop\Projects\tf\env\lib\site-packages\sklearn\base.py", line 553, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "C:\Users\barte\Desktop\Projects\tf\env\lib\site-packages\sklearn\preprocessing\data.py", line 639, in fit
    return self.partial_fit(X, y)
  File "C:\Users\barte\Desktop\Projects\tf\env\lib\site-packages\sklearn\preprocessing\data.py", line 663, in partial_fit
    force_all_finite='allow-nan')
  File "C:\Users\barte\Desktop\Projects\tf\env\lib\site-packages\sklearn\utils\validation.py", line 496, in check_array
    array = np.asarray(array, dtype=dtype, order=order)
  File "C:\Users\barte\Desktop\Projects\tf\env\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "C:\Users\barte\Desktop\Projects\tf\env\lib\site-packages\pandas\core\series.py", line 948, in __array__
    return np.asarray(self.array, dtype)
  File "C:\Users\barte\Desktop\Projects\tf\env\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "C:\Users\barte\Desktop\Projects\tf\env\lib\site-packages\pandas\core\arrays\numpy_.py", line 166, in __array__
    return np.asarray(self._ndarray, dtype=dtype)
  File "C:\Users\barte\Desktop\Projects\tf\env\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: could not convert string to float: 'Other-service'

This is the code:
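from sklearn.preprocessing import StandardScaler

label = 'occupation'
# mean of the '50' column for each occupation
temp_values = data[[label, '50']].groupby(label).mean()
temp_values = temp_values.to_dict()['50']
print(temp_values)
# replace each occupation string with its group mean
for index, row in enumerate(data[label]):
    data[label][index] = temp_values[row]
# this is the call that raises the ValueError (line 93 of model.py)
data[label] = StandardScaler().fit_transform(data[label])
print(data[label])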
Running just print(data[label]), without the StandardScaler line, gives:
0 0.133835
1 0.48522
2 0.0614815
3 0.0614815
4 0.448489
...
30102 0.124619
30105 0.0410959
30110 0.448489
30156 0.326087
30158 0.124619
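
Two details of this printout may be worth checking. The values are shown with differing numbers of decimals (0.48522 next to 0.0614815), which suggests the column is still object dtype rather than float64, and the index skips values (30102, 30105, 30110), so some rows hidden behind the "..." may never have been overwritten. A minimal sketch of how to look for leftover strings, using a small made-up Series in place of data[label] (the values and index below are invented for illustration):

```python
import pandas as pd

# Hypothetical stand-in for data[label]: an object column whose index has gaps
# and where one entry was never replaced by its numeric encoding.
col = pd.Series(
    [0.133835, 0.48522, "Other-service", 0.448489],
    index=[0, 1, 3, 30102],
    dtype=object,
)

# Coerce to numbers; anything that is not numeric becomes NaN.
as_numbers = pd.to_numeric(col, errors="coerce")

# The entries that turned into NaN are the ones that are still strings.
leftover = col[as_numbers.isna()]

print(col.dtype)          # object
print(leftover.tolist())  # ['Other-service']
```

Applied to the real column, a non-empty result would show which rows the loop did not replace.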
I am using this dataset: https://archive.ics.uci.edu/ml/datasets/Adult
Thanks for any help.
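
One possible explanation, sketched under assumptions: enumerate yields positions 0, 1, 2, ..., while data[label][index] looks rows up by index label. If rows were dropped earlier, positions and labels no longer line up, so some cells could keep their original strings such as 'Other-service'. StandardScaler also expects 2-D input. The sketch below uses a tiny made-up DataFrame (the rows are invented; only the column names occupation and 50 come from the code in the question) and replaces the loop with Series.map:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up rows for illustration; the index has gaps, as after dropping rows.
data = pd.DataFrame(
    {
        "occupation": ["Sales", "Other-service", "Sales", "Tech-support"],
        "50": [1, 0, 0, 1],
    },
    index=[0, 1, 3, 7],
)
label = "occupation"

# Mean of the '50' column per category, as a plain dict.
temp_values = data.groupby(label)["50"].mean().to_dict()

# map() looks up each value in the dict, so it does not depend on index labels.
data[label] = data[label].map(temp_values)

# StandardScaler wants a 2-D array, so pass a one-column DataFrame
# and flatten the result back into a single column.
data[label] = StandardScaler().fit_transform(data[[label]]).ravel()

print(data[label].dtype)  # float64
```

Whether this matches the real cause depends on how the DataFrame was prepared before this snippet, which is not shown here.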