I have some data with NaN values, and I want to fill those NaN values using an imputer.
from sklearn.preprocessing import Imputer
imp = Imputer(missing_values='NaN', strategy='mean', axis=1)
cleaned_data = imp.fit_transform(original_data)
So far, I know that the imputer works on an entire column, like this:
      Point1  Point2
S.No
      2       NaN
1     NaN     4
      2       NaN
      NaN     4
2     2       NaN
      NaN     4
After applying the imputer, the data looks like this:
      Point1  Point2
S.No
      2       2
1     1       4
      2       2
      1       4
2     2       2
      1       4
But I want the imputer to work separately for each value of the index named S.No, so that the result looks like this:
      Point1  Point2
S.No
      2       1.33
1     1.333   4
      2       1.33
      0.667   4
2     2       2.667
      0.667   4
Is it possible to make the imputer work like this, or is there an alternative way to do it on a Python DataFrame?
Answer 0 (score: 0)
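import numpy as np
from sklearn.impute import SimpleImputer  # replaces Imputer, which was removed from sklearn.preprocessing
imp = SimpleImputer(missing_values=np.nan, strategy='mean')  # imputes column-wise
float_cols = Data.select_dtypes(include=['float']).columns
for s_no in Data.index.unique():
    # Fit and apply the imputer on the rows of one S.No group at a time
    rows = Data.index == s_no
    Data.loc[rows, float_cols] = imp.fit_transform(Data.loc[rows, float_cols])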
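As an alternative that stays in pandas, the per-group fill can be sketched with a groupby on the index. This is only a minimal sketch under some assumptions: the index is named S.No as in the question, the sample values are hypothetical (chosen so that the two groups have different means), and the variable names `df`, `group_means` and `filled` are made up for illustration. Note that pandas skips NaN when it computes a mean, so each gap is filled with the mean of the observed values in its own group, which may not match the numbers in the question's desired table.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame: three rows for S.No 1, three rows for S.No 2.
df = pd.DataFrame(
    {
        "Point1": [1, np.nan, 3, np.nan, 10, np.nan],
        "Point2": [np.nan, 4, np.nan, 6, np.nan, 8],
    },
    index=pd.Index([1, 1, 1, 2, 2, 2], name="S.No"),
)

# Per-group column means, broadcast back to one value per original row.
group_means = df.groupby(level="S.No").transform("mean")

# Combine positionally so the duplicated index labels never need aligning:
# keep observed values, take the group mean where the value is NaN.
filled = pd.DataFrame(
    np.where(df.isna(), group_means, df),
    index=df.index,
    columns=df.columns,
)

print(filled)
```

If a group has no observed value at all in some column, its mean is NaN and the gap stays unfilled, so that case would need a separate fallback.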