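I have a dataset with a column called "YearMade", which has dtype int64. I am trying to replace values in the "YearMade" column so that any value less than 1918 is replaced with the column's median. So I tried: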
df.where(df['YearMade'] > 1918, df['YearMade'].median(), inplace = True)
However, I get a TypeError. What am I doing wrong here, and how can I fix it? See the error message below:
<ipython-input-83-b202aa389b1d> in <module>
1 # We replace all the rows before 1929 with the median
2
----> 3 df.where(df['YearMade'] > 1918, df['YearMade'].median(), inplace = True)
4 df['YearMade'].describe()
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in where(self, cond, other, inplace, axis, level, errors, try_cast)
9274 other = com.apply_if_callable(other, self)
9275 return self._where(
-> 9276 cond, other, inplace, axis, level, errors=errors, try_cast=try_cast
9277 )
9278
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in _where(self, cond, other, inplace, axis, level, errors, try_cast)
9103 # reconstruct the block manager
9104
-> 9105 self._check_inplace_setting(other)
9106 new_data = self._data.putmask(
9107 mask=cond,
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in _check_inplace_setting(self, value)
5303
5304 raise TypeError(
-> 5305 "Cannot do inplace boolean setting on "
5306 "mixed-types with a non np.nan value"
5307 )
TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value
Answer 0 (score: 0)
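IIUC, you only want to replace values in one specific column. I think you get the error because calling where on the whole DataFrame applies the condition to every column, and those columns have mixed dtypes. Try this: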
df['YearMade'].where(df['YearMade'] > 1918, df['YearMade'].median(), inplace = True)
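For illustration, here is a minimal sketch of the same column-level fix on made-up data (the "Model" column and all sample values are assumptions, not from the question). It differs from the one-liner above in two ways: it assigns the result back to the column rather than relying on inplace=True on a column selection, which may not update the original DataFrame in recent pandas versions, and it uses >= so that only values strictly below 1918 are replaced, as the question describes.

```python
import pandas as pd

# Hypothetical sample data; the string column makes the frame mixed-dtype,
# which is the situation that triggers the TypeError on df.where(..., inplace=True)
df = pd.DataFrame({
    "YearMade": [1000, 1995, 2004, 1000, 1988],
    "Model": ["a", "b", "c", "d", "e"],
})

# Median of the whole column, computed before any replacement
median_year = df["YearMade"].median()

# Keep values >= 1918, replace everything else with the median,
# and assign the result back to the column
df["YearMade"] = df["YearMade"].where(df["YearMade"] >= 1918, median_year)

print(df)
```

Only "YearMade" is touched here, so the other columns never take part in the boolean setting. Note that the median is computed over all values, including the ones being replaced; if that is not what you want, compute it from the filtered values instead.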