Question

I want to replace all numeric values in a DataFrame column with NaN.

Input:

A       B       C
test    foo     xyz
hit     bar     10
hit     fish    90
hit     NaN     abc
test    val     20
test    val     90

Desired output:

A       B       C
test    foo     xyz
hit     bar     NaN
hit     fish    NaN
hit     NaN     abc
test    val     NaN
test    val     NaN

I tried the following:

db_old.loc[db_old['Current Value'].istype(float), db_old['Current Value']] = np.nan

But it returns:

AttributeError: 'Series' object has no attribute 'istype'

Any suggestions?

Thanks

Answer 1

You can use to_numeric to mask the numeric values:

df['C'] = df['C'].mask(pd.to_numeric(df['C'], errors='coerce').notna())
df
      A     B    C
0  test   foo  xyz
1   hit   bar  NaN
2   hit  fish  NaN
3   hit   NaN  abc
4  test   val  NaN
5  test   val  NaN

to_numeric is the most general solution; it should work whether you have a column of strings or a column of mixed objects.
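To make the mechanics explicit, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the question's sample data, with column C assumed to hold a mix of strings and integers; the intermediate names parsed and is_numeric are only for illustration.

```python
import pandas as pd

# Sample data from the question; column C is assumed to mix strings and ints.
df = pd.DataFrame({
    "A": ["test", "hit", "hit", "hit", "test", "test"],
    "B": ["foo", "bar", "fish", None, "val", "val"],
    "C": ["xyz", 10, 90, "abc", 20, 90],
})

# errors="coerce" turns anything that cannot be parsed as a number into NaN,
# so notna() is True exactly where the original value was numeric.
parsed = pd.to_numeric(df["C"], errors="coerce")
is_numeric = parsed.notna()

# mask() replaces values with NaN wherever the condition is True.
df["C"] = df["C"].mask(is_numeric)
print(df)
```

Because the check goes through to_numeric, numeric strings such as "10" would be treated the same way as the integer 10.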

If it is a column of strings and you only want to keep the alphabetic strings, str.isalpha may be enough:

df['C'] = df['C'].where(df['C'].str.isalpha())
df
      A     B    C
0  test   foo  xyz
1   hit   bar  NaN
2   hit  fish  NaN
3   hit   NaN  abc
4  test   val  NaN
5  test   val  NaN

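Note, however, that this keeps only strings made up entirely of letters, so any string containing a digit is replaced as well.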
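The difference between the two approaches shows up on strings that are neither purely alphabetic nor numeric. The following sketch uses a made-up Series (the values "abc1" and "a b" are illustrative, not from the question) to compare them side by side:

```python
import pandas as pd

# Hypothetical values chosen to show where the two checks disagree.
s = pd.Series(["xyz", "10", "abc1", "a b", "90"])

# isalpha keeps only strings consisting entirely of letters.
kept_alpha = s.where(s.str.isalpha())

# to_numeric drops only the values that parse as numbers.
kept_non_numeric = s.mask(pd.to_numeric(s, errors="coerce").notna())

print(pd.DataFrame({"isalpha": kept_alpha, "to_numeric": kept_non_numeric}))
```

Here "abc1" and "a b" survive the to_numeric approach but are replaced with NaN by the isalpha approach, so the choice depends on whether such values should count as text.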

If you have a column of mixed objects, here is another solution using str.match (really, any str method that accepts an na flag) with na=False:

df['C'] = ['xyz', 10, 90, 'abc', 20, 90]
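The snippet above stops before the filtering step. A minimal sketch of how it might continue is shown below; the regular expression r'\D+$' (one or more non-digit characters up to the end of the string) is an assumption for illustration, not necessarily the pattern the answer used.

```python
import pandas as pd

# Mixed column of strings and integers, as in the snippet above.
df = pd.DataFrame({"C": ["xyz", 10, 90, "abc", 20, 90]})

# .str.match yields NaN for non-string elements; na=False turns those
# into False so the result is a clean boolean mask.
# The pattern r"\D+$" is an assumed example: strings with no digits at all.
is_text = df["C"].str.match(r"\D+$", na=False)

# where() keeps values where the mask is True and sets the rest to NaN.
df["C"] = df["C"].where(is_text)
print(df)
```

Without na=False, the integers would produce NaN in the mask rather than False, which is why the flag matters for mixed columns.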

Replacing numeric values with NaN in Python
