Question

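I have a Series containing amounts stored as strings. Some of them are badly formatted, and I want to clean them up so I can convert the Series to float.

In particular, there are entries such as ($25.0). I want to remove the ($ first, and then the ).

I started with the following code: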

s = s.str.replace('($', '', regex = False)

The string '($' is removed, but some rows then become NaN:

Before:

442452        992
442453       2415
442454      177.5
442455      32457
442456    4714.07
Name: Amount, dtype: object

After:

442452    NaN
442453    NaN
442454    NaN
442455    NaN
442456    NaN
Name: Amount, dtype: object

Can you explain this behavior? What would you suggest?

Answer 1

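The rows that become NaN are most likely not strings; they may be numeric or some other type. The .str methods return NaN for any element that is not a string. Try checking the type of the rows that come back as NaN, like this: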

s[s.str.replace('($', '', regex=False).isna()].map(type)

If my guess is right, the solution is to convert the whole of s to strings before calling replace:

s.astype(str).str.replace('($', '', regex=False)

or to chain fillna after the replace, to put those numeric values back:

s.str.replace('($', '', regex=False).fillna(s)
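
To compare the two options above side by side, here is a minimal sketch. The sample values are made up for illustration (they are not the asker's data), and the names as_str and filled are just placeholders.

```python
import pandas as pd

# Hypothetical sample: one real number mixed in with strings.
s = pd.Series([992, '($25.0)', '177.5'])

# Option 1: cast everything to str first, so .str only ever sees strings.
as_str = s.astype(str).str.replace('($', '', regex=False)

# Option 2: replace first, then restore the non-string values from the original.
filled = s.str.replace('($', '', regex=False).fillna(s)

print(as_str.tolist())  # ['992', '25.0)', '177.5']
print(filled.tolist())  # [992, '25.0)', '177.5']
```

The difference is in what comes back: with astype(str) every element ends up as a string, while fillna(s) leaves the original numbers untouched as numbers.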

For example, the following creates a Series of dtype object in which the first row is an integer and the rest are strings.

import pandas as pd

s = pd.Series([123, '($2.0)', 'B03', 'D04'])
s

Out[416]:
0       123
1    ($2.0)
2       B03
3       D04
dtype: object

str.replace turns the first row into NaN:

s.str.replace('($', '', regex=False)

Out[417]:
0     NaN
1    2.0)
2     B03
3     D04
dtype: object

Checking with map(type) shows that the row that became NaN holds an integer:

s[s.str.replace('($', '', regex=False).isna()].map(type)

Out[418]:
0    <class 'int'>
dtype: object
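
Putting it together for the original goal (a float Series), something along these lines might work. It is only a sketch on made-up values: it assumes the only stray characters are '($' and ')', and it uses pd.to_numeric with errors='coerce', which the question did not mention, so that anything still unparseable becomes NaN instead of raising. The names cleaned and amounts are placeholders.

```python
import pandas as pd

# Hypothetical sample mixing real numbers and strings, as in the question.
s = pd.Series([992, '($25.0)', '177.5', 4714.07])

cleaned = (
    s.astype(str)                            # make every element a string
     .str.replace('($', '', regex=False)     # drop the leading "($"
     .str.replace(')', '', regex=False)      # drop the trailing ")"
)

# Convert to numbers; values that still cannot be parsed become NaN.
amounts = pd.to_numeric(cleaned, errors='coerce')

print(amounts.tolist())  # [992.0, 25.0, 177.5, 4714.07]
```

If the parentheses are meant to mark negative amounts, as in some accounting formats, the sign would have to be handled separately; the sketch simply strips them, as the question asks.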