Question

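I've run into potentially incorrect behavior of the pandas .replace() function with strings and integers. If a DataFrame contains both 0 (integer) and '0' (string), then replacing '0' affects both the strings and the integers. Here is what happens: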

In [1]: df = pd.DataFrame({'numbers' : [0, 1, 2, 0], 'strings' : ['0', 1, 2, '0']})

To check that it is indeed set up correctly:

In [2]: df.dtypes
Out[2]:
numbers     int64
strings    object
dtype: object

And to check individual values:

In [3]: type(df['numbers'][0])
Out[3]: numpy.int64
In [4]: type(df['strings'][0])
Out[4]: str
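
Because the `strings` column has dtype `object`, it can hold a mix of Python types, and checking only element 0 does not show that. A minimal sketch for tallying the types in the column, assuming the same DataFrame as in In [1] (the name `type_counts` is only illustrative):

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({'numbers': [0, 1, 2, 0], 'strings': ['0', 1, 2, '0']})

# Count how many values of each Python type the object column holds.
type_counts = Counter(type(v).__name__ for v in df['strings'])
print(type_counts)
```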

Now, do the replacement:

In [5]: df.replace(to_replace='0', value=np.NaN, inplace=True)
In [6]: df.head()
Out[6]: 
   numbers  strings
0      NaN      NaN
1        1        1
2        2        2
3      NaN      NaN

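As you can see, it replaced both the strings and the integers, although it should only have affected the strings. If we try the same with the integer, it works correctly: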

In [7]: df = pd.DataFrame({'numbers' : [0, 1, 2, 0], 'strings' : ['0', 1, 2, '0']})
...: df.replace(to_replace=0, value=np.NaN, inplace=True)
...: print df.head()
Out[7]:
   numbers strings
0      NaN       0
1        1       1
2        2       2
3      NaN       0

Is this the correct behavior, or should a bug be reported? I am using pandas 0.19.0.

Update: The bug has been reported and confirmed. @nickil-maveli provided a workaround that works in the meantime: df.replace(to_replace=['0'], value=[np.NaN], inplace=True)

Answer 1

The bug was reported and confirmed by the developers. @nickil-maveli provided a workaround that works in the meantime: df.replace(to_replace=['0'], value=[np.NaN], inplace=True)
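
To see whether the list-form workaround behaves as intended on a given installation, something like the following sketch can be used. It assumes the same example DataFrame as in the question, and it makes two changes that are not in the original: it writes `np.nan` instead of `np.NaN` (the latter alias is not available in recent NumPy releases) and it keeps the returned copy rather than using `inplace=True`. The names `result`, `numbers_unchanged` and `strings_replaced` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'numbers': [0, 1, 2, 0], 'strings': ['0', 1, 2, '0']})

# List form of the workaround; replace() returns a new DataFrame here.
result = df.replace(to_replace=['0'], value=[np.nan])

# The integer zeros should be left alone...
numbers_unchanged = result['numbers'].tolist() == [0, 1, 2, 0]
# ...while only the '0' strings should have become NaN.
strings_replaced = result['strings'].isna().tolist()

print(numbers_unchanged)
print(strings_replaced)
```

If the first value printed is False, the integer column was modified as well and the installed version still shows the behavior described in the question.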

Pandas replace with strings and integers - incorrect behavior?
