Question

I am trying to add a prefix to a DataFrame column in pandas. It should be simple:

import pandas as pd
a=pd.DataFrame({
    'x':[1,2,3],
})
#this one works;
"mm"+a['x'].astype(str)
0    mm1
1    mm2
2    mm3
Name: x, dtype: object

Surprisingly, though, if I use the single letter 'm' as the prefix, it stops working:

#this one doesn't work
"m"+a['x'].astype(str)
TypeError                                 Traceback (most recent call last)
<ipython-input-21-808db8051ebc> in <module>
      1 #this one doesn't work
----> 2 "m"+a['x'].astype(str)

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops\__init__.py in wrapper(left, right)
   1014             if is_scalar(right):
   1015                 # broadcast and wrap in a TimedeltaIndex
-> 1016                 assert np.isnat(right)
   1017                 right = np.broadcast_to(right, left.shape)
   1018                 right = pd.TimedeltaIndex(right)

TypeError: ufunc 'isnat' is only defined for datetime and timedelta.

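So my questions are:

How can I fix this?
What is going on? It looks like pandas is trying to do something fancy.
Why is "m" so special? (Other single letters, such as 'b', seem to work fine.)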

Answer 1

The problem is that "m" is being interpreted as a timedelta dtype:

from pandas.core.dtypes.common import is_timedelta64_dtype

print(is_timedelta64_dtype("m"))

Output

True
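As for why "m" in particular: in NumPy's one-character dtype codes, "m" names timedelta64, so any check that tries to read the string as a dtype sees a timedelta. The sketch below uses only NumPy; the variable names are mine, for illustration.

```python
import numpy as np

# "m" is NumPy's one-character code for timedelta64 ("M" is datetime64).
td_kind = np.dtype("m").kind
is_td = np.issubdtype(np.dtype("m"), np.timedelta64)

# "b" names a signed 8-bit integer, so it is not mistaken for a timedelta.
b_kind = np.dtype("b").kind

print(td_kind, is_td, b_kind)
```

That would also explain why 'b' behaves: it is a valid dtype code too, just not a timedelta one.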

The function is_timedelta64_dtype is called when this operation runs:

res = "m" + a['x'].astype(str)

Code (pandas)

elif is_timedelta64_dtype(right):
    # We should only get here with non-scalar or timedelta64('NaT')
    #  values for right
    # Note: we cannot use dispatch_to_index_op because
    #  that may incorrectly raise TypeError when we
    #  should get NullFrequencyError
    orig_right = right
    if is_scalar(right):
        # broadcast and wrap in a TimedeltaIndex
        assert np.isnat(right)
        right = np.broadcast_to(right, left.shape)
        right = pd.TimedeltaIndex(right)
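To see the failing check in isolation, here is a small NumPy-only sketch of what np.isnat does with the one scalar that branch expects versus a plain string. The variable names are mine, for illustration.

```python
import numpy as np

# The scalar this branch is written for: a timedelta64 "not a time" value.
nat_ok = np.isnat(np.timedelta64("NaT"))

# A plain string has no isnat implementation, so NumPy raises TypeError.
try:
    np.isnat("m")
    raised = False
except TypeError as exc:
    raised = True
    print(exc)
```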

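Since the value is also a scalar, pandas then asserts that it is NaT:

assert np.isnat(right)

That assertion is what raises the exception, because np.isnat is not defined for a plain string. A simple workaround is to put "m" in a list: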

res = ["m"] + a['x'].astype(str)
print(res)

Output

0    m1
1    m2
2    m3
Name: x, dtype: object
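If you would rather avoid Series + str arithmetic altogether, one option (a sketch, not the only way) is to format each value directly. This assumes the same DataFrame a as in the question; the names prefixed and prefixed_alt are mine.

```python
import pandas as pd

a = pd.DataFrame({"x": [1, 2, 3]})

# Format each value directly; no Series + str arithmetic is involved,
# so the timedelta check on the scalar prefix is never reached.
prefixed = a["x"].map("m{}".format)

# Equivalent plain-Python construction that keeps the index and name.
prefixed_alt = pd.Series([f"m{v}" for v in a["x"]], index=a.index, name="x")

print(prefixed.tolist())
```

Both versions work on the integer column as-is, so the astype(str) step is not needed.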

Answer 2

I solved the problem by changing it to:

import numpy as np
np.array('m')+a['x'].astype(str)

For some reason, pandas thinks this "m" denotes a time. See @Daniel Mesejo's explanation.
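A plausible reason the np.array('m') version helps (my reading, not something confirmed in this thread): wrapping the prefix gives it a real string dtype, so a dtype check that looks at the array's dtype no longer sees the bare code "m". A NumPy-only sketch, with illustrative variable names:

```python
import numpy as np

prefix = np.array("m")

# A zero-dimensional array holding a one-character Unicode string.
ndim = prefix.ndim
kind = prefix.dtype.kind

# Compare with the dtype that the bare string "m" names.
bare_kind = np.dtype("m").kind

print(ndim, kind, bare_kind)
```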

Answer 3

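This seems to be an issue with the Python front end. It may be caused by some conflict when using the Spyder interface or a Jupyter notebook. I got the same error when running the code in Spyder. When I ran the same code by invoking python from a command-line terminal instead of Spyder or Jupyter, the problem went away.

Try running the same code in a command-line terminal by invoking the python command; it should work fine.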
