Question

This is less a problem than a curiosity.

In my interpreter on 64-bit Linux, I can execute

In [10]: np.int64 == np.int64
Out[10]: True

In [11]: np.int64 is np.int64
Out[11]: True

Great, exactly what I expected. But then I found this odd attribute of the numpy.core.numeric module:

In [19]: from numpy.core.numeric import _typelessdata

In [20]: _typelessdata
Out[20]: [numpy.int64, numpy.float64, numpy.complex128, numpy.int64]

Strange. Why is numpy.int64 in there twice? Let's investigate.

In [23]: _typelessdata[0] is _typelessdata[-1]
Out[23]: False
In [24]: _typelessdata[0] == _typelessdata[-1]
Out[24]: False
In [25]: id(_typelessdata[-1])
Out[25]: 139990931572128
In [26]: id(_typelessdata[0])
Out[26]: 139990931572544
In [27]: _typelessdata[-1]
Out[27]: numpy.int64
In [28]: _typelessdata[0]
Out[28]: numpy.int64

Whoa, they are different objects. What is going on here? Why are there two np.int64s?

Answer 1

Here are the lines in numeric.py where _typelessdata is constructed:

_typelessdata = [int_, float_, complex_]
if issubclass(intc, int):
    _typelessdata.append(intc)

if issubclass(longlong, int):
    _typelessdata.append(longlong)

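intc is the C-compatible (32-bit) signed integer, and int is the native Python integer, which (in Python 2) can be 32-bit or 64-bit depending on the platform.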

On a 32-bit system, the native Python int type is also 32-bit, so issubclass(intc, int) returns True and intc is appended to _typelessdata, which ends up looking like this:
```
[numpy.int32, numpy.float64, numpy.complex128, numpy.int32]
```
Note that _typelessdata[-1] is numpy.intc, not numpy.int32.

On a 64-bit system, int is 64-bit, so issubclass(longlong, int) returns True and longlong is appended to _typelessdata, resulting in:
```
[numpy.int64, numpy.float64, numpy.complex128, numpy.int64]
```
In this case, as Joe pointed out, (_typelessdata[-1] is numpy.longlong) == True.
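
To make the mechanism concrete without depending on a particular NumPy version, here is a minimal sketch that uses stand-in classes. The names int_, longlong, float_, complex_ and typeless below are hypothetical placeholders, not NumPy's real scalar types; the only point is that two distinct classes can share a displayed name and still fail both the is and == checks.

```python
# Stand-in classes (NOT NumPy's real scalar types): two distinct
# subclasses of int that present themselves under the same name.
class int_(int):
    pass


class longlong(int):
    pass


# Assumption for illustration: both types report the same name, the way two
# same-width integer types can both display as "int64".
int_.__name__ = longlong.__name__ = "int64"


class float_(float):
    pass


class complex_(complex):
    pass


# Same shape as the construction quoted from numeric.py.
typeless = [int_, float_, complex_]
if issubclass(longlong, int):
    typeless.append(longlong)

names = [t.__name__ for t in typeless]
same_object = typeless[0] is typeless[-1]   # identity check
equal = typeless[0] == typeless[-1]         # classes compare by identity

print(names, same_object, equal)
```

The first and last entries print under the same name, yet they are different class objects, which mirrors what the question observed with id().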

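The bigger question is why the contents of _typelessdata are set up this way. The only place I could find in the numpy source where _typelessdata is actually used is this line, in the definition of np.array_repr in the same file: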

skipdtype = (arr.dtype.type in _typelessdata) and arr.size > 0

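The purpose of _typelessdata is to make np.array_repr print the correct string representation for arrays whose dtype happens to be the same as the (platform-dependent) native Python integer type.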

For example, on a 32-bit system, where int is 32-bit:

In [1]: np.array_repr(np.intc([1]))
Out[1]: 'array([1])'

In [2]: np.array_repr(np.longlong([1]))
Out[2]: 'array([1], dtype=int64)'

Whereas on a 64-bit system, where int is 64-bit:

In [1]: np.array_repr(np.intc([1]))
Out[1]: 'array([1], dtype=int32)'

In [2]: np.array_repr(np.longlong([1]))
Out[2]: 'array([1])'

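The arr.dtype.type in _typelessdata check in the line above ensures that the dtype is omitted from the output for the appropriate platform-dependent native integer dtypes.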
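
The effect of that check can be sketched in plain Python. This is an illustration of the skip logic only, not NumPy's actual array_repr; the function array_repr_sketch and the classes NativeInt and Int32 are hypothetical names introduced here.

```python
# Hypothetical stand-ins for "the native integer type" and "some other type".
class NativeInt(int):
    pass


class Int32(int):
    pass


def array_repr_sketch(values, dtype_type, dtype_name, typeless):
    """Build an array-like repr, omitting the dtype for 'typeless' types."""
    body = "array([%s]" % ", ".join(str(v) for v in values)
    # Same condition as the quoted line: type is listed AND array is non-empty.
    skipdtype = (dtype_type in typeless) and len(values) > 0
    if skipdtype:
        return body + ")"
    return body + ", dtype=%s)" % dtype_name


typeless = [NativeInt]

native = array_repr_sketch([1], NativeInt, "int64", typeless)
other = array_repr_sketch([1], Int32, "int32", typeless)
empty = array_repr_sketch([], NativeInt, "int64", typeless)

print(native)
print(other)
print(empty)
```

The empty case shows why the size check is part of the condition: with no elements to look at, the dtype cannot be inferred from the printed values, so it is kept.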

Answer 2

I don't know the full history behind it, but the second int64 is actually numpy.longlong.

In [1]: import numpy as np

In [2]: from numpy.core.numeric import _typelessdata

In [3]: _typelessdata
Out[3]: [numpy.int64, numpy.float64, numpy.complex128, numpy.int64]

In [5]: id(_typelessdata[-1]) == id(np.longlong)
Out[5]: True

numpy.longlong is supposed to directly correspond to C's long long type. C's long long is specified to be at least 64 bits wide, but the exact definition is left up to the compiler.

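My guess is that numpy.longlong ends up as another 64-bit type that also displays as numpy.int64 on most systems, but is allowed to be something different if the C compiler defines long long as something wider than 64 bits.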
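
One way to see the C side of this is to ask the standard library's ctypes module how wide the platform's C integer types are. This is only a sketch of the widths chosen by the local compiler; it says nothing about NumPy's internals, and the sizes dictionary is just a name used here.

```python
import ctypes

# Width in bits of the platform's C int, long and long long.
sizes = {
    name: ctypes.sizeof(getattr(ctypes, "c_" + name)) * 8
    for name in ("int", "long", "longlong")
}

for name, bits in sizes.items():
    print("%-8s %d bits" % (name, bits))
```

On a typical 64-bit Linux build, long and long long are both 64 bits wide, which is consistent with two distinct C-backed types both showing up as int64. Other platforms may report different widths.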

Why are there two np.int64s in numpy.core.numeric._typelessdata (why is numpy.int64 not numpy.int64?)
