Question

I have some grayscale image data (0-255). Depending on the NumPy dtype, I get different dot product results. For example, x0 and x1 are the same image:

>>> x0
array([0, 0, 0, ..., 0, 0, 0], dtype=uint8)
>>> x1
array([0, 0, 0, ..., 0, 0, 0], dtype=uint8)
>>> (x0 == x1).all()
True
>>> np.dot(x0, x1)
133
>>> np.dot(x0.astype(np.float64), x1.astype(np.float64))
6750341.0

I know the second dot product is the correct one: the two images are identical, so the cosine distance should be 0:

>>> from scipy.spatial import distance
>>> distance.cosine(x0, x1)
0.99998029729164795
>>> distance.cosine(x0.astype(np.float64), x1.astype(np.float64))
0.0

Surely the dot product should work for integers. For small arrays, it does:

>>> v = np.array([1,2,3], dtype=np.uint8)
>>> v
array([1, 2, 3], dtype=uint8)
>>> np.dot(v, v)
14
>>> np.dot(v.astype(np.float64), v.astype(np.float64))
14.0
>>> distance.cosine(v, v)
0.0

What is going on? Why does the dot product give different answers depending on the dtype?

Answer 1

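The uint8 data type is limited to 8 bits, so it can only represent the values 0, 1, ..., 255. Your dot product overflows that range, so only the lowest 8 bits of the result are kept, and those bits hold the value 133. The small example works because its result, 14, fits within the range. You can verify the overflow: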

6750341 % (2 ** 8) == 133
# True
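
To make this concrete, here is a minimal sketch that reproduces the wraparound and avoids it by casting to a wider dtype before calling np.dot. The array x is hypothetical random data standing in for your image, and int64 is just one reasonable choice of wider type (float64, as in your question, works too).

```python
import numpy as np

# Hypothetical stand-in for the image: 10,000 random pixel values in 0..255.
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=10_000, dtype=np.uint8)

# With uint8 inputs the result is also uint8, so it wraps modulo 2**8.
wrapped = np.dot(x, x)

# Casting to a wider integer type first leaves room for the full sum.
exact = np.dot(x.astype(np.int64), x.astype(np.int64))

print(wrapped.dtype, exact.dtype)
# The wrapped value is the exact value reduced modulo 256.
print(int(exact) % 2**8 == int(wrapped))
```

The same cast should also fix the distance.cosine call, since the function then no longer receives uint8 arrays.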
