查找两个numpy数组之间的差异

时间:2019-09-10 06:40:44

标签: python arrays numpy matrix

我有两个来自文本文件的数组。通过观察,它看起来完全一样。但是,当我测试两个数组的等效性时,它们会失败-在元素方面,形状方面等。我使用numpy测试回答here

这是两个matrices

import numpy as np

class TextMatrixAssertions(object):
    def assertArrayEqual(self, dataX, dataY):
        x = np.loadtxt(dataX)
        y = np.loadtxt(dataY)

        if not np.array_equal(x, y):
            raise Exception("array_equal fail.")

        if not np.array_equiv(x, y):
            raise Exception("array_equiv fail.")

        if not np.allclose(x, y):
            raise Exception("allclose fail.")

dataX = "MyMatrix.txt"
dataY = "MyMatrix2.txt"
test = TextMatrixAssertions()
test.assertArrayEqual(dataX, dataY)

我想知道两个数组之间是否确实存在差异,或者是什么原因导致了失败。

3 个答案:

答案 0 :(得分:2)

它们不相等,有54个不同的元素。

np.sum(x!=y)

54

要查找哪些元素不同,可以执行以下操作:

np.where(x!=y)


(array([  1,   5,   7,  11,  19,  24,  32,  48,  82,  92,  97, 111, 114,
        119, 128, 137, 138, 146, 153, 154, 162, 165, 170, 186, 188, 204,
        215, 246, 256, 276, 294, 300, 305, 316, 318, 333, 360, 361, 390,
        419, 420, 421, 423, 428, 429, 429, 439, 448, 460, 465, 467, 471,
        474, 487]),
 array([18, 18, 18, 17, 17, 16, 15, 12,  8,  6,  5,  4,  3,  3,  2,  1,  1,
        26,  0, 25, 24, 24, 24, 23, 22, 20, 20, 17, 16, 14, 11, 11, 11, 10,
        10,  9,  7,  7,  5,  1,  1,  1, 26,  1,  0, 25, 23, 21, 19, 18, 18,
        17, 17, 14]))

答案 1 :(得分:0)

您应该首先尝试使用更小,更简单的矩阵来测试代码。

例如:

import numpy as np
from io import StringIO



class TextMatrixAssertions(object):
    def assertArrayEqual(self, dataX, dataY):
        x = np.loadtxt(dataX)
        y = np.loadtxt(dataY)

        if not np.array_equal(x, y):
            raise Exception("array_equal fail.")

        if not np.array_equiv(x, y):
            raise Exception("array_equiv fail.")

        if not np.allclose(x, y):
            raise Exception("allclose fail.")

        return True

a = StringIO(u"0 1\n2 3")
b = StringIO(u"0 1\n2 3")
test = TextMatrixAssertions()
test.assertArrayEqual(a,b)

输出

True

因此,我想您的问题出在文件而不是代码上。您也可以尝试在x和y中加载相同的文件,然后查看输出。

要查看哪些元素不同,可以尝试使用not_equal

示例

a = StringIO(u"0 1\n2 3")
c = StringIO(u"0 1\n2 4")
x = np.loadtxt(a)
y = np.loadtxt(c)
np.not_equal(x,y)

Output

array([[False, False],
       [False,  True]])

答案 2 :(得分:0)

另一种解决方案。您可以看到不相等的元素的值。如果运行下面的代码,则您将看到具有nan值的元素不相等,从而导致引发异常。

import numpy as np

class TextMatrixAssertions(object):
    def assertArrayEqual(self, dataX, dataY):
        x = np.loadtxt(dataX)
        y = np.loadtxt(dataY)

        if not np.array_equal(x, y):
            not_equal_idx = np.where(x != y)
            for idx1, idx2 in zip(not_equal_idx[0],not_equal_idx[1]):
                print(x[idx1][idx2])
                print(y[idx1][idx2])
            raise Exception("array_equal fail.")

        if not np.array_equiv(x, y):
            raise Exception("array_equiv fail.")

        if not np.allclose(x, y):
            raise Exception("allclose fail.")

dataX = "MyMatrix.txt"
dataY = "MyMatrix2.txt"
test = TextMatrixAssertions()
test.assertArrayEqual(dataX, dataY)

输出:

nan
nan
nan
...
nan