Question

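R has a compare package containing a function of the same name, which compares two data sets in several respects. You can set different arguments for compare() to define which deviations of df2 from df1 should be accepted. For example, if two data sets are otherwise identical but one of them is longer because it covers a longer time period, you can set shorten=TRUE. The full list of arguments is: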

compare(model, comparison,
        equal = TRUE,
        coerce = allowAll,
        shorten = allowAll,
        ignoreOrder = allowAll,
        ignoreNameCase = allowAll,
        ignoreNames = allowAll,
        ignoreAttrs = allowAll,
        round = FALSE,
        ignoreCase = allowAll,
        trim = allowAll,
        dropLevels = allowAll,
        ignoreLevelOrder = allowAll,
        ignoreDimOrder = allowAll,
        ignoreColOrder = allowAll,
        ignoreComponentOrder = allowAll,
        colsOnly = !allowAll,
        allowAll = FALSE)

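I would like to know whether Python has a similar package. I need a function or package that can handle discrepancies between two data sets and test their equality under different conditions. So far I have not found anything equivalent to R's compare function.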

The closest thing I have found is the assert_frame_equal(df, expected, check_names=False) function, which is not as extensive as compare().
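For reference, a minimal sketch of how far assert_frame_equal can be relaxed. The helper frames_match and the sample frames are hypothetical, made up for illustration; check_like, check_exact and rtol are keyword arguments of pandas.testing.assert_frame_equal (rtol requires pandas 1.1 or later). This covers only a small part of what compare() offers, roughly ignoreColOrder and a tolerance similar in spirit to round.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_match(left, right, **kwargs):
    """Hypothetical helper: True if assert_frame_equal passes with these options."""
    try:
        assert_frame_equal(left, right, **kwargs)
        return True
    except AssertionError:
        return False


df1 = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [0.1, 0.2, 0.3]})
# Same data with the columns swapped and a tiny numeric deviation in "b".
df2 = pd.DataFrame({"b": [0.1, 0.2, 0.3000001], "a": [1.0, 2.0, 3.0]})

# Default settings: column order matters, so the frames do not match.
strict = frames_match(df1, df2)

# Ignore row/column order but demand exact values: still no match.
order_only = frames_match(df1, df2, check_like=True, check_exact=True)

# Ignore order and accept a relative tolerance: the frames match.
relaxed = frames_match(df1, df2, check_like=True, rtol=1e-3)

print(strict, order_only, relaxed)
```

Wrapping the assertion in a function that returns a boolean makes it usable in ordinary control flow, which is closer to how compare() is used in R.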

Answer 1

After several months of using Python, I can answer my own question.

To compare two arrays, use the functions in the numpy package.

numpy.array_equal(array1, array2)  # True if both arrays have the same shape and the same values
numpy.array_equiv(array1, array2)  # True if the shapes are consistent (broadcastable) and all elements are equal
numpy.allclose(array1, array2)     # True if the elements are equal within a tolerance (shapes may broadcast)
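A short sketch of how the three functions differ in practice. The arrays a, b and c are made-up sample data, and allclose is called with its default tolerances.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([[1.0, 2.0, 3.0],
              [1.0, 2.0, 3.0]])  # a repeated along a new axis
c = a + 1e-9                     # tiny floating-point perturbation

# array_equal requires identical shape and identical values.
same = np.array_equal(a, a.copy())
different_shape = np.array_equal(a, b)

# array_equiv also accepts shapes that can be broadcast against each other.
broadcast_equal = np.array_equiv(a, b)

# allclose tolerates small numeric differences that array_equal rejects.
exact_after_noise = np.array_equal(a, c)
close_after_noise = np.allclose(a, c)

print(same, different_shape, broadcast_equal, exact_after_noise, close_after_noise)
```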

The documentation for these functions can be found in the NumPy reference.

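That is all I need for now, but the following may be useful to someone: to compare two sequences of text, use the difflib package, which helps when comparing HTML, .txt files, and so on. Its documentation is part of the Python standard library reference.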
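As an illustration, a minimal difflib sketch. The two line lists and the file names old.txt and new.txt are hypothetical; unified_diff and SequenceMatcher are part of the standard library module.

```python
import difflib

# Two versions of a small text, as lists of lines (made-up sample data).
old = ["alpha\n", "beta\n", "gamma\n"]
new = ["alpha\n", "beta!\n", "gamma\n", "delta\n"]

# Line-by-line diff in unified format; the file names are only labels.
diff = list(difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt"))

# Similarity score between 0 (nothing in common) and 1 (identical).
ratio = difflib.SequenceMatcher(None, "".join(old), "".join(new)).ratio()

print("".join(diff))
print(round(ratio, 3))
```

For HTML output, difflib.HtmlDiff produces a side-by-side table from the same kind of input.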
