Java - Files look identical when read with java.util.Scanner, but are not evaluated as "equal"

Asked: 2012-09-06 18:48:49

Tags: java encoding diff filereader

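I have set up a JUnit test for a method called copy(File src, File dest), which simply copies the contents of the src file into the dest file. I iterate over both files at the same time with a Scanner (two different Scanners, of course) and compare what each Scanner's next() returns using equals().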

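This test fails, telling me the files are not equal. But how can that be? The strings look identical when printed, and a hex dump of the files taken after calling copy() looks the same as well. However, when I print each value of next() as bytes, I do get different byte patterns. I'm confused about why this happens, and what changes can I make to my code to account for it?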

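My guess is that it has to do with the files' encoding: perhaps the encoding used when creating the file differs from the one copy() uses elsewhere in the program? I'm really not sure, and any help is appreciated! Here is what I'm working with for the unit test: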

// The @Rule and @Before blocks are used as set up helper methods for @Test.
    @Rule
    public TemporaryFolder tmp = new TemporaryFolder();

    private File f1, f2;

    @Before
    public void createTestData() throws IOException {
        f1 = tmp.newFile("src.txt");
        f2 = tmp.newFile("dest.txt");

        BufferedWriter out = new BufferedWriter(new FileWriter(f1));
        out.write("This should generate some " +
                "test data that will be used in " +
                "the following method.");
        out.close();
    }

    @Test
    public void copyFileTest() throws FileNotFoundException, 
    Exception {
        try {
            copyFile(f1, f2);
        } catch (IOException e) {
            e.getMessage();
            e.printStackTrace();
        }

        Scanner s1 = new Scanner(f1);
        Scanner s2 = new Scanner(f2);

        // FileReader is only used for debugging, to make sure the character
        // encoding is the same for both files.
        FileReader file1 = new FileReader(f1);
        FileReader file2 = new FileReader(f2);
        out.println("file 1 encoding: " +file1.getEncoding());
        out.println("file 2 encoding: " +file2.getEncoding());

        while (s1.hasNext() && s2.hasNext()) {
            String original = s1.next();
            String copy = s2.next();

            // These print out to be the same ...
            out.println("\ns1: " +original);
            out.println("s2: " +copy);

            // Nevertheless, this comparison fails!
            // These calls to getBytes() return different values.
            if (!(s1.equals(s2))) {
                out.println("\nComparison failed!! \ns1 in bytes: " +original.getBytes()+ 
                        "\ns2 in bytes: " +copy.getBytes());
                fail("The files are not equal.");
            }
        }
    }

Here is my output:

file 1 encoding: UTF8
file 2 encoding: UTF8

s1: This
s2: This

Comparison failed!! 
s1 in bytes: [B@16f5b392
s2 in bytes: [B@5ce04204

1 Answer:

Answer 0 (score: 4)

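Scanner does not override Object.equals(), so s1.equals(s2) compares references. In your case the references are not the same, because you have two separate Scanner objects. Compare the strings returned by next() instead, i.e. original.equals(copy).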
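
To make that concrete, here is a minimal sketch of the token comparison done on the Strings rather than on the Scanners. The class name ScannerTokenCompare and the helpers sameTokens and bytesOf are hypothetical names introduced for illustration, and the Scanners read from in-memory Strings instead of the question's temporary files so the example runs on its own. It also prints the bytes with Arrays.toString(): output such as [B@16f5b392 is only the default toString() of a byte array (its type plus an identity hash code), so two different values there say nothing about whether the contents differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Scanner;

// Hypothetical helper class, not part of the original test.
class ScannerTokenCompare {

    // Returns true when both scanners yield the same sequence of tokens.
    static boolean sameTokens(Scanner s1, Scanner s2) {
        while (s1.hasNext() && s2.hasNext()) {
            String original = s1.next();
            String copy = s2.next();
            // Compare the Strings, not the Scanner objects.
            if (!original.equals(copy)) {
                return false;
            }
        }
        // If one scanner still has tokens left, the inputs differ in length.
        return !s1.hasNext() && !s2.hasNext();
    }

    // Renders the byte values instead of the array's identity string.
    static String bytesOf(String token) {
        return Arrays.toString(token.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Assumption: in-memory text stands in for the two files.
        String text = "This should generate some test data";
        Scanner a = new Scanner(text);
        Scanner b = new Scanner(text);

        System.out.println(a.equals(b));       // false: two distinct objects
        System.out.println(sameTokens(a, b));  // true: same tokens in order
        System.out.println(bytesOf("This"));   // [84, 104, 105, 115]
    }
}
```

In the JUnit test itself, the same idea would likely amount to replacing s1.equals(s2) with original.equals(copy), or using assertEquals(original, copy), and checking after the loop that neither Scanner has tokens left, since the original loop stops silently as soon as the shorter file is exhausted.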