Question

import org.jdom2.Document;
import org.jdom2.input.SAXBuilder;
import java.io.FileReader;

public class Test1 {

    @org.junit.Test
    public void main() throws Exception {
        SAXBuilder sax = new SAXBuilder();
        Document doc = sax.build(new FileReader("resources/file.xml"));
        System.out.println(doc.getRootElement().getText());
    }
}

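file.xml contains <root>©</root> and is encoded as UTF-8.

Libraries used: jdom2-2.06, hamcrest-core-1.3, junit-4.11.

When I run it in IntelliJ, the output is: ©.

When I run it in NetBeans, the output is: Â©.

If I move the code into public static void main and run it, everything is fine.

If I change FileReader to FileInputStream, everything is fine.

If I change FileReader to StringReader("<root>©</root>"), everything is fine.

What is going on?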

Answer 1

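You don't specify a charset when reading the file, so FileReader uses the JVM default. As far as I know, a JVM launched from IntelliJ usually defaults to UTF-8, whereas Eclipse (at least on Windows) defaults to the system's non-Unicode charset (for example Cp1252 in Western Europe).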
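To make the effect concrete, here is a minimal sketch that reproduces the garbled output without any file or IDE involved. It assumes the "wrong" default is windows-1252 (Cp1252); the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Encode the text as UTF-8, then decode those bytes with another charset.
    // This mimics FileReader reading a UTF-8 file under a non-UTF-8 JVM default.
    static String decodeUtf8BytesAs(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    public static void main(String[] args) {
        // Shows which charset FileReader would pick up in this JVM.
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // The copyright sign is the two bytes 0xC2 0xA9 in UTF-8.
        // Cp1252 maps each byte to its own character: U+00C2 followed by U+00A9.
        String garbled = decodeUtf8BytesAs("\u00A9", Charset.forName("windows-1252"));
        System.out.println(garbled.length()); // 2 characters instead of 1
    }
}
```

Running the same main from each IDE and comparing the first line of output is a quick way to check whether the default charset really differs between them.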

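You need to be explicit, as the documentation for FileReader says:

The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate. To specify these values yourself, construct an InputStreamReader on a FileInputStream.

In other words: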

new InputStreamReader(new FileInputStream("resources/file.xml"), StandardCharsets.UTF_8)
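As a rough, self-contained illustration of that constructor in use, the sketch below writes a small UTF-8 file to a temporary location and reads it back with an explicit charset. The class name, method names, and temporary file are assumptions for the example; only the InputStreamReader/FileInputStream combination comes from the snippet above.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExplicitCharsetRead {

    // Read the first line of a file, always decoding its bytes as UTF-8,
    // regardless of the JVM default charset.
    static String readFirstLineUtf8(File file) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new FileInputStream(file), StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    // Write a small UTF-8 file to a temporary location, read it back,
    // and report whether the text survived unchanged.
    static boolean roundTripsCleanly() {
        String xml = "<root>\u00A9</root>";
        try {
            Path tmp = Files.createTempFile("file", ".xml");
            try {
                Files.write(tmp, xml.getBytes(StandardCharsets.UTF_8));
                return xml.equals(readFirstLineUtf8(tmp.toFile()));
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripsCleanly());
    }
}
```

The try-with-resources block also closes the underlying FileInputStream, which the one-line snippet leaves to the caller.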

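Alternatively, let SAXBuilder handle this for you and give it an InputStream. I believe, but I'm not 100% sure, that it will then determine the charset from the XML declaration: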

sax.build(new FileInputStream("resources/file.xml"))
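I have not verified this against JDOM itself, but the idea can be checked with the XML parser that ships with the JDK. The sketch below swaps in javax.xml.parsers.DocumentBuilder for SAXBuilder and feeds it raw bytes; the class and method names are invented for the example. As far as I know SAXBuilder hands the stream to an underlying SAX parser, so I would expect similar behaviour there.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlEncodingDetection {

    // Parse raw XML bytes and return the text of the root element.
    // The parser, not the caller, decides how to decode the bytes.
    static String rootText(byte[] xmlBytes) {
        try (InputStream in = new ByteArrayInputStream(xmlBytes)) {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in);
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // No XML declaration: the XML specification makes UTF-8 the default.
        byte[] utf8 = "<root>\u00A9</root>".getBytes(StandardCharsets.UTF_8);

        // Declared as ISO-8859-1: the copyright sign is the single byte 0xA9.
        byte[] latin1 = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root>\u00A9</root>"
                .getBytes(StandardCharsets.ISO_8859_1);

        // Both comparisons should hold if the parser picks the right encoding.
        System.out.println(rootText(utf8).equals("\u00A9"));
        System.out.println(rootText(latin1).equals("\u00A9"));
    }
}
```

Note that the file in the question has no XML declaration at all, so in that case the parser falls back to the UTF-8 default rather than reading a declared encoding.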

JUnit test gives different results in different IDEs
