Question

I am creating a CSV file from database data and encoding it as UTF-16LE to preserve special characters such as é. However, when I try to read the same file in Java like this:

BufferedReader br = new BufferedReader(new InputStreamReader(
fileContent, "utf16"));

I get no data.

If I use UTF-8 encoding when reading the input stream, as follows:

BufferedReader br = new BufferedReader(new InputStreamReader(
fileContent, "utf8"));

then the BufferedReader returns all the data, but the special characters come out as:

Brut¿l¿

It should be Brutélé.

How can I read the data in Java using UTF-16? I have already tried UTF-16LE and ANSI in my Java code. ANSI gives an unhandled exception, and UTF-16LE makes no difference.

Here is the code that exports the file:

    OutputStream outStream = null;
    InputStream inputStream = null;
    final int BUFFER_SIZE =33554432;

    try {

        inputStream = new ByteArrayInputStream(input.getBytes("UTF-16LE"));

        System.out.println("outStream = " + outStream);

        byte[] buffer = new byte[BUFFER_SIZE];
        int bytesRead = -1;
        if (inputStream != null)
            try {
                while ((bytesRead = inputStream.read(buffer)) != -1) {
                    outStream.write(buffer, 0, bytesRead);

                    if (outStream != null)

                        outStream.close();
                }
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }

    } catch (UnsupportedEncodingException e1) {
        // TODO Auto-generated catch block
        e1.printStackTrace();
    }

Answer 1

As @John Skeet already said, the byte sequence 42 72 75 74 E9 6C E9 is not any UTF encoding; it is ISO_8859_1.

You can verify this with the following snippet:

byte[] b = {0x42, 0x72, 0x75, 0x74, (byte) 0xE9, 0x6C, (byte) 0xe9};
System.out.println("ISO_8859_1: " 
        + new String(b, StandardCharsets.ISO_8859_1));
System.out.println("UTF_8     : " 
        + new String(b, StandardCharsets.UTF_8));
System.out.println("UTF_16LE  : " 
        + new String(b, StandardCharsets.UTF_16LE));

Output (on a Unicode-aware console):

ISO_8859_1: Brutélé
UTF_8     : Brut�l�
UTF_16LE  : 牂瑵泩�
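
Building on that, here is a minimal sketch of reading the data with a reader whose charset matches the bytes. The class name CharsetReadSketch and the helper readFirstLine are made up for illustration, and a ByteArrayInputStream stands in for the real file stream. It also shows what genuine UTF-16LE bytes for the same word look like: two bytes per character, so 14 bytes instead of 7.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetReadSketch {

    // Hypothetical helper: reads the first line of a stream with the given charset.
    static String readFirstLine(InputStream in, Charset charset) throws IOException {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            return br.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // The seven bytes quoted in the answer: one byte per character.
        byte[] latin1 = {0x42, 0x72, 0x75, 0x74, (byte) 0xE9, 0x6C, (byte) 0xE9};
        System.out.println(readFirstLine(
                new ByteArrayInputStream(latin1), StandardCharsets.ISO_8859_1));

        // The same word actually encoded as UTF-16LE: two bytes per character, no BOM.
        byte[] utf16le = "Brut\u00e9l\u00e9".getBytes(StandardCharsets.UTF_16LE);
        System.out.println("UTF-16LE length: " + utf16le.length);
        System.out.println(readFirstLine(
                new ByteArrayInputStream(utf16le), StandardCharsets.UTF_16LE));
    }
}
```

If the file on disk really contains the seven-byte form, the export step is probably not writing UTF-16LE at all, so it is worth checking the bytes in a hex viewer before changing the reader.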

Answer 2

You may be using an unsuitable encoding name. The valid charset names are listed in the documentation for Charset.
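
As a rough illustration of that point, the sketch below asks the JVM which names it accepts before they are passed to InputStreamReader. The class name CharsetNameCheck and the helper isUsable are hypothetical; Charset.isSupported and Charset.forName are standard java.nio.charset methods. No expected output is listed because alias support can vary between JVMs.

```java
import java.nio.charset.Charset;

public class CharsetNameCheck {

    // Hypothetical helper: true if this JVM can resolve the charset name.
    static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown (as IllegalCharsetNameException) for syntactically illegal names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Names tried in the question, plus the canonical forms.
        String[] names = {"UTF-16LE", "UTF-16", "utf16", "UTF-8", "utf8", "ISO-8859-1", "ANSI"};
        for (String name : names) {
            String result = isUsable(name)
                    ? "canonical name " + Charset.forName(name).name()
                    : "not supported";
            System.out.println(name + " -> " + result);
        }
    }
}
```

Using the constants in StandardCharsets (for example StandardCharsets.UTF_16LE) avoids the string lookup and the checked UnsupportedEncodingException altogether.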

Reading a UTF-16 file in Java gives no data
