Reading an InputStream as UTF-8

Date: 2011-02-11 01:17:34

Tags: java utf-8 inputstream

I'm trying to read a text/plain file over the internet, line by line. The code I have right now is:

URL url = new URL("http://kuehldesign.net/test.txt");
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
LinkedList<String> lines = new LinkedList<String>();
String readLine;

while ((readLine = in.readLine()) != null) {
    lines.add(readLine);
}

for (String line : lines) {
    out.println("> " + line);
}

The file test.txt contains ¡Hélló!, which I am using to test the encoding.

When I look at the OutputStream (out), I see > ¬°H√©ll√≥! instead. I don't believe this is a problem with the OutputStream, since out.println("é"); works without problems.

Any ideas on how to read from the InputStream as UTF-8? Thanks!

3 answers:

Answer 0 (score: 168)

Solved my own problem. This line:

BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));

needs to be:

BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));

or, since Java 7:

BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
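
Why this works: the one-argument InputStreamReader constructor decodes bytes with the platform's default charset, so the result depends on the machine the code runs on. The garbled output in the question is consistent with the UTF-8 bytes of ¡Hélló! being decoded as MacRoman. The following is a minimal sketch of the effect, not the asker's actual setup: it uses an in-memory ByteArrayInputStream in place of url.openStream(), and ISO-8859-1 as the "wrong" charset because every Java platform is required to provide it. The class name Utf8Lines and the readLines helper are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the original answer.
final class Utf8Lines {

    // Reads all lines from the stream, decoding its bytes with the given charset.
    static List<String> readLines(InputStream in, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // The test string from the question, written with Unicode escapes
        // so that the encoding of this source file does not matter.
        byte[] utf8Bytes = "\u00a1H\u00e9ll\u00f3!\n".getBytes(StandardCharsets.UTF_8);

        // Matching charset: the text round-trips unchanged.
        System.out.println(readLines(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8));

        // Mismatched charset: each two-byte UTF-8 sequence becomes two separate characters.
        System.out.println(readLines(new ByteArrayInputStream(utf8Bytes), StandardCharsets.ISO_8859_1));
    }
}
```

Passing the charset explicitly, as in the corrected lines above, makes the decoding independent of the machine's default.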

Answer 1 (score: 14)

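// Accumulate into a StringBuilder instead of repeated String concatenation
StringBuilder file = new StringBuilder();

// try-with-resources closes the reader (and the underlying stream) automatically
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(filename), StandardCharsets.UTF_8), 8192)) {
    String str;
    while ((str = br.readLine()) != null) {
        // readLine() strips the line terminator, so lines are joined without separators
        file.append(str);
    }
} catch (IOException e) {
    // Report the failure instead of silently swallowing it
    e.printStackTrace();
}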

Try this. :-)
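
Note that the loop in this answer joins the lines without their terminators, because readLine() strips them. If the goal is to get the whole content as one String with line breaks intact, one option is to read characters in chunks rather than lines. A minimal sketch under that assumption follows; the class name StreamText and the readAll helper are hypothetical, and the in-memory stream in main stands in for a real FileInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original answer.
final class StreamText {

    // Reads the entire stream as UTF-8 text, keeping line terminators as they are.
    static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[8192];
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int n;
            // read() returns the number of chars read, or -1 at end of stream
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two lines of sample text, encoded as UTF-8 bytes
        byte[] data = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);
        System.out.print(readAll(new ByteArrayInputStream(data)));
    }
}
```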

Answer 2 (score: 4)

I ran into the same problem: every special character showed up as ��. To solve it, I tried using the ISO-8859-1 encoding:

BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("txtPath"), "ISO-8859-1"));

String line;
while ((line = br.readLine()) != null) {
    // process each decoded line here
}
br.close();

I hope this helps anyone who comes across this post.
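
A note on when this applies: the � marker is the Unicode replacement character (U+FFFD), which a decoder emits when it meets bytes that are not valid in the charset it was told to use. Switching to ISO-8859-1 therefore helps only if the file really was saved in that encoding; for a UTF-8 file like the one in the question, UTF-8 remains the right choice. The sketch below illustrates the mismatch with an in-memory byte array. The class name CharsetMismatch, the decode helper, and the sample text are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original answer.
final class CharsetMismatch {

    // Decodes the bytes with the given charset; malformed input is replaced with U+FFFD.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Sample text with one accented letter, encoded as ISO-8859-1 (a single byte, 0xE9)
        byte[] latin1Bytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // Decoded as UTF-8, the lone 0xE9 byte is malformed and becomes U+FFFD
        String wrong = decode(latin1Bytes, StandardCharsets.UTF_8);
        // Decoded with the charset the bytes were written in, the text is intact
        String right = decode(latin1Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(wrong.contains("\uFFFD")); // true
        System.out.println(right.equals("caf\u00e9")); // true
    }
}
```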