Convert to UTF-8

Time: 2017-03-07 06:43:38

Tags: android character-encoding urlconnection

I am using the following method to read a txt file from an HTTP server.

public static String getHtmlFromUrl(String strUrl, String referer, boolean isMobile) {
    URL url = null;
    BufferedReader reader = null;
    StringBuilder sb = null;
    String returnValue = "";

    try {
        url = new URL(strUrl);
        URLConnection con = url.openConnection();

        // send a browser-like User-Agent so the server treats the request as coming from that browser
        con.setRequestProperty("User-Agent", userAgent);
        if(isMobile)
            con.setRequestProperty("User-Agent", userAgentMobile);

        con.setRequestProperty("Referer", referer);

        con.setReadTimeout(15000);
        con.connect();

        reader = new BufferedReader(new InputStreamReader(con.getInputStream()));
        sb = new StringBuilder();

        String line = null;
        while((line = reader.readLine()) != null) {
            sb.append(line + "\n");
        }
        returnValue = sb.toString();
    } catch(Exception e) {
        e.printStackTrace();
    } finally {
        if(reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    return returnValue;
}

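I don't have direct access to this file, so I can't change how it is encoded. If I open the URL in a browser, it displays correctly using ISO-8859 or Windows-1252 encoding.

Android seems to interpret it as UTF-8 by default. So I need a way to convert returnValue (or the StringBuilder sb) from the existing ISO-8859 encoding to UTF-8.

How can I do this?

1 Answer:

Answer 0 (score: 2)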

You have to change this line:

reader = new BufferedReader(new InputStreamReader(con.getInputStream()));

to:

reader = new BufferedReader(new InputStreamReader(con.getInputStream(), "ISO-8859-1"));

Or, since Java 7:

reader = new BufferedReader(new InputStreamReader(con.getInputStream(), StandardCharsets.ISO_8859_1));

Update: changed to ISO_8859_1 instead of UTF-8.
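
To illustrate why the charset has to be given to the InputStreamReader, here is a minimal sketch that decodes the same bytes twice, once as ISO-8859-1 and once as UTF-8. The class name CharsetReadSketch, the helper readAll, and the sample text are assumptions made up for this example; an in-memory stream stands in for con.getInputStream() so the sketch runs without a network.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetReadSketch {

    // Hypothetical helper: reads the whole stream as text, decoding bytes with the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader and the wrapped stream
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample body: "Gr\u00fc\u00dfe" encoded as ISO-8859-1 (0xFC and 0xDF, one byte each)
        byte[] body = "Gr\u00fc\u00dfe".getBytes(StandardCharsets.ISO_8859_1);

        String right = readAll(new ByteArrayInputStream(body), StandardCharsets.ISO_8859_1);
        String wrong = readAll(new ByteArrayInputStream(body), StandardCharsets.UTF_8);

        System.out.println(right.equals("Gr\u00fc\u00dfe\n")); // true
        System.out.println(wrong.contains("\uFFFD"));          // true: invalid UTF-8 bytes become U+FFFD
    }
}
```

In the method from the question, the equivalent call would be readAll(con.getInputStream(), StandardCharsets.ISO_8859_1). Once the bytes are decoded correctly, the resulting String needs no further conversion; an encoding such as UTF-8 only matters again when the text is written back out as bytes.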