I'm using the following method to read a txt file from an HTTP server.
    public static String getHtmlFromUrl(String strUrl, String referer, boolean isMobile) {
        URL url = null;
        BufferedReader reader = null;
        StringBuilder sb = null;
        String returnValue = "";
        try {
            url = new URL(strUrl);
            URLConnection con = url.openConnection();
            // force server to mimic specific Browser
            con.setRequestProperty("User-Agent", userAgent);
            if (isMobile)
                con.setRequestProperty("User-Agent", userAgentMobile);
            con.setRequestProperty("Referer", referer);
            con.setReadTimeout(15000);
            con.connect();
            reader = new BufferedReader(new InputStreamReader(con.getInputStream()));
            sb = new StringBuilder();
            String line = null;
            while ((line = reader.readLine()) != null) {
                sb.append(line + "\n");
            }
            returnValue = sb.toString();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        return returnValue;
    }
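I don't have direct access to this file, so I can't change how it is encoded. If I open the URL in a browser, it is displayed correctly using ISO-8859 or Windows-1252 encoding.
Android seems to interpret it as UTF-8 by default. So I need a way to convert returnValue or the StringBuilder sb from the existing ISO-8859 encoding to UTF-8.
How can I do that?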
Answer 0 (score: 2)
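You have to update this line:
    reader = new BufferedReader(new InputStreamReader(con.getInputStream()));
It needs to be:
    reader = new BufferedReader(new InputStreamReader(con.getInputStream(), "ISO-8859-1"));
Or, since Java 7:
    reader = new BufferedReader(new InputStreamReader(con.getInputStream(), StandardCharsets.ISO_8859_1));
Without an explicit charset, InputStreamReader decodes the bytes with the platform default, which is UTF-8 on Android. Once the bytes have been decoded with the wrong charset, the resulting String cannot be reliably converted back, so the charset has to be given where the stream is read.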
Update: use ISO_8859_1 instead of UTF-8.
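To see the effect without a server, here is a minimal sketch that decodes the same ISO-8859-1 bytes twice, once with the right charset and once with UTF-8. The class name CharsetDemo and the helper readAll are made up for this illustration and are not part of the question's code; the byte array stands in for what con.getInputStream() would deliver.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Hypothetical helper: reads every line from the stream, decoding the
    // bytes with the given charset, and joins the lines with '\n'.
    public static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ISO-8859-1 bytes: 'G', 'r', u-umlaut (0xFC), sharp s (0xDF), 'e'
        byte[] latin1Bytes = {'G', 'r', (byte) 0xFC, (byte) 0xDF, 'e'};

        String correct = readAll(new ByteArrayInputStream(latin1Bytes),
                                 StandardCharsets.ISO_8859_1);
        String wrong = readAll(new ByteArrayInputStream(latin1Bytes),
                               StandardCharsets.UTF_8);

        // Decoded with the matching charset, the text is intact.
        System.out.println(correct.equals("Gr\u00fc\u00dfe\n")); // prints true
        // Decoded as UTF-8, the invalid byte sequences become U+FFFD.
        System.out.println(wrong.contains("\uFFFD"));            // prints true
    }
}
```

If the server really sends Windows-1252 rather than ISO-8859-1, Charset.forName("windows-1252") may be the better choice, since the two differ in the byte range 0x80 to 0x9F (for example the euro sign and curly quotes).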