Unknown character when parsing text

Asked: 2012-02-26 10:45:23

Tags: android

I am reading a line of text from a website. Here is an example of what I read:

11:28;26.02.12;6.7°C;6.7°C;67;0.7m/s; 6:45;17:40; Warm ;84;0.9;0.0;;

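Once I have read the string, instead of 6.7°C I get 6.7 C, with an unknown character where the ° should be. It looks like the website is not UTF-8 encoded. How can I fix this so that I get ° instead of the unknown character? Is it possible to fix this while reading, or can I fix it when I split the string?

Here is the method I currently use to read from the website: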

public static String getContentFromUrl(String url) throws ClientProtocolException, IOException {

    HttpClient httpClient = new DefaultHttpClient();
    HttpGet httpGet = new HttpGet(url);
    HttpResponse response;

    response = httpClient.execute(httpGet);
    HttpEntity entity = response.getEntity();

    if(entity != null) {

        InputStream inStream = entity.getContent();

        String result = Weather.convertStreamToString(inStream);
        inStream.close();

        return result;
    }

    return null;

}

private static String convertStreamToString(InputStream is) {
    BufferedReader reader = new BufferedReader(new InputStreamReader(is));
    StringBuilder sb = new StringBuilder();

    String line = null;

    try {
        while ((line = reader.readLine()) != null) {
            sb.append(line + "\n");
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            is.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return sb.toString();
}
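
For context, `new InputStreamReader(is)` decodes the bytes with the platform default charset, which on Android is UTF-8. The following is a minimal sketch of how the symptom can arise, assuming (this is not confirmed in the question) that the server sends ISO-8859-1 bytes. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encodes the text as ISO-8859-1 (hypothetical server encoding),
    // then decodes those bytes as UTF-8, as a default reader on Android would.
    static String decodeLatin1BytesAsUtf8(String text) {
        byte[] raw = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String garbled = decodeLatin1BytesAsUtf8("6.7\u00B0C");
        // In ISO-8859-1 the degree sign is the single byte 0xB0, which is not
        // valid UTF-8 on its own, so the decoder substitutes U+FFFD.
        System.out.println(Integer.toHexString(garbled.charAt(3)));
    }
}
```

If this is what happens, the original byte is already lost by the time the string is split, so the charset would need to be chosen at the point where the stream is read.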

1 Answer:

Answer 0 (score: 0)

What encoding does the server use? You could try:

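// line is already a String, so the charset has to be named where the bytes are decoded
BufferedReader reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));

// or
BufferedReader reader = new BufferedReader(new InputStreamReader(is, "iso-8859-1"));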
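
One way to apply a charset while reading is to pass it to the `InputStreamReader` when the reader is created. Below is a rough sketch of a `convertStreamToString` variant that takes the charset as a parameter. The extra parameter and the class name are assumptions for illustration, and ISO-8859-1 is only a guess at what the server sends (windows-1252 would be another candidate).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetAwareReader {
    // Reads the whole stream line by line, decoding bytes with the given charset.
    static String convertStreamToString(InputStream is, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader and the underlying stream
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(is, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulated response body encoded as ISO-8859-1 (assumed server encoding)
        byte[] body = "11:28;26.02.12;6.7\u00B0C;".getBytes(StandardCharsets.ISO_8859_1);
        String text = convertStreamToString(new ByteArrayInputStream(body), StandardCharsets.ISO_8859_1);
        System.out.println(text.contains("\u00B0")); // true when the degree sign survives
    }
}
```

If the response carries a `Content-Type` header with a charset, reading the charset from there instead of hard-coding it would probably be more robust.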