Malformed Arabic in HttpResponse

Time: 2012-09-18 12:58:55

Tags: java android xml utf-8 httprequest

  

Possible duplicate:
  Parsing an UTF-8 Encodded XML file

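I am parsing a UTF-8 encoded XML file that contains some Arabic characters. Everything else works fine, but the Arabic characters are not displayed. Instead, strange characters like these appear: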

ÙرÙÙ

This is the link to the XML file being parsed: http://212.12.165.44:7201/UniNews121.xml

Here is the code:

        public String getXmlFromUrl(String url) {

        try {
            return new AsyncTask<String, Void, String>() {
                @Override
                protected String doInBackground(String... params) {
                    //String xml = null;
                    try {

                        DefaultHttpClient httpClient = new DefaultHttpClient();
                        httpClient.getParams().setParameter(CoreProtocolPNames.HTTP_CONTENT_CHARSET,"UTF-8");
                        HttpGet httpPost = new HttpGet(params[0]);
                        HttpResponse httpResponse = httpClient.execute(httpPost);
                        HttpEntity httpEntity = httpResponse.getEntity();
                        xml = new String(EntityUtils.toString(httpEntity).getBytes(),"UTF-8");

                    } catch (Exception e) {
                        e.printStackTrace();
                    }

                    // Just to remove the BOM
                    xml = xml.substring(3);

                    // Here I am printing the XML, and the Arabic characters are malformed
                    Log.i("DEMO", xml);
                    return xml;

                }
            }.execute(url).get();
        } catch (InterruptedException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (ExecutionException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return xml;
    }

Please note that no error occurs and everything else works; only the Arabic characters are malformed.

Thanks for your help, but please be specific in your answer.

1 answer:

Answer 0: (score: 1)

xml = new String(EntityUtils.toString(httpEntity).getBytes(),"UTF-8");

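does not do what you want. EntityUtils.toString() decodes the entity with the default charset. The result then goes through getBytes(), which, called without an encoding, uses the platform encoding. Finally, new String(...) tries to read that byte[] as UTF-8 string bytes. On Android the platform encoding is UTF-8, so the last two steps only round-trip a string that was already decoded incorrectly.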
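The failure mode described above can be reproduced with the standard library alone. The following is a minimal sketch, not the code from the question: it assumes the first decode falls back to ISO-8859-1 (used here as a stand-in for the default charset), and the class name `MojibakeDemo`, its method names, and the sample word are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // Assumption: stands in for a decode that falls back to a default
    // single-byte charset instead of UTF-8.
    static String decodeAsLatin1(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Mirrors the getBytes() + new String(..., "UTF-8") step, assuming the
    // platform encoding is UTF-8 (as it is on Android).
    static String roundTripUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    // The fix: decode the raw bytes as UTF-8 in the first place.
    static String decodeAsUtf8(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u0639\u0631\u0628\u064A"; // a 4-letter Arabic word
        byte[] wire = original.getBytes(StandardCharsets.UTF_8); // 8 bytes on the wire

        String garbled = decodeAsLatin1(wire);        // 8 Latin-1 chars, not 4 Arabic ones
        String stillGarbled = roundTripUtf8(garbled); // the round trip changes nothing

        System.out.println(garbled.equals(stillGarbled));            // true
        System.out.println(original.equals(decodeAsUtf8(wire)));     // true
    }
}
```

The point of the sketch is that once the bytes have been decoded with the wrong charset, re-encoding and decoding as UTF-8 returns the same garbled string, so the charset has to be chosen at the first decode.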

You just need to call:

xml = EntityUtils.toString(httpEntity, "UTF-8");
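One side effect worth checking once the entity is decoded as UTF-8: the 3-byte BOM becomes a single character, U+FEFF, because Java's UTF-8 decoder keeps it. The `xml.substring(3)` in the question would then cut two real characters as well. A minimal sketch of a safer strip follows; the class `BomStripper` and the method `stripBom` are hypothetical names, and the sample bytes are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class BomStripper {
    // Removes a leading U+FEFF if present; otherwise returns the text unchanged.
    static String stripBom(String text) {
        if (text != null && !text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // UTF-8 BOM (EF BB BF) followed by "<a/>"
        byte[] wire = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        String decoded = new String(wire, StandardCharsets.UTF_8);

        System.out.println(decoded.length());   // 5: one BOM char plus four XML chars
        System.out.println(stripBom(decoded));  // <a/>
    }
}
```

Checking for the BOM instead of always cutting a fixed number of characters also keeps the code correct when the server stops sending one.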