Unable to read Arabic text from a URL in Java

Time: 2013-12-04 12:32:40

Tags: java http arabic

I have a web service that prints the following text.

    [{"packid":"p101","title":"صفته 1","description":"شسیب: 1\r\nثق س: 50","linkfuntext":"funtext","linkshortstory":"short","linkfunpic":"pic","linkringtone":"ring","linkfungif":"gif","linkwallpaper":"wall","price":"500","buyid":"pack.fun.1","buyed":""},{"packid":"p102","title":"بسته صدا","description":" متن ها: 50\r\nصداها: 120\r\nتصاویر: 100\r\nتصاویر متحرک: 50\r\nداستان کوتاه: 20","linkfuntext":"","linkshortstory":"","linkfunpic":"","linkringtone":"","linkfungif":"","linkwallpaper":"","price":"1200","buyid":"fun.pack.2","buyed":""}]

When I try to read it in Java, I receive it in the following format:

[{"packid":"p101","title":"صفته 1","description":"شسیب: 1\r\nثق س: 50","linkfuntext":"funtext","linkshortstory":"short","linkfunpic":"pic","linkringtone":"ring","linkfungif":"gif","linkwallpaper":"wall","price":"500","buyid":"pack.fun.1","buyed":""},{"packid":"p102","title":"بسته صدا","description":" متن ها: 50\r\nصداها: 120\r\nتصاویر: 100\r\nتصاویر متحرک: 50\r\nداستان کوتاه: 20","linkfuntext":"","linkshortstory":"","linkfunpic":"","linkringtone":"","linkfungif":"","linkwallpaper":"","price":"1200","buyid":"fun.pack.2","buyed":""}]

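I tried changing the charset to UTF-8 and also to ISO-8859-6, but still no luck. When I print the text on the console it is printed correctly, which means the problem is not in the charset of Eclipse or the console. I also tried changing the charset of the string that stores the text, but the problem is the same.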

String serverOutput = new String(TEXT_FROM_SERVER.getBytes(), "UTF-8"); 

Here is the code I use to read the output from the web service:

 HttpEntity entity = response.getEntity();    
 InputStream is = entity.getContent();
 String serverOutput = convertStreamToString(is);

 private String convertStreamToString(InputStream is) {
        Reader rd = null;
        BufferedReader reader = null;
        try { 

            rd = new InputStreamReader(is,"UTF-8");

        } catch (UnsupportedEncodingException e1) {
            // TODO Auto-generated catch block
            e1.printStackTrace();
        }
        reader = new BufferedReader(rd);
        StringBuilder sb = new StringBuilder();

        String line = null;
        try {
            while ((line = reader.readLine()) != null) {
                sb.append((line + "\n"));
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                is.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return sb.toString();
    }

Any kind of help would be appreciated. Thanks.

1 answer:

Answer 0 (score: 0)

You need to unescape those HTML characters. You can do that with a method called unescapeHtml in Apache Commons Lang. More info here

Example:

String afterDecoding = StringEscapeUtils.unescapeHtml(beforeDecoding);
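
If pulling in Commons Lang is not an option, the same idea can be sketched with the JDK alone. This assumes the server is sending numeric character references such as `&#1589;` (the thread does not show the raw bytes, so that is a guess based on the advice above). The class name `EntityDecoder` and the method `decodeNumericEntities` are hypothetical names made up for this illustration, and named entities such as `&amp;` are deliberately left untouched. Note also that in Commons Lang 3 the method is called `unescapeHtml4` rather than `unescapeHtml`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: decodes numeric HTML
// character references only (decimal &#1589; and hexadecimal &#x635;).
public class EntityDecoder {

    private static final Pattern NUMERIC_ENTITY =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    public static String decodeNumericEntities(String input) {
        Matcher m = NUMERIC_ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the literal text between the previous match and this one.
            out.append(input, last, m.start());
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                // Out-of-range reference: keep it exactly as received.
                out.append(m.group());
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed shape of the escaped server response (illustrative only).
        String beforeDecoding = "{\"title\":\"&#1589;&#1601;&#1578;&#1607; 1\"}";
        System.out.println(decodeNumericEntities(beforeDecoding));
    }
}
```

Ideally the web service would send raw UTF-8 instead of escaping the text, but if you do not control the server, decoding on the client like this (or with Commons Lang) should be enough.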