How do I read an InputStream as UTF-8?

Asked: 2012-07-22 18:57:34

Tags: java xml utf-8 inputstream

Hello all,

I am developing a Java application that calls a PHP script over the Internet, and the script returns an XML response.

The response contains the word "Próximo", but when I parse the XML nodes and convert the response to a String variable, I get the word like this: "Pr&oacute;ximo".

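I am sure the problem is that my Java application uses a different encoding than the PHP script. So I think I have to set the encoding to the same one the PHP XML uses, UTF-8.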

This is the code I use to parse the XML returned by the PHP script.

What should I change in this code to set the encoding to UTF-8? (Note that I am not using a BufferedReader; I am working with the InputStream directly.)

        InputStream in = null;
        String url = "http://www.myurl.com";
        try {
            // Open the HTTP connection and issue a GET request
            URL formattedUrl = new URL(url);
            URLConnection connection = formattedUrl.openConnection();
            HttpURLConnection httpConnection = (HttpURLConnection) connection;
            httpConnection.setAllowUserInteraction(false);
            httpConnection.setInstanceFollowRedirects(true);
            httpConnection.setRequestMethod("GET");
            httpConnection.connect();
            if (httpConnection.getResponseCode() == HttpURLConnection.HTTP_OK)
                in = httpConnection.getInputStream();

            // Parse the response body as XML and collect the nodes of interest
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(in);
            doc.getDocumentElement().normalize();
            NodeList myNodes = doc.getElementsByTagName("myNode");
        } catch (Exception e) {
            e.printStackTrace();
        }

1 answer:

Answer 0 (score: 7)

When you read a byte[] from the InputStream and create a String from it, pass "UTF-8" as the charset. For example:

byte[] buffer = new byte[contentLength];
int bytesRead = inputStream.read(buffer);
String page = new String(buffer, 0, bytesRead, "UTF-8");

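Note that you will probably want a reasonably sized buffer (such as 1024 bytes) and to call inputStream.read(buffer) repeatedly until the stream is exhausted.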
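
The repeated-read approach described above can be sketched as follows. This is a minimal sketch, not the answerer's exact code: the class name Utf8StreamReader and the helper readUtf8 are hypothetical, and it assumes Java 7 or later for StandardCharsets. The bytes are collected first and decoded once at the end, because decoding each 1024-byte chunk separately could split a multi-byte UTF-8 character across two chunks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class Utf8StreamReader {

    // Hypothetical helper: drains the stream, then decodes all bytes as UTF-8.
    public static String readUtf8(InputStream inputStream) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int bytesRead;
        // read() returns -1 once the end of the stream is reached
        while ((bytesRead = inputStream.read(buffer)) != -1) {
            collected.write(buffer, 0, bytesRead);
        }
        // Decode once, so multi-byte characters are never cut in half
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Sample bytes standing in for the HTTP response body
        byte[] sample = "<myNode>Pr\u00f3ximo</myNode>".getBytes(StandardCharsets.UTF_8);
        String page = readUtf8(new ByteArrayInputStream(sample));
        System.out.println(page.contains("Pr\u00f3ximo"));
    }
}
```

In the question's code, the stream returned by httpConnection.getInputStream() would take the place of the ByteArrayInputStream used here.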


@Amir Pashazadeh

Yes, you can also use an InputStreamReader. Try changing the parse() line to:

Document doc = db.parse(new InputSource(new InputStreamReader(in, "UTF-8")));
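
For context, here is a self-contained sketch of that InputStreamReader variant. The class name Utf8XmlParse, the helper parseUtf8, and the sample XML are assumptions made for illustration, and an in-memory stream stands in for the HTTP response. When an InputSource carries a character stream, the parser reads those characters directly, so the charset given to the InputStreamReader decides how the bytes are decoded.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Utf8XmlParse {

    // Hypothetical helper: parses XML from a byte stream, decoding it as UTF-8.
    public static Document parseUtf8(InputStream in) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Wrap the bytes in a Reader so the decoding charset is explicit
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        return db.parse(new InputSource(reader));
    }

    public static void main(String[] args) throws Exception {
        // Sample bytes standing in for the PHP script's XML response
        byte[] xml = "<root><myNode>Pr\u00f3ximo</myNode></root>"
                .getBytes(StandardCharsets.UTF_8);
        Document doc = parseUtf8(new ByteArrayInputStream(xml));
        String text = doc.getElementsByTagName("myNode").item(0).getTextContent();
        System.out.println(text.equals("Pr\u00f3ximo"));
    }
}
```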