Question

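I am trying to parse an XML file that contains Arabic characters into a DOM and read its UTF-8 elements. The method below takes the XML as a string and should return a Document.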

Here is the link to the XML:

http://212.12.165.44:7201/UniNews121.xml

public Document getDomElement(String xml){

    Document doc = null;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

    try {

        DocumentBuilder db = dbf.newDocumentBuilder();
        InputSource is = new InputSource();
        StringReader xmlstring=new StringReader(xml);
        is.setCharacterStream(xmlstring);
        is.setEncoding("UTF-8");
                    //APP CRASHES HERE
        doc = db.parse(is); 

    } catch (ParserConfigurationException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    } catch (SAXException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    } catch (IOException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    }
    // return DOM
    return doc;
}

Error:

09-18 13:36:20.031: E/Error:(3846): Unexpected token (position:TEXT xml version="1.0...@2:1 in java.io.InputStreamReader@4144ac08)

Thanks for your help, but please be specific in your answers.

Answer 1

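This happens a lot. Double-check the encoding of the file you are opening. I suggest testing with a local copy of the file whose encoding you have set by hand.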
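
To make the encoding check concrete, here is a minimal sketch of one way to rule encoding problems out. It assumes you can get at the raw bytes of the response instead of an already-decoded String. The class name XmlDomSketch and the method parseBytes are hypothetical, not part of the question's code. Two things it relies on: InputSource.setEncoding has no effect once a character stream is set, so the call in the question does nothing; and a byte order mark or stray whitespace in front of the <?xml declaration is a possible (not confirmed) cause of an "Unexpected token" error, so the sketch skips both and lets the parser read the encoding from the declaration itself.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XmlDomSketch {

    // Hypothetical helper: parses raw XML bytes rather than a decoded String,
    // so the parser can pick the encoding from the XML declaration.
    public static Document parseBytes(byte[] raw) throws Exception {
        int offset = 0;

        // Skip a UTF-8 byte order mark (EF BB BF) if one is present.
        if (raw.length >= 3
                && (raw[0] & 0xFF) == 0xEF
                && (raw[1] & 0xFF) == 0xBB
                && (raw[2] & 0xFF) == 0xBF) {
            offset = 3;
        }

        // Skip spaces, tabs and line breaks in front of the declaration.
        while (offset < raw.length
                && (raw[offset] == ' ' || raw[offset] == '\t'
                    || raw[offset] == '\r' || raw[offset] == '\n')) {
            offset++;
        }

        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // A byte stream (not a character stream) leaves encoding detection to the parser.
        InputSource is = new InputSource(
                new ByteArrayInputStream(raw, offset, raw.length - offset));
        return db.parse(is);
    }
}
```

If this parses a local copy of the file but your original method still fails, the String you pass in was probably decoded with the wrong charset, or picked up extra leading characters, before it reached getDomElement.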

Error forming DOM element from XML parsed String
