Custom entity resolver for XMLStreamReader does not work

Date: 2017-07-07 12:49:02

Tags: java xml xml-parsing

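I have an XML document that I parse with XMLStreamReader, but it contains some HTML entities (not part of the XML standard), such as those for accented characters, which make the next() method throw XMLStreamException: The entity "uacute" was referenced, but not declared.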

I tried adding a custom entity resolver that implements XMLResolver (https://docs.oracle.com/javase/8/docs/api/javax/xml/stream/XMLResolver.html), whose documentation states:

  

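If an application wishes to perform custom entity resolution it must register an instance of this interface with the XMLInputFactory using the setXMLResolver method.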

OK, so I wrote this class to reproduce the error:

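import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLResolver;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class Test {

    public static void main(String[] args) throws XMLStreamException {
        new Test().testXMLResolver();
    }

    private void testXMLResolver() throws XMLStreamException {
        String xml = "<example>You know &oacute; is an accented character</example>";
        XMLInputFactory inputFactory = XMLInputFactory.newInstance(); // instantiate XMLInputFactory
        inputFactory.setXMLResolver(new MyEntityResolver()); // register the custom entity resolver
        XMLStreamReader xmlStreamReader = inputFactory.createXMLStreamReader(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))); // create XMLStreamReader for the xml
        xmlStreamReader.next(); // reads <example>
        xmlStreamReader.next(); // reads the text inside the <example> tag, up to the entity reference
        System.out.println("Text is: " + xmlStreamReader.getText());
        xmlStreamReader.next(); // throws XMLStreamException at &oacute;
    }

    class MyEntityResolver implements XMLResolver {
        @Override
        public Object resolveEntity(String publicID, String systemID, String baseURI, String namespace) throws XMLStreamException {
            return new ByteArrayInputStream("huehey!!".getBytes(StandardCharsets.UTF_8));
        }
    }
}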

Running testXMLResolver() first prints:

  

Text is: You know

Then, when the last next() is executed, it throws this exception:

Exception in thread "main" javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,27]
Message: The entity "oacute" was referenced, but not declared.
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:601)
    at Test.testXMLResolver(Test.java:21)
    at Test.main(Test.java:10)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)

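First: I don't know why MyEntityResolver is not resolving the entity.

Second: why is the exception raised in the last next() rather than earlier, in resolveEntity(...) during the previous call? The text was already parsed by that previous call.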

PS: I return an InputStream because the method's documentation states:

  

Retrieves a resource. This resource can be of the following three return types: (1) java.io.InputStream (2) javax.xml.stream.XMLStreamReader (3) java.xml.stream.XMLEventReader

1 Answer:

Answer 0: (score: 0)

Isn't it because you haven't declared ó ("oacute")? Something like this might work:

<!DOCTYPE definition [<!ENTITY oacute "&#243;">]>
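As a rough illustration of the suggestion above, here is a minimal sketch that puts an internal DOCTYPE declaring oacute in front of the question's sample document and reads the text with the JDK's default StAX implementation. The class name EntityDeclarationSketch and the helper readText are hypothetical, and the DOCTYPE name is changed to example to match the root element. Because the parser may deliver the text in several CHARACTERS events around the entity, the sketch concatenates them rather than relying on a single getText() call.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of the original question.
class EntityDeclarationSketch {

    // Concatenates the content of every CHARACTERS event in the document.
    static String readText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                // Text around an entity reference may arrive in several chunks.
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // The internal DTD subset declares the entity the parser complained about.
        String xml = "<!DOCTYPE example [<!ENTITY oacute \"&#243;\">]>"
                + "<example>You know &oacute; is an accented character</example>";
        System.out.println("Text is: " + readText(xml));
    }
}
```

With the declaration in place, the reference should be expanded to ó and no XMLResolver is involved. This assumes you are able to prepend a DOCTYPE to the input before parsing, which may not be practical for every source.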