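I have an XML document that I parse with XMLStreamReader, but it contains some HTML entities (which are not part of the XML standard), such as accented characters, and these make the next() method throw an XMLStreamException: The entity "uacute" was referenced, but not declared.
I tried adding a custom entity resolver that implements XMLResolver (https://docs.oracle.com/javase/8/docs/api/javax/xml/stream/XMLResolver.html), whose documentation states:
If an application wishes to perform custom entity resolution it must register an instance of this interface with the XMLInputFactory using the setXMLResolver method.
OK, so I wrote this class to reproduce the error: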
import java.io.ByteArrayInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLResolver;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class Test {
    private void testXMLResolver() throws XMLStreamException {
        String xml = "<example>You know &oacute; is an accented character</example>";
        XMLInputFactory inputFactory = XMLInputFactory.newInstance(); // instantiate XMLInputFactory
        inputFactory.setXMLResolver(new MyEntityResolver()); // register the custom entity resolver
        XMLStreamReader xmlStreamReader = inputFactory.createXMLStreamReader(new ByteArrayInputStream(xml.getBytes())); // create an XMLStreamReader for the XML
        xmlStreamReader.next(); // reads <example>
        xmlStreamReader.next(); // reads the text inside the <example> tag
        System.out.println("Text is: " + xmlStreamReader.getText());
        xmlStreamReader.next(); // this call throws the XMLStreamException
    }

    class MyEntityResolver implements XMLResolver {
        @Override
        public Object resolveEntity(String publicID, String systemID, String baseURI, String namespace) throws XMLStreamException {
            return new ByteArrayInputStream("huehey!!".getBytes());
        }
    }
}
Running testXMLResolver() first prints:
Text is: You know
Then, when the last next() is executed, it throws this exception:
Exception in thread "main" javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,27]
Message: The entity "oacute" was referenced, but not declared.
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:601)
    at Test.testXMLResolver(Test.java:21)
    at Test.main(Test.java:10)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
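First: I don't understand why MyEntityResolver does not resolve the entity.
Second: why is the exception raised by the last next() and not by the previous one, where resolveEntity(...) should have been invoked? The text was already parsed by that previous call.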
PS: I made resolveEntity(...) return an InputStream because the documentation of that method states:
Retrieves a resource. This resource can be of the following three return types: (1) java.io.InputStream (2) javax.xml.stream.XMLStreamReader (3) java.xml.stream.XMLEventReader
Answer 0 (score: 0)
Isn't it because you have not declared &oacute; ("oacute")? Something like this might do it:
<!DOCTYPE definition [<!ENTITY oacute "ó">]>
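To make the suggestion concrete, here is a minimal sketch of the question's example with the entity declared in an internal DTD subset. The class name EntityDeclarationDemo and the helper readText are hypothetical, and several details are assumptions rather than part of the original code: the DOCTYPE is named after the root element, the entity value is written as the character reference &#243; so the source stays ASCII, a StringReader replaces xml.getBytes() to avoid depending on the platform encoding, and IS_COALESCING is enabled so the text is not split around the entity reference. It assumes the StAX implementation bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; not part of the original question.
final class EntityDeclarationDemo {

    // Same document as in the question, prefixed with a DOCTYPE that
    // declares the oacute entity (&#243; is the code point of "ó").
    static final String XML =
            "<!DOCTYPE example [<!ENTITY oacute \"&#243;\">]>"
            + "<example>You know &oacute; is an accented character</example>";

    // Returns the text content of the first element in the document.
    static String readText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Set explicitly so the sketch does not rely on implementation defaults.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        // Merge adjacent character data into a single event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            // Skip the DTD event (and anything else) before the root element.
            while (reader.next() != XMLStreamConstants.START_ELEMENT) {
                // keep advancing
            }
            // Collects all character data up to the matching end tag.
            return reader.getElementText();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println("Text is: " + readText(XML));
    }
}
```

This only helps when the declaration can be added to the document itself (or to a DTD it references); an entity that is never declared anywhere still triggers the "was referenced, but not declared" error.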