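After upgrading from Java 1.2 and the Apache Xerces DOMParser to Java 1.7 and the Xerces JAXP DocumentBuilder, parsing completes without errors but does not "unwrap" CDATA elements, even though the DocumentBuilderFactory is initialized with "setCoalescing(true);".
That is, input XML elements such as
<ITEMDESC><![CDATA[ Sales Bom Material,Dist]]></ITEMDESC>
are returned unmodified.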
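The code is shown below.
I'm new to XML parsing, so I may well be missing something very basic.
Our input XML actually has hundreds of different tags, so we would like a solution that doesn't require changing the "get" for every element.
Are there any other requirements/hints/tips/tricks for getting "setCoalescing(true);" to work?
Thanks in advance for any advice.
Code: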
DocumentBuilderFactory aDocBuilderFactory = DocumentBuilderFactory.newInstance();
aDocBuilderFactory.setValidating(m_dtdValidate);
// Ask the parser to convert CDATA sections to text and merge them with adjacent text into a single text node
aDocBuilderFactory.setCoalescing(true);
// Make sure that entity references are expanded, this includes the replacements for the reserved markup
// characters
aDocBuilderFactory.setExpandEntityReferences(true);
// Ignore comments as they won't contain information to be processed
aDocBuilderFactory.setIgnoringComments(true);
// Get a document builder
m_documentBuilder = aDocBuilderFactory.newDocumentBuilder();
// Install entity resolver if required
m_documentBuilder.setEntityResolver(new DocumentEntityResolver());
m_document = m_documentBuilder.parse(pSource);
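
For reference, here is a minimal, self-contained sketch of how the coalescing behaviour could be checked in isolation, using the sample element from above. The class name CoalescingCheck and the helper parseFirstChild are hypothetical and not part of our code; the sketch assumes the JDK's default JAXP implementation and leaves out our validation, entity resolver and other settings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CoalescingCheck {
    // Sample input taken from the question
    static final String XML =
            "<ITEMDESC><![CDATA[ Sales Bom Material,Dist]]></ITEMDESC>";

    // Parses XML with the given coalescing setting and returns the
    // first child node of the root element
    public static Node parseFirstChild(boolean coalescing) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setCoalescing(coalescing);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getFirstChild();
    }

    public static void main(String[] args) throws Exception {
        // Show which factory implementation is actually picked up
        System.out.println("factory: "
                + DocumentBuilderFactory.newInstance().getClass().getName());
        for (boolean coalescing : new boolean[] {false, true}) {
            Node child = parseFirstChild(coalescing);
            boolean isCdata = child.getNodeType() == Node.CDATA_SECTION_NODE;
            System.out.println("coalescing=" + coalescing
                    + " cdataNode=" + isCdata
                    + " value=[" + child.getNodeValue() + "]");
        }
    }
}
```

If the coalescing=true line reports a plain text node here while our real input still comes back with the CDATA markup, the difference presumably lies outside the factory setting itself, for example in which parser implementation is resolved at runtime or in how the result is read or serialized afterwards. I have not verified either.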