Java XML unmarshalling with JAXB fails on the ampersand (&)

Time: 2010-06-08 16:09:53

Tags: java xml jaxb unmarshalling

I have the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<details>
  ...
  <address1>Test&amp;Address</address1>
  ...
</details>

When I try to unmarshal it with JAXB, it throws the following exception:

Caused by: org.xml.sax.SAXParseException: The reference to entity "Address" must end with the ';' delimiter.
        at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
        at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:194)

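But when I change &amp; to &apos; in the XML, it works fine. The problem seems to occur only with the ampersand entity &amp;, and I cannot understand why.

The unmarshalling code is: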

JAXBContext context = JAXBContext.newInstance("some.package.name", this.getClass().getClassLoader());
Unmarshaller unmarshaller = context.createUnmarshaller();
obj = unmarshaller.unmarshal(new StringReader(xml));

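Does anyone have any insight?

Edit: I tried the solution suggested by @abhin4v below (i.e. adding a space after &amp;), but it does not seem to work either. Here is the stack trace: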

Caused by: org.xml.sax.SAXParseException: The entity name must immediately follow the '&' in the entity reference.
        at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
        at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:194)

3 answers:

Answer 0: (score: 3)

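Xerces converts &amp; to & and then tries to parse &Address, which fails because it does not end with a ;. I first suggested putting a space between & and Address, but that does not work: Xerces then tries to parse the bare & and throws the second error quoted in the question. Instead, you can wrap the text in a CDATA section, and Xerces will not try to parse entities inside it.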
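
A minimal sketch of the CDATA idea. It uses the JDK's built-in DOM parser instead of JAXB so that it runs without extra dependencies; that swap is an assumption made for illustration, on the basis that the failure in the stack trace happens in the underlying XML parser. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataSketch {
    // Parses the XML and returns the text of the first <address1> element.
    static String address1(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("address1").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Inside CDATA the bare ampersand is plain character data, not the start of an entity.
        String xml = "<details><address1><![CDATA[Test&Address]]></address1></details>";
        System.out.println(address1(xml)); // prints Test&Address
    }
}
```

The sender has to produce the CDATA section, and the wrapped text must not contain the sequence ]]>, which would end the section early.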

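Answer 1: (score: 3)

I ran into this as well. On the first pass I simply replaced each &amp; with a token string (AMPERSAND_TOKEN), sent the XML through JAXB, and then substituted the ampersand back in. Not ideal, but it was a quick fix.

On the second pass I made a lot of major changes, so I am not sure exactly what fixed the problem. I suspect that giving JAXB access to the HTML DTDs made it much happier, but that is only a guess and may be specific to my project.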
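
The token substitution could look roughly like the sketch below. It is only an illustration: the helper names are hypothetical, the JDK's DOM parser stands in for JAXB so the example runs without extra dependencies, and it assumes the token never occurs in real data. Where in the pipeline the substitution has to happen depends on what is mangling the input.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class AmpersandTokenSketch {
    // Placeholder named in the answer; assumed never to appear in real data.
    static final String AMPERSAND_TOKEN = "AMPERSAND_TOKEN";

    // Swap every escaped ampersand for the token before the XML is processed.
    static String hideAmpersands(String xml) {
        return xml.replace("&amp;", AMPERSAND_TOKEN);
    }

    // Put a literal ampersand back into a value extracted after parsing.
    static String restoreAmpersands(String value) {
        return value.replace(AMPERSAND_TOKEN, "&");
    }

    // Parses the XML and returns the text of the first <address1> element.
    static String address1(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("address1").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<details><address1>Test&amp;Address</address1></details>";
        String parsed = address1(hideAmpersands(xml)); // TestAMPERSAND_TOKENAddress
        System.out.println(restoreAmpersands(parsed)); // prints Test&Address
    }
}
```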

HTH

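Answer 2: (score: 1)

It turns out the problem was caused by the framework I am using (the Mentawai framework). The XML in question comes from the POST body of an HTTP request.

Apparently the framework converts character entities in the XML body, so &amp; becomes &, and the unmarshaller then cannot unmarshal the XML.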
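
To see why the decoded body fails, the following sketch checks three variants of the element with the JDK's built-in parser. It is an illustration only: JAXB is left out so that the example is self-contained, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class DecodedEntitySketch {
    // Returns true if the parser accepts the XML, false if it reports a parse error.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException("unexpected parser failure", e);
        }
    }

    public static void main(String[] args) {
        // What the client sent: the ampersand is escaped.
        String escaped = "<details><address1>Test&amp;Address</address1></details>";
        // What reaches the unmarshaller once the entity has been decoded upstream.
        String decoded = "<details><address1>Test&Address</address1></details>";
        // The "add a space" variant from the edit to the question.
        String spaced = "<details><address1>Test& Address</address1></details>";

        System.out.println(isWellFormed(escaped)); // true
        System.out.println(isWellFormed(decoded)); // false
        System.out.println(isWellFormed(spaced));  // false
    }
}
```

The escaped form is well-formed, so the parser itself handles &amp; correctly; both failures only appear once something upstream has already turned &amp; into a bare &.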