JAXB: null element and "Premature end of file" with UTF-16BE

Time: 2014-05-12 14:41:34

Tags: java xml encoding utf-8 jaxb

I'm having a big problem parsing a simple XML document encoded in UTF-16BE. The XML appears to be correct and is very simple:

<?xml version="1.0"?>
<ACTCDOC xmlns="http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd">
    <BCARQ>
        <NomArq>ACTC101_00360305_20140508_00010_PRO</NomArq>
        <NumCtrlEmis>20140508000000000715</NumCtrlEmis>
        <NumCtrlDestOr>10</NumCtrlDestOr>
        <ISPBEmissor>02992335</ISPBEmissor>
        <ISPBDestinatario>00360305</ISPBDestinatario>
        <DtHrArq>2014-05-08T21:31:10</DtHrArq>
        <DtRef>2014-05-08</DtRef>
    </BCARQ>
</ACTCDOC>

I'm trying to parse it with the following code:

@SuppressWarnings("unchecked")
public static <T> T lerXML(Class<T> clazz, InputStream in) throws JAXBException {
    JAXBContext jaxbContext = JAXBContext.newInstance(clazz);
    Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();

    T ret = (T) jaxbUnmarshaller.unmarshal(in);

    return ret;
}

My domain classes:

@XmlRootElement(name="ACTCDOC", namespace="http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd")
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "ACTCDOCPROComplexType", propOrder = {
    "bcarq"
})
public class ACTCDOCPROComplexType {

    @XmlElement(name = "BCARQ", required = true)
    protected BCARQComplexType bcarq;

    // ... getters and setters
}

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "BCARQComplexType", propOrder = {
    "nomArq",
    "numCtrlEmis",
    "numCtrlDestOr",
    "ispbEmissor",
    "ispbDestinatario",
    "dtHrArq",
    "sitReq",
    "grupoSeq",
    "dtRef"
})
public class BCARQComplexType {

    @XmlElement(name = "NomArq", required = true)
    protected String nomArq;
    @XmlElement(name = "NumCtrlEmis", required = true)
    protected String numCtrlEmis;
    @XmlElement(name = "NumCtrlDestOr")
    protected String numCtrlDestOr;
    @XmlElement(name = "ISPBEmissor", required = true)
    protected String ispbEmissor;
    @XmlElement(name = "ISPBDestinatario", required = true)
    protected String ispbDestinatario;
    @XmlElement(name = "DtHrArq", required = true)
    @XmlJavaTypeAdapter(DataHoraAdaptador.class)
    protected XMLGregorianCalendar dtHrArq;
    @XmlElement(name = "SitReq")
    protected BigInteger sitReq;
    @XmlElement(name = "Grupo_Seq")
    protected GrupoSeqComplexType grupoSeq;
    @XmlElement(name = "DtRef", required = true)
    @XmlJavaTypeAdapter(DataAdaptador.class)
    protected XMLGregorianCalendar dtRef;

    // ... getters and setters
}

When I parse the InputStream and print the object, the BCARQ element is null, as shown below:

ACTCDOCPROComplexType doc = XMLUtil.lerXML(ACTCDOCPROComplexType.class, is);

System.out.println(doc.getBCARQ());

Does JAXB work with UTF-16BE? I tried another approach: wrapping the original InputStream in a Reader and converting UTF-16BE to UTF-8, but without success. The code is below:

public static <T> T lerXML(Class<T> clazz, InputStream in) throws JAXBException {
    JAXBContext jaxbContext = JAXBContext.newInstance(clazz);
    Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();

    try {
        StringBuilder buf = new StringBuilder();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        String linha;
        while ((linha = reader.readLine()) != null) {
            buf.append(linha+"\n");
        }

        CharsetDecoder decoder = Charset.forName("UTF-16BE").newDecoder();
        ByteBuffer bytes = ByteBuffer.wrap(buf.toString().getBytes());
        String xmlUTF8 = decoder.decode(bytes).toString();

        ByteArrayInputStream bis = new ByteArrayInputStream(xmlUTF8.getBytes());
        ret = (T) jaxbUnmarshaller.unmarshal(in);

        return ret;
    } catch (IOException e) {
        throw new JAXBException(e);
    }
}

But with this version I get the following error:

[org.xml.sax.SAXParseException: Premature end of file.]
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(Unknown Source)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
    at br.gov.caixa.sigec.util.XMLUtil.lerXML(XMLUtil.java:124)
    at br.gov.caixa.sigec.negocio.preprocessador.PreProcessadorACTC101PRO.main(PreProcessadorACTC101PRO.java:99)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    ... 6 more

Any ideas? Thanks.

2 answers:

Answer 0 (score: 1)

Your XML should declare its encoding in the header.

<?xml version="1.0" encoding="UTF-16BE"?>

If for some reason you can't add the encoding to the XML header, you can try using a Reader, like this:

InputStream inputStream = new FileInputStream("input.xml");
Reader reader = new InputStreamReader(inputStream, "UTF-16BE");
Object result = unmarshaller.unmarshal(reader);
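To see the decoding step in isolation, here is a minimal sketch that uses only the JDK (no JAXB), so it runs on its own. The class name Utf16BeReaderSketch and the helper decodeUtf16Be are made up for illustration. It shows that a Reader created with an explicit UTF-16BE charset returns the original text, whereas the question's `new InputStreamReader(in)` falls back to the platform default charset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, for illustration only.
public class Utf16BeReaderSketch {

    // Reads the whole stream as text, decoding the bytes as UTF-16BE.
    public static String decodeUtf16Be(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_16BE);
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String xml = "<?xml version=\"1.0\"?><NomArq>ACTC101</NomArq>";
        // Simulate a UTF-16BE file without a BOM.
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_16BE);
        String decoded = decodeUtf16Be(new ByteArrayInputStream(bytes));
        System.out.println(decoded.equals(xml)); // true
    }
}
```

Passing such a Reader to the unmarshaller means the parser receives characters that are already decoded, so it presumably no longer has to work out the byte encoding itself.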

Alternatively, try parsing the XML with a StAX XMLStreamReader and then let the Unmarshaller unmarshal from it.
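A rough sketch of the StAX route, again JDK-only so it is runnable as-is. The class name StaxUtf16BeSketch and the helper firstElementText are hypothetical, and the document is a shortened version of the one in the question. The JAXB step is left out here; with JAXB on the classpath, the same XMLStreamReader could be handed to Unmarshaller.unmarshal(XMLStreamReader) instead of being walked by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class name, for illustration only.
public class StaxUtf16BeSketch {

    // Returns the text of the first element with the given local name, or null if absent.
    public static String firstElementText(byte[] utf16BeBytes, String localName)
            throws XMLStreamException {
        // Decode explicitly, so the parser never has to detect the encoding.
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf16BeBytes), StandardCharsets.UTF_16BE);
        XMLStreamReader xsr = XMLInputFactory.newInstance().createXMLStreamReader(reader);
        try {
            while (xsr.hasNext()) {
                if (xsr.next() == XMLStreamConstants.START_ELEMENT
                        && localName.equals(xsr.getLocalName())) {
                    return xsr.getElementText();
                }
            }
            return null;
        } finally {
            xsr.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<?xml version=\"1.0\"?>"
                + "<ACTCDOC xmlns=\"http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd\">"
                + "<BCARQ><NomArq>ACTC101_00360305_20140508_00010_PRO</NomArq></BCARQ>"
                + "</ACTCDOC>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_16BE);
        System.out.println(firstElementText(bytes, "NomArq"));
    }
}
```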

Answer 1 (score: 0)

Problem solved.

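I don't know exactly what the problem was, but I changed the way I obtain the XML. Previously, some parsers were each handed a ByteArrayInputStream(byte[]). Now I create a single ByteArrayInputStream and call reset() on it before each parser.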
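For anyone wondering why reset() matters, here is a small JDK-only sketch (the class name StreamResetSketch and the helper drain are made up for illustration). Once a stream has been read to the end, a second consumer gets no bytes at all, and an XML parser handed an empty input typically reports "Premature end of file". That would also be consistent with the second lerXML in the question, which reads `in` completely and then passes the same `in` to unmarshal, though this is a guess about the cause rather than something confirmed above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class name, for illustration only.
public class StreamResetSketch {

    // Reads the stream to its end and returns how many bytes were available.
    public static int drain(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});

        System.out.println(drain(in)); // 4: the first consumer sees everything
        System.out.println(drain(in)); // 0: the stream is already exhausted

        // With no mark set, reset() rewinds a ByteArrayInputStream to its start.
        in.reset();
        System.out.println(drain(in)); // 4: the next consumer sees the data again
    }
}
```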

It works fine now, but it's strange!

Thanks for your help and interest.