I'm having a lot of trouble parsing a simple XML document encoded in UTF-16BE. The XML appears to be correct and is very simple:
<?xml version="1.0"?>
<ACTCDOC xmlns="http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd">
<BCARQ>
<NomArq>ACTC101_00360305_20140508_00010_PRO</NomArq>
<NumCtrlEmis>20140508000000000715</NumCtrlEmis>
<NumCtrlDestOr>10</NumCtrlDestOr>
<ISPBEmissor>02992335</ISPBEmissor>
<ISPBDestinatario>00360305</ISPBDestinatario>
<DtHrArq>2014-05-08T21:31:10</DtHrArq>
<DtRef>2014-05-08</DtRef>
</BCARQ>
</ACTCDOC>
I'm trying to parse it with the following code:
@SuppressWarnings("unchecked")
public static <T> T lerXML(Class<T> clazz, InputStream in) throws JAXBException {
JAXBContext jaxbContext = JAXBContext.newInstance(clazz);
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
T ret = (T) jaxbUnmarshaller.unmarshal(in);
return ret;
}
My domain classes:
@XmlRootElement(name="ACTCDOC", namespace="http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd")
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "ACTCDOCPROComplexType", propOrder = {
"bcarq"
})
public class ACTCDOCPROComplexType {
@XmlElement(name = "BCARQ", required = true)
protected BCARQComplexType bcarq;
... getter and setters
}
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "BCARQComplexType", propOrder = {
"nomArq",
"numCtrlEmis",
"numCtrlDestOr",
"ispbEmissor",
"ispbDestinatario",
"dtHrArq",
"sitReq",
"grupoSeq",
"dtRef"
})
public class BCARQComplexType {
@XmlElement(name = "NomArq", required = true)
protected String nomArq;
@XmlElement(name = "NumCtrlEmis", required = true)
protected String numCtrlEmis;
@XmlElement(name = "NumCtrlDestOr")
protected String numCtrlDestOr;
@XmlElement(name = "ISPBEmissor", required = true)
protected String ispbEmissor;
@XmlElement(name = "ISPBDestinatario", required = true)
protected String ispbDestinatario;
@XmlElement(name = "DtHrArq", required = true)
@XmlJavaTypeAdapter(DataHoraAdaptador.class)
protected XMLGregorianCalendar dtHrArq;
@XmlElement(name = "SitReq")
protected BigInteger sitReq;
@XmlElement(name = "Grupo_Seq")
protected GrupoSeqComplexType grupoSeq;
@XmlElement(name = "DtRef", required = true)
@XmlJavaTypeAdapter(DataAdaptador.class)
protected XMLGregorianCalendar dtRef;
.... getter and setters
}
When I parse the InputStream and print the object, the BCARQ element is null, as shown below:
ACTCDOCPROComplexType doc = XMLUtil.lerXML(ACTCDOCPROComplexType.class, is);
System.out.println(doc.getBCARQ());
Does JAXB work with UTF-16BE? I also tried another approach: wrapping the original InputStream in a Reader and converting UTF-16BE to UTF-8, but without success. The code is below:
public static <T> T lerXML(Class<T> clazz, InputStream in) throws JAXBException {
JAXBContext jaxbContext = JAXBContext.newInstance(clazz);
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
try {
StringBuilder buf = new StringBuilder();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
String linha;
while ((linha = reader.readLine()) != null) {
buf.append(linha+"\n");
}
CharsetDecoder decoder = Charset.forName("UTF-16BE").newDecoder();
ByteBuffer bytes = ByteBuffer.wrap(buf.toString().getBytes());
String xmlUTF8 = decoder.decode(bytes).toString();
ByteArrayInputStream bis = new ByteArrayInputStream(xmlUTF8.getBytes());
ret = (T) jaxbUnmarshaller.unmarshal(in);
return ret;
} catch (IOException e) {
throw new JAXBException(e);
}
}
But with this version I get the following error:
[org.xml.sax.SAXParseException: Premature end of file.]
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(Unknown Source)
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
at br.gov.caixa.sigec.util.XMLUtil.lerXML(XMLUtil.java:124)
at br.gov.caixa.sigec.negocio.preprocessador.PreProcessadorACTC101PRO.main(PreProcessadorACTC101PRO.java:99)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
... 6 more
Any ideas? Thanks.
Answer 0 (score: 1)
Your XML should declare its encoding in the header:
<?xml version="1.0" encoding="UTF-16BE"?>
If for some reason you cannot declare the encoding in the XML header, you can use a Reader instead. Try something like this:
InputStream inputStream = new FileInputStream("input.xml");
Reader reader = new InputStreamReader(inputStream, "UTF-16BE");
Object result = unmarshaller.unmarshal(reader);
Alternatively, try parsing the XML with a StAX XMLStreamReader and then have the Unmarshaller unmarshal from it.
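The StAX suggestion above could look roughly like the sketch below. It uses only classes from the JDK and leaves the JAXB step as a comment, so it runs without a JAXB dependency. The class name `StaxUtf16Sketch` and the helper `firstElementText` are made up for illustration, and the sample document is a shortened version of the one in the question. The stream is decoded through an `InputStreamReader` with an explicit charset, so the parser does not have to guess the encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxUtf16Sketch {

    // Hypothetical helper: returns the text of the first element with the
    // given local name, or null if no such element exists.
    static String firstElementText(InputStream in, String localName)
            throws XMLStreamException {
        // Decode the bytes explicitly as UTF-16BE before the parser sees them.
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_16BE);
        XMLStreamReader xsr = XMLInputFactory.newInstance().createXMLStreamReader(reader);
        try {
            // With JAXB on the classpath, the same XMLStreamReader could be
            // handed to Unmarshaller.unmarshal(XMLStreamReader) here instead.
            while (xsr.hasNext()) {
                if (xsr.next() == XMLStreamConstants.START_ELEMENT
                        && localName.equals(xsr.getLocalName())) {
                    return xsr.getElementText();
                }
            }
            return null;
        } finally {
            xsr.close();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
                + "<ACTCDOC xmlns=\"http://www.cip-bancos.org.br/ARQ/ACTC101PRO.xsd\">"
                + "<BCARQ><NumCtrlDestOr>10</NumCtrlDestOr></BCARQ></ACTCDOC>";
        // Encode without a byte order mark, as a UTF-16BE file might be.
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_16BE);
        System.out.println(firstElementText(new ByteArrayInputStream(bytes), "NumCtrlDestOr"));
    }
}
```

Because the parser receives characters rather than bytes, a missing `encoding` attribute in the XML declaration should not matter in this setup.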
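Answer 1 (score: 0)
Problem solved.
I don't know exactly what the problem was, but I changed the way I obtain the XML. I had been passing a ByteArrayInputStream(byte[]) to several parsers. Now I create just one ByteArrayInputStream and, for each parser, I call reset() on the input stream.
It works fine, but it's strange!
Thanks for the help and interest.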
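One possible explanation for the behaviour described in this answer is that an InputStream can only be read once. After one parser has read it to the end, the next parser finds no data, which is consistent with the "Premature end of file" error in the question. The sketch below illustrates this with plain JDK classes. The class `StreamReuseSketch` and the helper `drain` are hypothetical names that stand in for "a parser reading the stream"; this is not the original poster's code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StreamReuseSketch {

    // Hypothetical stand-in for a parser: reads the stream to the end and
    // returns how many bytes were still available to this consumer.
    static int drain(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "<?xml version=\"1.0\"?><a/>".getBytes(StandardCharsets.UTF_16BE);
        ByteArrayInputStream in = new ByteArrayInputStream(xml);

        int first = drain(in);   // the first consumer sees the whole document
        int second = drain(in);  // the second consumer sees an exhausted stream

        // ByteArrayInputStream supports reset(); with no mark set, it rewinds
        // to the start of the underlying array.
        in.reset();
        int third = drain(in);   // after reset() the document is readable again

        System.out.println(first + " " + second + " " + third);
    }
}
```

Not every InputStream supports `reset()`, so checking `markSupported()` or creating a fresh `ByteArrayInputStream` from the same byte array for each parser may be the safer choice.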