I have a UTF-8 encoded XML document that contains some special Chinese characters, and I need to parse it.
DocumentBuilderFactory factory = DocumentBuilderFactory
.newInstance();
factory.setIgnoringElementContentWhitespace(true);
factory.setNamespaceAware(true);
factory.setValidating(true);
//byte[] buffer = xmlMsg.getBytes("UTF-16");
logger.info("transformToUTP " + xmlMsg);
//byte[] buffer = soapMessage.getBytes();
//ByteArrayInputStream stream = new ByteArrayInputStream(buffer);
InputSource is = new InputSource(new ByteArrayInputStream(
xmlMsg.getBytes("UTF-16")));
Document doc = factory.newDocumentBuilder().parse(is);
//Document doc = factory.newDocumentBuilder().parse(
//new InputSource(new StringReader(xmlMsg)));
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setNamespaceContext(getNameSpace());
XPathExpression soapBodyExpr = xpath.compile(BODY_XPATH_EXP);
Node soapBody = (Node) soapBodyExpr.evaluate(doc,
XPathConstants.NODE);
Node reqMsgNode = soapBody.getFirstChild();
I get a NullPointerException on the reqMsgNode line.
Answer 0 (score: 1)
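Don't convert the XML to a string. Parse it as it is, using
DocumentBuilder.parse(File)
or DocumentBuilder.parse(InputStream).
The parser takes the encoding from the XML declaration, e.g. <?xml version="1.0" encoding="UTF-8"?>,
and if the declaration is missing, it defaults to UTF-8.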
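A minimal sketch of that approach, in case it helps. The class name ParseUtf8Xml, the parse helper, and the sample payload are made up for illustration and are not from the question. The point is that the parser receives the raw bytes and works out the encoding itself, which sidesteps any mismatch between the bytes and the declared encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParseUtf8Xml {

    // Hypothetical helper: hands the raw bytes to the parser, which reads
    // the encoding from the XML declaration instead of relying on a String.
    public static Document parse(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample payload standing in for the real message.
        // \u4e2d\u6587 is two Chinese characters, escaped so the source
        // file compiles regardless of the compiler's default encoding.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<msg><name>\u4e2d\u6587</name></msg>";

        // In real code these bytes would come straight from a file or a
        // socket; they are built from a String here only to keep the
        // example self-contained.
        byte[] utf8 = xml.getBytes(StandardCharsets.UTF_8);

        Document doc = parse(new ByteArrayInputStream(utf8));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

If the message is only available as a String, wrapping it in new InputSource(new StringReader(xmlMsg)), as in the commented-out line in the question, also avoids the byte conversion, because the parser then reads characters and no decoding step is involved.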