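I have run into the following problem with XML content being escaped when a DOM is transformed to a String.
I have an XML file that contains Base64-encoded data: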
<RootElement>
<SomeData />
<Target>PD94bWwgdmVyc2lvbj0iMS4wIitruncated=</Target>
</RootElement>
The content of the <Target> tag is another XML document that has simply been Base64-encoded. Once decoded, it has the following structure:
<Data>
<Info>Some valid text</Info>
</Data>
Now I need to replace the content of the <Target> tag with its decoded value:
// parse initial XML document that contains Base64-encoded data
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
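DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
// parse from the String directly so that no platform-default charset is involved
// (needs java.io.StringReader and org.xml.sax.InputSource)
org.w3c.dom.Document doc = docBuilder.parse(new InputSource(new StringReader(xml)));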
// create XPath expression in order to find target tag
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression xPathExpression = xpath.compile("RootElement/Target");
// find target node using XPath, get its content and decode it
NodeList nodes = (NodeList) xPathExpression.evaluate(doc, XPathConstants.NODESET);
Node targetNode = nodes.item(0);
String base64Content = targetNode.getTextContent();
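// java.util.Base64 (Java 8+) instead of the internal sun.misc.BASE64Decoder;
// the MIME decoder tolerates line breaks inside the encoded text
byte[] decodedBytes = Base64.getMimeDecoder().decode(base64Content);
// all is OK here, value is decoded correctly without escaping
// (UTF-8 is assumed for the decoded XML)
String decodedContent = new String(decodedBytes, StandardCharsets.UTF_8);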
// setting new decoded content (in XML format) to node that previously contained Base64 encoded content
targetNode.setTextContent(decodedContent);
// transforming DOM structure to java.lang.String
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(doc), new StreamResult(writer));
// creating String from StringWriter in order to check result
String output = writer.getBuffer().toString();
But after the transformation, the decoded data comes out escaped:
<RootElement>
<SomeData />
<Target>
&lt;Data&gt;
&lt;Info&gt;Some text&lt;/Info&gt;
&lt;/Data&gt;
</Target>
</RootElement>
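As far as I understand, the escaping comes from the DOM rather than from the Transformer: setTextContent stores the whole string as a single text node, so the serializer has to escape the markup characters to keep the output well-formed. The following minimal sketch reproduces this in isolation. The class name TextNodeEscaping and the serialize helper are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TextNodeEscaping {

    // Hypothetical helper: serializes a DOM document without the XML declaration.
    public static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<RootElement><Target/></RootElement>")));
        Node target = doc.getDocumentElement().getFirstChild();

        // The markup is stored as character data, not as child elements.
        target.setTextContent("<Data><Info>Some valid text</Info></Data>");

        // Check what kind of node <Target> now holds and how it is serialized.
        System.out.println(target.getFirstChild().getNodeType() == Node.TEXT_NODE);
        System.out.println(serialize(doc));
    }
}
```

If this reading is right, no output property on the Transformer is the natural fix, because the DOM itself contains text where elements are wanted.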
Is there a way to prevent this from happening during the transformation?
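One direction that might work, sketched here under assumptions rather than as a confirmed solution: parse the decoded bytes as a separate document and import its root element into the outer document as real nodes, instead of inserting the decoded string as text. The class name InlineDecodedXml and the method inlineTarget are hypothetical. The sketch also assumes that the decoded content is a complete, well-formed XML document and that there is exactly one <Target> element.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class InlineDecodedXml {

    // Hypothetical helper: replaces the Base64 text inside <Target>
    // with the decoded XML as element nodes and returns the serialized result.
    public static String inlineTarget(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Assumption: the first <Target> element is the one to rewrite.
        Node target = doc.getElementsByTagName("Target").item(0);
        byte[] decoded = Base64.getMimeDecoder().decode(target.getTextContent().trim());

        // Parse the decoded bytes as their own document; the parser reads the
        // encoding from the inner XML declaration if one is present.
        Document inner = builder.parse(new ByteArrayInputStream(decoded));

        // Drop the Base64 text, then attach a deep copy of the inner root element.
        while (target.hasChildNodes()) {
            target.removeChild(target.getFirstChild());
        }
        target.appendChild(doc.importNode(inner.getDocumentElement(), true));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        String innerXml = "<Data><Info>Some valid text</Info></Data>";
        String encoded = Base64.getEncoder()
                .encodeToString(innerXml.getBytes(StandardCharsets.UTF_8));
        String xml = "<RootElement><SomeData/><Target>" + encoded + "</Target></RootElement>";
        System.out.println(inlineTarget(xml));
    }
}
```

importNode is needed because a node created by one Document cannot be appended directly to another. Since only the root element is imported, an XML declaration in the decoded content does not end up inside <Target>.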