XML - Prevent escaping when converting from DOM to String

Time: 2017-04-19 08:53:52

Tags: java xml-parsing

I have run into the following problem: XML content gets escaped when I convert a DOM to a String.

There is an XML file that contains Base64-encoded data:

<RootElement>
    <SomeData />
    <Target>PD94bWwgdmVyc2lvbj0iMS4wIitruncated=</Target>
</RootElement>

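The content of the Target tag is another XML document that has simply been encoded as Base64. After decoding, it becomes the following XML structure: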

<Data>
    <Info>Some valid text</Info>
</Data>

Now I need to replace the content of the Target tag with its decoded value:

// parse initial XML document that contains Base64-encoded data
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = null;
docBuilder = docBuilderFactory.newDocumentBuilder();
org.w3c.dom.Document doc = docBuilder.parse(new ByteArrayInputStream((xml.getBytes())));

// create XPath expression in order to find target tag
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression xPathExpression = xpath.compile("RootElement/Target");

// find target node using XPath, get its content and decode it
NodeList nodes = (NodeList) xPathExpression.evaluate(doc, XPathConstants.NODESET);
Node targetNode = nodes.item(0);
String base64Content = targetNode.getTextContent();
BASE64Decoder base64Decoder = new BASE64Decoder();
byte[] decodedBytes = base64Decoder.decodeBuffer(base64Content);

// all is OK here, value is decoded correctly without escaping
String decodedContent = new String(decodedBytes);
// setting new decoded content (in XML format) to node that previously contained Base64 encoded content
targetNode.setTextContent(decodedContent); 

// transforming DOM structure to java.lang.String
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(doc), new StreamResult(writer));

// creating String from StringWriter in order to check result
String output = writer.getBuffer().toString();

But after the transformation, the decoded data comes out escaped:

<RootElement>
  <SomeData />
  <Target>
&lt;Data&gt;
  &lt;Info&gt;Some text&lt;/Info&gt;
&lt;/Data&gt;
  </Target>
</RootElement>

Is there a way to prevent this from happening during the transformation?
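The escaping most likely comes from setTextContent, which stores the decoded string as a single text node, so the serializer has to escape the angle brackets to keep the document well-formed. One possible way around it, sketched below, is to parse the decoded bytes into a second document and import its root element into the first one. This sketch makes several assumptions that are not in the code above: the class name InlineDecodedXml and the helper inlineTarget are hypothetical, java.util.Base64 (Java 8+) is swapped in for sun.misc.BASE64Decoder, the Target element is looked up by tag name instead of XPath, and the decoded content is assumed to be a well-formed XML document.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class InlineDecodedXml {

    // Hypothetical helper: replaces the Base64 text inside <Target>
    // with the decoded XML as real child elements, then serializes.
    static String inlineTarget(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Assumption: the first <Target> element holds the Base64 payload.
        Node target = doc.getElementsByTagName("Target").item(0);
        byte[] decoded = Base64.getMimeDecoder().decode(target.getTextContent().trim());

        // Parse the decoded bytes as their own document instead of treating them as text.
        Document inner = builder.parse(new ByteArrayInputStream(decoded));

        // importNode makes a deep copy of the inner root that belongs to the outer document.
        Node imported = doc.importNode(inner.getDocumentElement(), true);

        // Drop the old Base64 text node(s), then attach the imported subtree.
        while (target.hasChildNodes()) {
            target.removeChild(target.getFirstChild());
        }
        target.appendChild(imported);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        // Build a small input so the example does not depend on the truncated sample above.
        String innerXml = "<Data><Info>Some valid text</Info></Data>";
        String encoded = Base64.getEncoder()
                .encodeToString(innerXml.getBytes(StandardCharsets.UTF_8));
        String outer = "<RootElement><SomeData/><Target>" + encoded + "</Target></RootElement>";
        System.out.println(inlineTarget(outer));
    }
}
```

Because the decoded markup is now part of the DOM tree as element nodes, the transformer has nothing to escape. If the decoded content has to remain character data rather than child elements, setting OutputKeys.CDATA_SECTION_ELEMENTS to "Target" on the transformer may be an alternative, since it asks the serializer to write that element's text as a CDATA section.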

0 answers:

No answers yet.