
I have two documents:

Document 1 (input)
Document 2 (output)

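Document 2 is the result of passing Document 1 through a conversion process that leaves all content and formatting unchanged (verified with a side-by-side comparison in Word).

However, the process strips many ID numbers from the .docx file.

For example,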

      <w:p w:rsidP="00B600D6" w:rsidR="00F55D78" w:rsidRDefault="00B600D6">

becomes

      <w:p>

according to a dump of each document produced with the following code:

Body body = ((Document)newerPackage.getMainDocumentPart().getJaxbElement()).getBody();
Node node = org.docx4j.XmlUtils.marshaltoW3CDomDocument(body).getDocumentElement();
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
transformer.transform(new DOMSource(node), 
             new StreamResult(new OutputStreamWriter(System.out, "UTF-8")));
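A minimal sketch of one way to test whether the missing IDs alone explain the differences: strip the revision IDs from both trees before comparing them. It uses only JDK DOM classes. The class name `RsidStripper` and its methods are hypothetical, and it assumes that the IDs in question are all `w:rsid*` attributes in the WordprocessingML namespace, as in the `<w:p>` example above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class RsidStripper {

    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Recursively removes every w:rsid* attribute on and below the given node. */
    public static void stripRsids(Node node) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            NamedNodeMap attrs = node.getAttributes();
            // Iterate backwards because removal shrinks the live map.
            for (int i = attrs.getLength() - 1; i >= 0; i--) {
                Attr a = (Attr) attrs.item(i);
                if (W_NS.equals(a.getNamespaceURI())
                        && a.getLocalName().startsWith("rsid")) {
                    ((Element) node).removeAttributeNode(a);
                }
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            stripRsids(c);
        }
    }

    /** Parses an XML string into a namespace-aware DOM document. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<w:p xmlns:w=\"" + W_NS + "\" w:rsidP=\"00B600D6\" "
                + "w:rsidR=\"00F55D78\" w:rsidRDefault=\"00B600D6\"/>";
        Document doc = parse(xml);
        stripRsids(doc.getDocumentElement());
        // Only the xmlns:w namespace declaration should remain on the element.
        System.out.println(doc.getDocumentElement().getAttributes().getLength());
    }
}
```

With the dump code above, `stripRsids(node)` could be called on each document's `node` before it is transformed, so that the two dumps differ only where something other than an rsid attribute changed.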

Using the docx4j Differencer comparison method recommended here, everything (except the first line, which has no formatting applied) shows up as modified.

The question is: are the differences a result of the missing IDs, the formatting, or something else?
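To narrow this down, it may help to list exactly which attributes the process drops. The following is a rough sketch using only JDK DOM classes: it counts attributes by name in each tree and reports the ones whose counts differ. The class `AttributeCensus` and its methods are hypothetical names for illustration, not part of docx4j.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AttributeCensus {

    /** Counts attributes by qualified name across the whole subtree. */
    public static Map<String, Integer> count(Node node, Map<String, Integer> acc) {
        NamedNodeMap attrs = node.getAttributes(); // null for non-element nodes
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                String name = attrs.item(i).getNodeName();
                if (!name.startsWith("xmlns")) { // skip namespace declarations
                    acc.merge(name, 1, Integer::sum);
                }
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count(c, acc);
        }
        return acc;
    }

    /** Maps attribute name to (count in input minus count in output), omitting zeros. */
    public static Map<String, Integer> lost(Node input, Node output) {
        Map<String, Integer> diff = count(input, new TreeMap<>());
        count(output, new TreeMap<>())
            .forEach((name, n) -> diff.merge(name, -n, Integer::sum));
        diff.values().removeIf(v -> v == 0);
        return diff;
    }

    /** Parses an XML string into a DOM document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String ns = "xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\"";
        Document in = parse("<w:p " + ns + " w:rsidP=\"00B600D6\" w:rsidR=\"00F55D78\"/>");
        Document out = parse("<w:p " + ns + "/>");
        System.out.println(lost(in, out));
    }
}
```

If the resulting map contains only `w:rsid*` names, the IDs are the only attribute-level change. Any other name, or a negative count (an attribute that appears only in the output), would point to something else.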

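In case it matters, we are using docx4j in this context to run automated sanity/regression tests on our round-trip process (i.e. we apply a "lossless" process and expect no differences).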

docx4j Differencer shows more differences than expected

1 answer: