Pretty-printing XML with Java

Date: 2014-11-17 17:57:09

Tags: java xml saxparser pretty-print xerces

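There are a dozen threads on this topic, but none of their answers work for me. It seems that a specific DOM implementation is required. However, I cannot read the XML input: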

@Test
public void testPrettyPrintConvertDomLevel3() throws UnsupportedEncodingException {
    String unformattedXml
            = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><QueryMessage\n"
            + "        xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n"
            + "        xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n"
            + "    <Query>\n"
            + "        <query:CategorySchemeWhere>\n"
            + "   \t\t\t\t\t         <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n"
            + "        </query:CategorySchemeWhere>\n"
            + "    </Query>\n\n\n\n\n"
            + "</QueryMessage>";

    System.out.println(prettyPrintWithXercesDomLevel3(unformattedXml.getBytes("UTF-16")));
}

Here is the method:

public static String prettyPrintWithXercesDomLevel3(byte[] input) {
    try {
//System.setProperty(DOMImplementationRegistry.PROPERTY,"org.apache.xerces.dom.DOMImplementationSourceImpl");
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("XML 3.0 LS 3.0");
        if (impl == null) {
            throw new RuntimeException("No DOMImplementation found !");
        }

        log.info(String.format("DOMImplementationLS: %s", impl.getClass().getName()));

        LSParser parser = impl.createLSParser(
                DOMImplementationLS.MODE_SYNCHRONOUS,
                //"http://www.w3.org/2001/XMLSchema");
                "http://www.w3.org/TR/REC-xml");
        log.info(String.format("LSParser: %s", parser.getClass().getName()));
        LSInput lsi = impl.createLSInput();
        lsi.setByteStream(new ByteArrayInputStream(input));
        Document doc = parser.parse(lsi);

        LSSerializer serializer = impl.createLSSerializer();
        serializer.getDomConfig().setParameter("format-pretty-print",Boolean.TRUE);
        LSOutput output = impl.createLSOutput();
        output.setEncoding("UTF-8");
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        output.setByteStream(baos);
        serializer.write(doc, output);
        return baos.toString();
//            return serializer.writeToString(doc);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

However, the pretty printing does not work. Any ideas?

3 answers:

Answer 0 (score: 0)

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

/**
 *
 * @author lananda
 */
public class PrettyXmlWriter {
    
    public static void main(String... args) {
        String unformattedXml
                = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
                + "<QueryMessage\n"
                + "        xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n"
                + "        xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n"
                + "    <Query>\n"
                + "        <query:CategorySchemeWhere>\n"
                + "   \t\t\t\t\t         <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n"
                + "        </query:CategorySchemeWhere>\n"
                + "    </Query>\n\n\n\n\n"
                + "</QueryMessage>";
        unformattedXml = unformattedXml.replaceAll("\\s+", " ");
        String format = format(unformattedXml);
        System.out.println(format);
    }

    public static String format(String xml) {
        try {
            final InputSource src = new InputSource(new StringReader(xml));
            final Node document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
            final Boolean keepDeclaration = Boolean.valueOf(xml.startsWith("<?xml"));

            // May need this: System.setProperty(DOMImplementationRegistry.PROPERTY,"com.sun.org.apache.xerces.internal.dom.DOMImplementationSourceImpl");
            final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            final DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
            final LSSerializer writer = impl.createLSSerializer();
            writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE); // Set this to true if the output needs to be beautified.
            writer.getDomConfig().setParameter("xml-declaration", keepDeclaration); // Set this to true if the declaration is needed to be outputted.
            return writer.writeToString(document);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

Answer 1 (score: 0)

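The encoding of the Java source file must also match the encoding you are trying to run with. If you use Eclipse, the default encoding is CP-1252 for some reason. The first thing I do when I set up a new version of Eclipse is change the file encoding to UTF-8.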

I used your code, and because my source file encoding is UTF-8, it worked fine.
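
A related way to take the IDE or platform default out of the picture is to name the charset explicitly whenever bytes and strings are converted. The following is only a minimal sketch: the class `CharsetSketch` and the helper `roundTrip` are hypothetical names introduced for illustration, not part of the question's code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetSketch {

    // Hypothetical helper: encodes and decodes with the same, explicitly named charset.
    public static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Shows what no-argument conversions such as String.getBytes() would use.
        System.out.println("Default charset: " + Charset.defaultCharset());

        String text = "ECB \u00e9";
        // Same charset on both sides: the text survives unchanged.
        System.out.println(text.equals(roundTrip(text, StandardCharsets.UTF_16)));

        // Mismatched charsets: UTF-16 bytes decoded as UTF-8 do not give the text back.
        String garbled = new String(text.getBytes(StandardCharsets.UTF_16), StandardCharsets.UTF_8);
        System.out.println(text.equals(garbled));
    }
}
```

In the question's method this would correspond to decoding the `ByteArrayOutputStream` with the same encoding that was set on the `LSOutput`, rather than relying on the default.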

Answer 2 (score: 0)

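Update: it seems that all whitespace is significant in XML: "By default, based on the W3C XML specification, the Oracle XML Developer's Kit (XDK) XML parser preserves all whitespace." So it is quite reasonable that this feature is NOT part of a public API. org.jdom2 provides a reasonable implementation: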

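import static org.junit.Assert.assertEquals;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import org.jdom2.Document;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.Format;
import org.jdom2.output.LineSeparator;
import org.jdom2.output.XMLOutputter;
import org.junit.Test;

@Test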
public void testPrettyPrintConvertDomLevel3() throws UnsupportedEncodingException, JDOMException, IOException {
    String unformattedXml
            = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><QueryMessage\n"
            + "        xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n"
            + "        xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n"
            + "    <Query>\n"
            + "        <query:CategorySchemeWhere>\n"
            + "   \t\t\t\t\t         <query:AgencyID>ECB \n </query:AgencyID>\n"
            + "        </query:CategorySchemeWhere>\n"
            + "    </Query>\n\n\n\n\n"
            + "</QueryMessage>";
    SAXBuilder builder = new SAXBuilder();
    Document doc = builder.build(new ByteArrayInputStream(unformattedXml.getBytes("UTF-16")));
    Format f = Format.getPrettyFormat();
    f.setLineSeparator(LineSeparator.NL);
    f.setTextMode(Format.TextMode.TRIM_FULL_WHITE);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    new XMLOutputter(f).output(doc, baos);
    assertEquals("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<QueryMessage xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\" xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n"
            + "  <Query>\n"
            + "    <query:CategorySchemeWhere>\n"
            + "      <query:AgencyID>ECB \n"
            + " </query:AgencyID>\n"
            + "    </query:CategorySchemeWhere>\n"
            + "  </Query>\n"
            + "</QueryMessage>\n", baos.toString("UTF-8")); // decode with the encoding the outputter wrote
}
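
For anyone who wants to stay with the JDK's own DOM Level 3 classes, a commonly used workaround follows from the same observation: since the parser keeps whitespace-only text nodes, they can be removed before serializing. The code below is only a sketch under that assumption; the class `WhitespaceStripSketch` and the method `stripAndFormat` are hypothetical names, and the exact indentation produced depends on the DOM implementation in use.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class WhitespaceStripSketch {

    // Hypothetical helper: drops whitespace-only text nodes, then pretty-prints.
    public static String stripAndFormat(byte[] input) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(input));

        // Select every text node that consists of whitespace only.
        NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space(.)='']", doc, XPathConstants.NODESET);
        for (int i = 0; i < blanks.getLength(); i++) {
            Node blank = blanks.item(i);
            blank.getParentNode().removeChild(blank);
        }

        DOMImplementationLS impl = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        LSSerializer serializer = impl.createLSSerializer();
        serializer.setNewLine("\n"); // fixed line separator, independent of the platform
        serializer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a>\n\n\n  <b>x</b>\n\n</a>";
        System.out.println(stripAndFormat(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Note that this only removes text nodes that are entirely whitespace. Leading or trailing whitespace inside real content, such as the newlines after `ECB` in the question, is left untouched.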