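There are a dozen threads on this topic, but none of them has an answer that satisfies me. It seems that a specific DOM implementation is required. However, I still cannot process my XML input as expected: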
@Test
public void testPrettyPrintConvertDomLevel3() throws UnsupportedEncodingException {
String unformattedXml
= "<?xml version=\"1.0\" encoding=\"UTF-16\"?><QueryMessage\n"
+ " xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n"
+ " xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n"
+ " <Query>\n"
+ " <query:CategorySchemeWhere>\n"
+ " \t\t\t\t\t <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n"
+ " </query:CategorySchemeWhere>\n"
+ " </Query>\n\n\n\n\n"
+ "</QueryMessage>";
System.out.println(prettyPrintWithXercesDomLevel3(unformattedXml.getBytes("UTF-16")));
}
Here is the method:
public static String prettyPrintWithXercesDomLevel3(byte[] input) {
try {
//System.setProperty(DOMImplementationRegistry.PROPERTY,"org.apache.xerces.dom.DOMImplementationSourceImpl");
DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("XML 3.0 LS 3.0");
if (impl == null) {
throw new RuntimeException("No DOMImplementation found !");
}
log.info(String.format("DOMImplementationLS: %s", impl.getClass().getName()));
LSParser parser = impl.createLSParser(
DOMImplementationLS.MODE_SYNCHRONOUS,
//"http://www.w3.org/2001/XMLSchema");
"http://www.w3.org/TR/REC-xml");
log.info(String.format("LSParser: %s", parser.getClass().getName()));
LSInput lsi = impl.createLSInput();
lsi.setByteStream(new ByteArrayInputStream(input));
Document doc = parser.parse(lsi);
LSSerializer serializer = impl.createLSSerializer();
serializer.getDomConfig().setParameter("format-pretty-print",Boolean.TRUE);
LSOutput output = impl.createLSOutput();
output.setEncoding("UTF-8");
ByteArrayOutputStream baos = new ByteArrayOutputStream();
output.setByteStream(baos);
serializer.write(doc, output);
// Decode with the same encoding that was set on the LSOutput, not the platform default.
return baos.toString("UTF-8");
// return serializer.writeToString(doc);
} catch (Exception e) {
throw new RuntimeException(e);
}
}
However, the pretty printing does not work. Any ideas?
Answer 0 (score: 0)
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;
/**
*
* @author lananda
*/
public class PrettyXmlWriter {
public static void main(String... args){
String unformattedXml
= "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
+ "<QueryMessage\n"
+ " xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n"
+ " xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n"
+ " <Query>\n"
+ " <query:CategorySchemeWhere>\n"
+ " \t\t\t\t\t <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n"
+ " </query:CategorySchemeWhere>\n"
+ " </Query>\n\n\n\n\n"
+ "</QueryMessage>";
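// Collapse each run of whitespace to a single space before parsing.
// Caution: this also rewrites whitespace inside text content and attribute values.
unformattedXml = unformattedXml.replaceAll("\\s+", " ");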
String format = format(unformattedXml);
System.out.println(format);
}
public static String format(String xml) {
try {
final InputSource src = new InputSource(new StringReader(xml));
final Node document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
final Boolean keepDeclaration = Boolean.valueOf(xml.startsWith("<?xml"));
//May need this: System.setProperty(DOMImplementationRegistry.PROPERTY,"com.sun.org.apache.xerces.internal.dom.DOMImplementationSourceImpl");
final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
final DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
final LSSerializer writer = impl.createLSSerializer();
writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE); // Set to true to indent the output.
writer.getDomConfig().setParameter("xml-declaration", keepDeclaration); // Set to true to emit the XML declaration.
return writer.writeToString(document);
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
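Answer 1 (score: 0)
The encoding of the Java source file must also match the encoding you are trying to run with. If you are using Eclipse, the default encoding is CP-1252 for some reason. The first thing I do with a fresh Eclipse install is change the file encoding to UTF-8.
I used your code, and because my source file encoding is UTF-8, it worked fine.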
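A related place where a default encoding can leak in is the conversion between bytes and strings. The following is only a minimal sketch, not part of the original answer: the class name `ExplicitCharsetDemo` and the helper `decodeUtf8` are hypothetical. It names the charset explicitly and writes the non-ASCII character as a Unicode escape, so the result should not depend on the source file encoding or on the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class ExplicitCharsetDemo {

    /** Decodes the buffer as UTF-8 regardless of the JVM's default charset. */
    public static String decodeUtf8(ByteArrayOutputStream baos) {
        return new String(baos.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00eb" is written as an escape, so the source file stays pure ASCII.
        byte[] bytes = "<name>Zo\u00eb</name>".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        baos.write(bytes, 0, bytes.length);
        System.out.println(decodeUtf8(baos));
    }
}
```

Whether the console then shows the character correctly still depends on the terminal's own encoding.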
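Answer 2 (score: 0)
Update: It seems that all whitespace is significant in XML: "By default, and based on the W3C XML specification, the Oracle XML Developer's Kit (XDK) XML parser preserves all whitespace." So it is quite reasonable that this feature is NOT part of the public API. org.jdom2 provides a reasonable implementation: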
@Test
public void testPrettyPrintConvertDomLevel3() throws UnsupportedEncodingException, JDOMException, IOException {
String unformattedXml
= "<?xml version=\"1.0\" encoding=\"UTF-16\"?><QueryMessage\n"
+ " xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n"
+ " xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n"
+ " <Query>\n"
+ " <query:CategorySchemeWhere>\n"
+ " \t\t\t\t\t <query:AgencyID>ECB \n </query:AgencyID>\n"
+ " </query:CategorySchemeWhere>\n"
+ " </Query>\n\n\n\n\n"
+ "</QueryMessage>";
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(new ByteArrayInputStream(unformattedXml.getBytes("UTF-16")));
Format f = Format.getPrettyFormat();
f.setLineSeparator(LineSeparator.NL);
f.setTextMode(Format.TextMode.TRIM_FULL_WHITE);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
new XMLOutputter(f).output(doc, baos);
assertEquals("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
+ "<QueryMessage xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\" xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n"
+ " <Query>\n"
+ " <query:CategorySchemeWhere>\n"
+ " <query:AgencyID>ECB \n"
+ " </query:AgencyID>\n"
+ " </query:CategorySchemeWhere>\n"
+ " </Query>\n"
+ "</QueryMessage>\n", baos.toString("UTF-8"));
}
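If adding JDOM is not an option, the same idea can be sketched with JDK classes only: because the parser keeps whitespace-only text nodes, remove them from the DOM before asking the `LSSerializer` to indent. This is a sketch under assumptions, not a definitive implementation: the class name `WhitespaceStrippingPrinter` is hypothetical, and it assumes the JDK's default DOM implementation exposes the "LS" 3.0 feature. Text that contains real content, such as `ECB` followed by newlines, is left untouched.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical helper class, for illustration only.
public class WhitespaceStrippingPrinter {

    public static String prettyPrint(byte[] xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            // Parsing from bytes lets the parser honour the declared encoding.
            Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

            // Select every text node that consists of whitespace only.
            NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//text()[normalize-space(.)='']", doc, XPathConstants.NODESET);
            for (int i = 0; i < blanks.getLength(); i++) {
                Node blank = blanks.item(i);
                blank.getParentNode().removeChild(blank);
            }

            // Assumption: the default DOM implementation supports the LS 3.0 feature.
            DOMImplementationLS ls =
                    (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
            LSSerializer serializer = ls.createLSSerializer();
            serializer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
            serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
            return serializer.writeToString(doc);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a>\n\n\n  <b>x</b>\n\n</a>";
        System.out.println(prettyPrint(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The exact indentation and line separator of the result can differ between JDK versions, so a test built on this sketch should compare structure rather than an exact string.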