Using Java, I would like to take a document in the following format:
<tag1>
<tag2>
<![CDATA[ Some data ]]>
</tag2>
</tag1>
and convert it to:
<tag1><tag2><![CDATA[ Some data ]]></tag2></tag1>
I tried the following, but it does not give me the result I expected:
DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
dbfac.setIgnoringElementContentWhitespace(true);
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.parse(new FileInputStream("/tmp/test.xml"));
Writer out = new StringWriter();
Transformer tf = TransformerFactory.newInstance().newTransformer();
tf.setOutputProperty(OutputKeys.INDENT, "no");
tf.transform(new DOMSource(doc), new StreamResult(out));
System.out.println(out.toString());
Answer 0: (score: 17)
This working solution follows the instructions in @Luiggi Mendoza's comment on the question.
public static String trim(String input) {
    BufferedReader reader = new BufferedReader(new StringReader(input));
    StringBuffer result = new StringBuffer();
    try {
        String line;
        // Strip leading and trailing whitespace from each line and join the lines
        while ((line = reader.readLine()) != null) {
            result.append(line.trim());
        }
        return result.toString();
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
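As a quick illustration of how the line-trimming approach above behaves, here is a minimal sketch. The class name `LineTrimDemo` and the sample strings are assumptions made for this example only. Note that the method works on raw lines and knows nothing about XML, so text content that spans several lines is joined without a separator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical demo class, not part of the original answer.
class LineTrimDemo {
    // Same idea as trim(String) above: trim every line, then concatenate.
    static String trim(String input) {
        StringBuilder result = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.append(line.trim());
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String xml = "<tag1>\n  <tag2>\n    <![CDATA[ Some data ]]>\n  </tag2>\n</tag1>";
        // prints: <tag1><tag2><![CDATA[ Some data ]]></tag2></tag1>
        System.out.println(trim(xml));

        // Caveat: multi-line text content loses its line breaks entirely.
        // prints: <a>helloworld</a>
        System.out.println(trim("<a>\n  hello\n  world\n</a>"));
    }
}
```

The spaces inside the CDATA section survive because only the ends of each line are trimmed.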
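Answer 1: (score: 5)
Recursively traverse the document. Remove any text nodes that contain only whitespace, and trim any text nodes that contain non-whitespace content.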
public static void trimWhitespace(Node node)
{
    NodeList children = node.getChildNodes();
    for(int i = 0; i < children.getLength(); ++i) {
        Node child = children.item(i);
        if(child.getNodeType() == Node.TEXT_NODE) {
            // Whitespace-only text nodes end up with empty content
            child.setTextContent(child.getTextContent().trim());
        }
        trimWhitespace(child);
    }
}
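One detail worth seeing in action: a CDATA section is reported as `Node.CDATA_SECTION_NODE`, not `Node.TEXT_NODE`, so the check above leaves its content alone. The sketch below builds a small DOM by hand to show this. The class name `TrimWhitespaceDemo`, the `demo()` helper, and the sample values are assumptions for illustration only.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.CDATASection;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;

// Hypothetical demo class, not part of the original answer.
class TrimWhitespaceDemo {
    // Copy of trimWhitespace(Node) from the answer above.
    static void trimWhitespace(Node node) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); ++i) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                child.setTextContent(child.getTextContent().trim());
            }
            trimWhitespace(child);
        }
    }

    // Returns { text node value, CDATA value } after trimming.
    static String[] demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element tag2 = doc.createElement("tag2");
            doc.appendChild(tag2);
            Text text = doc.createTextNode("\n   padded   \n");
            CDATASection cdata = doc.createCDATASection(" Some data ");
            tag2.appendChild(text);
            tag2.appendChild(cdata);

            trimWhitespace(doc);
            return new String[] { text.getNodeValue(), cdata.getNodeValue() };
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String[] values = demo();
        System.out.println("[" + values[0] + "]"); // prints: [padded]
        System.out.println("[" + values[1] + "]"); // prints: [ Some data ]
    }
}
```

This assumes the parser is not configured with `setCoalescing(true)`, which would merge CDATA sections into ordinary text nodes.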
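Answer 2: (score: 5)
As explained in an answer to another question, the relevant function would be DocumentBuilderFactory.setIgnoringElementContentWhitespace(), but, as has already been pointed out here, that function requires a validating parser, which in turn requires an XML schema or something similar.
Therefore, your best bet is to iterate through the Document you get from the parser and remove all nodes of type TEXT_NODE (or only those TEXT_NODEs that contain nothing but whitespace).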
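The traversal described above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `WhitespaceStripper` and the methods `stripBlankText` and `strip` are names invented for this example, and the input is taken from a string rather than a file to keep the sketch self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class WhitespaceStripper {
    // Removes whitespace-only TEXT_NODE children. Iterating backwards keeps
    // the indexes of not-yet-visited children stable while nodes are removed.
    static void stripBlankText(Node node) {
        NodeList children = node.getChildNodes();
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripBlankText(child);
            }
        }
    }

    // Parses the XML, strips blank text nodes, and serializes without indentation.
    static String strip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            stripBlankText(doc);

            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            tf.setOutputProperty(OutputKeys.INDENT, "no");
            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tag1>\n  <tag2>\n    <![CDATA[ Some data ]]>\n  </tag2>\n</tag1>";
        System.out.println(strip(xml));
    }
}
```

With the default JDK parser and transformer this should produce the single-line form asked for in the question, since the CDATA section is a separate node type and is never touched by the TEXT_NODE check.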
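Answer 3: (score: 0)
The Java 8+ transformer does not create any empty lines, but the Java 10+ transformer puts empty lines everywhere. I still want to keep the indentation. This is my helper function for creating an XML string from any DOM Element instance, such as the root node returned by doc.getDocumentElement().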
public static String createXML(Element elem) throws Exception {
    DOMSource source = new DOMSource(elem);
    StringWriter writer = new StringWriter();
    StreamResult result = new StreamResult(writer);
    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    Transformer transformer = transformerFactory.newTransformer();
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    //transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
    //transformer.setOutputProperty("http://www.oracle.com/xml/is-standalone", "yes");
    transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,"yes");
    transformer.setOutputProperty("http://www.oracle.com/xml/is-standalone", "yes");
    transformer.transform(source, result);
    // The Java 10 transformer adds unnecessary empty lines; remove them
    BufferedReader reader = new BufferedReader(new StringReader(writer.toString()));
    StringBuilder buf = new StringBuilder();
    try {
        final String NL = System.getProperty("line.separator", "\r\n");
        String line;
        while( (line=reader.readLine())!=null ) {
            if (!line.trim().isEmpty()) {
                buf.append(line);
                buf.append(NL);
            }
        }
    } finally {
        reader.close();
    }
    return buf.toString(); //writer.toString();
}
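Answer 4: (score: 0)
I support @jtahlborn's answer. Just for completeness, I adapted his solution to remove whitespace-only text nodes completely, rather than just clearing them.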
public static void stripEmptyElements(Node node)
{
    NodeList children = node.getChildNodes();
    for(int i = 0; i < children.getLength(); ++i) {
        Node child = children.item(i);
        if(child.getNodeType() == Node.TEXT_NODE) {
            if (child.getTextContent().trim().length() == 0) {
                child.getParentNode().removeChild(child);
                // The live NodeList shrank by one, so revisit this index
                i--;
            }
        }
        stripEmptyElements(child);
    }
}
Answer 5: (score: -4)
Try this code. The read and write methods ignore whitespace and indentation.
try {
    File f1 = new File("source.xml");
    File f2 = new File("destination.xml");
    InputStream in = new FileInputStream(f1);
    OutputStream out = new FileOutputStream(f2);
    byte[] buf = new byte[1024];
    int len;
    while ((len = in.read(buf)) > 0){
        out.write(buf, 0, len);
    }
    in.close();
    out.close();
    System.out.println("File copied.");
} catch(FileNotFoundException ex){
    System.out.println(ex.getMessage() + " in the specified directory.");
    System.exit(0);
} catch(IOException e7){
    System.out.println(e7.getMessage());
}