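Removing whitespace from XML field values using Java

Asked: 2014-12-17 07:44:33

Tags: java xml whitespace nsxmlparser removing-whitespace

I am having trouble removing the whitespace inside the value fields of my XML data.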

For example:

Input:

<?xml version="1.0"?>
<ns:myOrder xmlns:ns="http://w3schools.com/BusinessDocument" xmlns:ct="http://something.com/CommonTypes">
  <MessageHeader>
     <ct:ID>i7                           </ct:ID>
     <ct:ID>i7                           </ct:ID>
     <ct:ID>i7                           </ct:ID>
     <ct:ID>i7                           </ct:ID>
     <ct:Name> Company Name           </ct:Name>
 </MessageHeader>
</ns:myOrder>

Expected output:

<?xml version="1.0"?>
  <ns:myOrder xmlns:ns="http://w3schools.com/BusinessDocument" xmlns:ct="http://something.com/CommonTypes">
    <MessageHeader>
       <ct:ID>i7</ct:ID>
       <ct:ID>i7</ct:ID>
       <ct:ID>i7</ct:ID>
       <ct:ID>i7</ct:ID>
       <ct:Name>Company Name</ct:Name>
    </MessageHeader>
  </ns:myOrder>

I tried the following code:

public static String getTrimmedXML(String rawXMLFilename) throws Exception
{
    BufferedReader in = new BufferedReader(new FileReader(rawXMLFilename));
    String str;
    String trimmedXML = null;
    while ((str = in.readLine()) != null)
    {
        String str1 = str;
        if (str1.length() > 0)
        {
            str1 = str1.trim();
            if (str1.charAt(str1.length() - 1) == '>')
            {
                trimmedXML = trimmedXML + str.trim();
            }
            else
            {
                trimmedXML = trimmedXML + str;
            }
        }
    }
    in.close();
    return trimmedXML.substring(4);
}

The whitespace is not removed. Please let me know where I am going wrong.

Regards, Monish

5 answers:

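Answer 0 (score: 2)

You probably do not want to use replace or replaceAll, because they would replace every space in the XML data, including the ones inside values. To trim only the start and end of the element content, parse the whole XML, select the nodes with XPath, trim their text, and then transform the document back to a string. Use the code below.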

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public static String getTrimmedXML(String rawXMLFilename, String tagName) throws Exception {
    // Parse the file into a DOM document; the reader is closed automatically
    Document document;
    try (BufferedReader in = new BufferedReader(new FileReader(rawXMLFilename))) {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        document = db.parse(new InputSource(in));
    }

    // Select every element whose qualified name equals tagName (for example "ct:ID")
    XPath xpath = XPathFactory.newInstance().newXPath();
    NodeList nodeList = (NodeList) xpath.compile("//*[name()='" + tagName + "']")
            .evaluate(document, XPathConstants.NODESET);
    for (int index = 0; index < nodeList.getLength(); index++) { // Loop through all matching nodes
        Node node = nodeList.item(index);
        String newTextContent = node.getTextContent().trim(); // Actual trim process
        node.setTextContent(newTextContent);
    }

    // Transform the document back to string format
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    StringWriter writer = new StringWriter();
    transformer.transform(new DOMSource(document), new StreamResult(writer));

    // Strip line breaks from the serialized output
    return writer.toString().replaceAll("\n|\r", "");
}

Answer 1 (score: 0)

IMHO, you should use an XML library, select the affected nodes via XPath, and then trim each one:

String value = node.getTextContent();
node.setTextContent(value.trim());
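
As a minimal sketch of that approach, assuming the JAXP DOM and XPath APIs that ship with the JDK, the following parses an XML string, trims the text of every leaf element, and serializes the result. The class name `XmlTextTrimmer`, the method name `trimLeafText`, and the XPath expression `//*[not(*)]` are illustrative choices and not part of the original answer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlTextTrimmer {

    // Hypothetical helper: trims the text of every element that has no child elements.
    static String trimLeafText(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(new InputSource(new StringReader(xml)));

        // "//*[not(*)]" selects elements without element children (leaf elements),
        // so setTextContent never discards nested elements.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList leaves = (NodeList) xpath.evaluate("//*[not(*)]", document, XPathConstants.NODESET);
        for (int i = 0; i < leaves.getLength(); i++) {
            Node node = leaves.item(i);
            String value = node.getTextContent();
            node.setTextContent(value.trim());
        }

        // Serialize the modified document back to a string without the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }
}
```

Restricting the selection to leaf elements is a design choice: calling `setTextContent` on an element that has child elements would replace those children with a single text node.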

Answer 2 (score: 0)

Removing all spaces from a string can be done with the replace method of the String class, as follows:

String str = " random    message withlots   of white  spaces     ";
str = str.replace(" ", "");
System.out.println(str);

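Running the above prints str without any spaces. The replace method takes two arguments: the first is the string you want replaced, and the second is the string to replace it with. The arguments are not limited to single-character strings.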
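
To make the behaviour concrete for the data in the question, here is a small sketch comparing `replace(" ", "")` with `trim()` on the `ct:Name` value. The class and method names are hypothetical and only serve the illustration.

```java
class ReplaceVersusTrim {

    // Removes every space character, including the ones between words.
    static String removeAllSpaces(String text) {
        return text.replace(" ", "");
    }

    // Removes only leading and trailing whitespace.
    static String trimEnds(String text) {
        return text.trim();
    }

    public static void main(String[] args) {
        String value = " Company Name           ";
        System.out.println("[" + removeAllSpaces(value) + "]"); // prints [CompanyName]
        System.out.println("[" + trimEnds(value) + "]");        // prints [Company Name]
    }
}
```

Because `replace` also removes the space between the two words, it only matches the expected output for values that contain no interior spaces, such as `i7`.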

Answer 3 (score: 0)

The whitespace removal can also be done with vtd-xml. (The code sample that accompanied this answer is missing from this copy.)

Answer 4 (score: -3)

Use the replaceAll method in Java.

For example:

String s1 = "<ct:ID>i7                           </ct:ID>";
System.out.println(s1.replaceAll(" ","").trim());
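
A variation on this idea, offered only as a sketch and not as what this answer proposes, is to restrict the regular expression to whitespace that touches a tag boundary, so that spaces between words inside a value survive. The class name `TagAdjacentWhitespace` and the method name `stripAroundTags` are hypothetical.

```java
class TagAdjacentWhitespace {

    // Hypothetical helper: strips whitespace that directly follows '>'
    // or directly precedes '<', leaving interior spaces in values alone.
    static String stripAroundTags(String xml) {
        return xml.replaceAll(">\\s+", ">").replaceAll("\\s+<", "<");
    }

    public static void main(String[] args) {
        String s1 = "<ct:Name> Company Name           </ct:Name>";
        System.out.println(stripAroundTags(s1)); // prints <ct:Name>Company Name</ct:Name>
    }
}
```

This also removes indentation and line breaks between elements, and it does not understand CDATA sections or comments, so a parser-based approach is the safer choice for anything beyond simple documents like the one in the question.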