Question

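I'm writing a Java program whose end goal is to create an XML file from user input. I can predict where almost all of the elements need to go, with one exception.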

Each sentence the user enters is placed in its own Callout tag:

<Callout>The user entered some text.</Callout>

If the sentence contains the phrase "User Guide", the program needs to automatically surround those two words with this XML tag:

<BookTitle></BookTitle>

For example, the initial markup looks like this:

<Callout>Save the User Guide.</Callout>

The end result should be:

<Callout>Save the <BookTitle>User Guide</BookTitle>.</Callout>

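Note that the phrase "User Guide" can appear anywhere inside the Callout tag. I'm not sure how to dynamically add a tag in the middle of a text node. Is that even possible? I tried the solution found here (Convert String XML fragment to Document Node in Java), but to no avail. I'm using org.w3c.dom to create the elements, nodes, and so on.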

Answer 1

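People have different approaches to XML manipulation, and the variety of answers to this question is proof enough of that. While regular expressions and raw text manipulation may be fine for one-off fixes and hacks, you should use an XML API if you need a good, maintainable solution.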

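My example below does what the question asks, but note that I have not checked for pathological input (such as "" as the search string) and have not handled XML namespaces. Those things can easily be added.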

See the comments in the code for how it works.

Input (test.xml):

<Callouts>
  <Callout>Save the User Guide.</Callout>
</Callouts>

Program

package com.stackoverflow._18774666;

import java.net.URL;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;

public class InsertElementInTextNode {

    /**
     * Replaces the part of each child text node of a parent element that
     * matches a search string. The matched text is replaced by an element with
     * the given name whose text content is set to the search string.
     *
     * @param parent
     *            The element whose child text nodes are searched.
     * @param elementName
     *            The name of the element to insert.
     * @param text
     *            The text to replace with an element with the same text content.
     */
    public static void replaceTextWithElement(Element parent, String elementName, String text){

        NodeList children = parent.getChildNodes();
        Text cursor;
        Element insertedElement;
        int index;

        /* Iterate children of the given element. */
        for(int i = 0; i < children.getLength(); i++ ){

            /* Check if this child is a text node. Ignore otherwise. */
            if(children.item(i) instanceof Text){
                cursor = (Text) children.item(i);

                /* If the entire text node is equal to the search string,
                 * then we can replace it directly. Otherwise we have to split it. */
                if(text.equals(cursor.getData())){
                    /* Replace the text node with an element */
                    insertedElement = parent.getOwnerDocument().createElement(elementName);
                    insertedElement.setTextContent(text);
                    parent.replaceChild(insertedElement, cursor);
                } else {
                    /* Check to see if the search string exists in this text node. Ignore otherwise.*/
                    index = cursor.getData().indexOf(text);
                    if(index != -1){

                        /* Replace the matched substring with an empty string.*/
                        cursor.replaceData(index, text.length(), "");

                        /* Create element to be inserted, and set the text content. */
                        insertedElement = parent.getOwnerDocument().createElement(elementName);
                        insertedElement.setTextContent(text);

                        /* Split the text node and insert the element in the middle. */
                        parent.insertBefore(insertedElement, cursor.splitText(index));
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {

        /* Location of our XML document. */
        URL xmlSource = InsertElementInTextNode.class.getResource("test.xml");

        /* Parse with DOM in to a Document */
        Document xmlDoc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(xmlSource.openStream());

        /* Find our interesting elements. */
        NodeList nodes = xmlDoc.getElementsByTagName("Callout");


        /* Iterate through our interesting elements and check their content.*/
        Element cursor;
        for(int i = 0; i < nodes.getLength(); i++ ){
            if(nodes.item(i) instanceof Element){
                cursor = (Element) nodes.item(i);
                replaceTextWithElement(cursor, "BookTitle", "User Guide");
            }
        }


        /* Setup to output result. */
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

        /* Printing result to stdout. */
        transformer.transform(new DOMSource(xmlDoc), 
             new StreamResult(System.out));
    }

}

Output (standard output):

<Callouts>
  <Callout>Save the <BookTitle>User Guide</BookTitle>.</Callout>
</Callouts>
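
The answer above says it does not check for pathological input such as an empty search string. With "" as the search string, indexOf always reports a match at position 0, so the text never shrinks. The following is a minimal sketch of a guarded variant, not the answer's own code: the class name GuardedWrap and the methods wrapAll and render are hypothetical, and it assumes that only the direct child text nodes of the element should be scanned. It rejects an empty phrase and uses two splitText calls to isolate each match before replacing it.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

// Hypothetical helper class, sketched for illustration only.
final class GuardedWrap {

    /** Wraps every occurrence of phrase found in the direct child text nodes of parent. */
    static void wrapAll(Element parent, String elementName, String phrase) {
        // Guard: an empty phrase matches at every position and would never terminate.
        if (phrase == null || phrase.isEmpty()) {
            throw new IllegalArgumentException("phrase must not be empty");
        }
        Document doc = parent.getOwnerDocument();
        Node child = parent.getFirstChild();
        while (child != null) {
            if (child instanceof Text) {
                Text text = (Text) child;
                int index = text.getData().indexOf(phrase);
                if (index >= 0) {
                    // First split: [before] | [match + rest].
                    Text match = text.splitText(index);
                    // Second split: [match] | [rest].
                    Text rest = match.splitText(phrase.length());
                    Element wrapper = doc.createElement(elementName);
                    wrapper.setTextContent(phrase);
                    parent.replaceChild(wrapper, match);
                    // Keep scanning in the remainder for further occurrences.
                    child = rest;
                    continue;
                }
            }
            child = child.getNextSibling();
        }
    }

    /** Builds a one-element document for the sentence, wraps the phrase, and serializes it. */
    static String render(String sentence) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element callout = doc.createElement("Callout");
            callout.setTextContent(sentence);
            doc.appendChild(callout);
            wrapAll(callout, "BookTitle", "User Guide");

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("Save the User Guide. The User Guide is long."));
    }
}
```

Splitting first and replacing afterwards leaves the text before and after each match untouched. Namespaces are still not handled here; createElementNS would be the place to start if they matter.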

Answer 2

First get the user input and compare it with "User Guide"; if it matches, wrap it in the BookTitle tags.

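// Placeholder: getUserType() stands for however the input is actually obtained.
String userInput = getUserType().toString();
if (userInput.equals("User Guide")) {
    // Wrap the text in an opening and a closing BookTitle tag.
    userInput = "<BookTitle>" + userInput + "</BookTitle>";
} else {
    // Handle input that is not exactly "User Guide".
}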

Answer 3

public static void main(String[] args) {
    String str = "This is my User Guide dude";
    boolean bTest = str.contains("User Guide");
    if (bTest) {
        int index1 = str.indexOf("User Guide");
        String sub = str.substring(index1, index1 + 10);
        sub = "<BookTitle>" + sub + "</BookTitle>";
        String result = str.replace("User Guide", sub);
        System.out.println(result);
    }
}

Output:

This is my <BookTitle>User Guide</BookTitle> dude

I think this will at least point you in the right direction.
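
One caveat applies to the string-based approaches in Answers 2 and 3: they produce a string of markup, not DOM nodes. Since the question says the elements are created with org.w3c.dom, storing such a string through setTextContent or createTextNode turns the tags into plain text, and the angle brackets are escaped when the document is serialized. The sketch below illustrates that effect; the class name EscapedMarkupDemo and the method render are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of any answer in this thread.
final class EscapedMarkupDemo {

    /** Stores the given string as the text of a Callout element and serializes it. */
    static String render(String calloutText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element callout = doc.createElement("Callout");
            // The whole string, tags included, becomes a single text node.
            callout.setTextContent(calloutText);
            doc.appendChild(callout);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String wrapped = "Save the User Guide."
                .replace("User Guide", "<BookTitle>User Guide</BookTitle>");
        // The printed markup contains &lt;BookTitle... instead of a real element.
        System.out.println(render(wrapped));
    }
}
```

If the final XML is assembled purely as strings this does not arise, but then any &, < or > in the user's own text has to be escaped by hand.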

Answer 4

Without knowing how you are building the XML document, I'm assuming you are using some kind of DOM.

This is a very simple proof-of-concept example. It borrows heavily from library code, so it may be a bit bloated, but the basic idea should be sound...

Basically, it searches the String for a given token (User Guide in this case) and splits the original text around it, appending #text nodes and <BookTitle> nodes as appropriate...

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

public class Main {

    public static final String BOOK_TITLE = "User Guide";

    public static void main(String[] args) {

        Document doc = null;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);

            // Create callout node...
            Element callOut = doc.createElement("Callout");

            // Get the user input...
            String text = "This is an example of a User Guide for you to read";
            // Does it contain our marker...?
            if (text.contains(BOOK_TITLE)) {
                // While the text contains the mark, continue looping...
                while (text.contains(BOOK_TITLE)) {
                    // Get the text before the marker...
                    String prefix = text.substring(0, text.indexOf(BOOK_TITLE));
                    // Get the text after the marker...
                    text = text.substring(text.indexOf(BOOK_TITLE) + BOOK_TITLE.length());
                    // If there is text before the marker, append it to the call out node
                    if (prefix.length() > 0) {
                        Text textNode = doc.createTextNode(prefix);
                        callOut.appendChild(textNode);
                    }
                    // Append the book title node...
                    Element bookTitle = doc.createElement("BookTitle");
                    bookTitle.setTextContent(BOOK_TITLE);
                    callOut.appendChild(bookTitle);
                }
                // If there is any text remaining, append it to the call out node...
                if (text.length() > 0) {
                    Text textNode = doc.createTextNode(text);
                    callOut.appendChild(textNode);
                }
            } else {
                // No marker, append the text to the call out node..
                Text textNode = doc.createTextNode(text);
                callOut.appendChild(textNode);
            }

            // This will dump the result for you to test....
            root.appendChild(callOut);
            ByteArrayOutputStream baos = null;
            OutputStreamWriter osw = null;

            try {

                baos = new ByteArrayOutputStream();
                osw = new OutputStreamWriter(baos);

                Transformer tf = TransformerFactory.newInstance().newTransformer();
                tf.setOutputProperty(OutputKeys.INDENT, "yes");
                tf.setOutputProperty(OutputKeys.METHOD, "xml");
                tf.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

                DOMSource domSource = new DOMSource(doc);
                StreamResult sr = new StreamResult(osw);
                tf.transform(domSource, sr);

                osw.flush();
                baos.flush();
                System.out.println(new String(baos.toByteArray()));

            } finally {

                try {
                    baos.close();
                } catch (Exception exp) {
                }

            }

        } catch (IOException | TransformerException | ParserConfigurationException ex) {
            ex.printStackTrace();
        }
    }
}
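
The splitting logic in this answer can also be pulled out into a small reusable method. The following is a sketch of that idea under a few assumptions, not a drop-in replacement: the class name CalloutBuilder and the methods buildCallout and render are hypothetical, the element names are taken from the question, and an index-based loop is used instead of repeatedly shortening the string.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, sketched for illustration only.
final class CalloutBuilder {

    /** Builds a Callout element, wrapping each occurrence of token in a BookTitle element. */
    static Element buildCallout(Document doc, String text, String token) {
        if (token == null || token.isEmpty()) {
            throw new IllegalArgumentException("token must not be empty");
        }
        Element callout = doc.createElement("Callout");
        int from = 0;
        int hit;
        while ((hit = text.indexOf(token, from)) >= 0) {
            // Text between the previous match and this one, if any.
            if (hit > from) {
                callout.appendChild(doc.createTextNode(text.substring(from, hit)));
            }
            Element bookTitle = doc.createElement("BookTitle");
            bookTitle.setTextContent(token);
            callout.appendChild(bookTitle);
            from = hit + token.length();
        }
        // Whatever follows the last match (or the whole text if there was none).
        if (from < text.length()) {
            callout.appendChild(doc.createTextNode(text.substring(from)));
        }
        return callout;
    }

    /** Builds a document containing one Callout for the text and serializes it. */
    static String render(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(buildCallout(doc, text, "User Guide"));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("This is an example of a User Guide for you to read"));
    }
}
```

Because the user's text only ever enters the document through createTextNode, any special characters in it are escaped by the serializer rather than by hand.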

Dynamically inserting an XML element into a text node