How to make a small edit to an XML file using Java

Time: 2016-05-17 17:48:42

Tags: java xml

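I am trying to change a single value in a large (5 MB) XML file. I know the value will always be within the first 10 lines, so I don't need to read 99% of the file. However, doing a partial XML read in Java seems to be quite tricky.

In this picture, you can see the single value I need to access.

I have read a lot about XML in Java and the best practices for handling it. In this case, though, I am not sure what the best approach is: DOM, StAX, and SAX parsers each seem to have a different ideal use case, and I can't tell which one fits this problem best, since all I need to do is edit one value.

Perhaps I shouldn't even use an XML parser and should just use a regular expression, but it looks like it is a pretty bad idea to use regex on XML.

I hope someone can point me in the right direction. Thanks!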

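2 answers:

Answer 0: (score: 2)

I would choose DOM over SAX or StAX simply for the (relative) simplicity of its API. Yes, there is some boilerplate code needed to populate the DOM, but once you are past that it is quite straightforward.

That said, if your XML source were hundreds or thousands of megabytes, one of the streaming APIs would be a better fit. As it is, 5 MB is not what I would consider a large data set, so go ahead, use DOM, and call it a day: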

import java.io.File;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;

public class ChangeVersion
{
    public static void main(String[] args)
            throws Exception
    {
        if (args.length < 3) {
            System.err.println("Usage: ChangeVersion <input> <output> <new version>");
            System.exit(1);
        }

        File inputFile = new File(args[0]);
        File outputFile = new File(args[1]);
        int updatedVersion = Integer.parseInt(args[2], 10);

        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = domFactory.newDocumentBuilder();
        Document doc = docBuilder.parse(inputFile);

        XPathFactory xpathFactory = XPathFactory.newInstance();
        XPath xpath = xpathFactory.newXPath();
        XPathExpression expr = xpath.compile("/PremiereData/Project/@Version");

        NodeList versionAttrNodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);

        for (int i = 0; i < versionAttrNodes.getLength(); i++) {
            Attr versionAttr = (Attr) versionAttrNodes.item(i);
            versionAttr.setNodeValue(String.valueOf(updatedVersion));
        }

        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();

        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.transform(new DOMSource(doc), new StreamResult(outputFile));
    }
}
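
To try the XPath without a project file at hand, the same DOM steps can be run against an in-memory string. This is only a sketch: the `SAMPLE` document, the class name `ChangeVersionDemo`, and the helper `changeVersion` are made up for illustration, and the real file layout is assumed from the XPath expression used above. The sample gives the root element its own `Version` attribute to show that the expression touches only the one on `Project`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChangeVersionDemo {

    // Hypothetical sample; the real file's structure is an assumption.
    public static final String SAMPLE =
        "<PremiereData Version=\"3\">"
        + "<Project ObjectID=\"1\" Version=\"25\"><Name>demo</Name></Project>"
        + "</PremiereData>";

    /** Sets every /PremiereData/Project/@Version to newVersion and returns the serialized document. */
    public static String changeVersion(String xml, int newVersion) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Select only the Version attribute on Project, not the one on the root element.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList attrs = (NodeList) xpath.evaluate(
                "/PremiereData/Project/@Version", doc, XPathConstants.NODESET);
        for (int i = 0; i < attrs.getLength(); i++) {
            ((Attr) attrs.item(i)).setValue(String.valueOf(newVersion));
        }

        // Serialize the modified DOM back to a string, without the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(changeVersion(SAMPLE, 30));
    }
}
```

If the printed document shows the new value on `Project` and the old one on `PremiereData`, the expression is selecting what it should.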

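Answer 1: (score: 2)

You can use a StAX parser to write the XML as you read it. While doing so, you can replace content as it is parsed. With a StAX parser, only part of the XML is held in memory at any given time.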

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class ChangeVersionStax {
    public static void main(String[] args) throws Exception {
        final String newVersion = "888";

        File inputFile = new File("in.xml");
        File outputFile = new File("out.xml");
        System.out.println("Reading " + inputFile);
        System.out.println("Writing " + outputFile);

        XMLInputFactory inFactory = XMLInputFactory.newInstance();
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        XMLEventFactory eventFactory = XMLEventFactory.newInstance();

        // Use byte streams so the XML layer handles the encoding;
        // a FileWriter would silently use the platform default charset.
        try (InputStream in = new FileInputStream(inputFile);
             OutputStream out = new FileOutputStream(outputFile)) {
            XMLEventReader eventReader = inFactory.createXMLEventReader(in);
            // Assumption: the input file is UTF-8; adjust the encoding if it is not.
            XMLEventWriter writer = outFactory.createXMLEventWriter(out, "UTF-8");

            while (eventReader.hasNext()) {
                XMLEvent event = eventReader.nextEvent();
                // true if the event should be copied straight from the reader
                boolean useExistingEvent = true;

                // look for our Project element
                if (event.getEventType() == XMLEvent.START_ELEMENT) {
                    StartElement elemEvent = event.asStartElement();
                    Attribute idAttr = elemEvent.getAttributeByName(QName.valueOf("ObjectID"));
                    Attribute versionAttr = elemEvent.getAttributeByName(QName.valueOf("Version"));
                    // check to see if this is the project we want
                    // TODO: put what logic you want here
                    if ("Project".equals(elemEvent.getName().getLocalPart())
                            && idAttr != null && "1".equals(idAttr.getValue())
                            && versionAttr != null) {
                        // build a new attribute list for this element without the old Version attribute
                        List<Attribute> newAttrs = new ArrayList<>();
                        Iterator<?> existingAttrs = elemEvent.getAttributes();
                        while (existingAttrs.hasNext()) {
                            Attribute existing = (Attribute) existingAttrs.next();
                            // copy over everything but the Version attribute
                            if (!existing.getName().getLocalPart().equals("Version")) {
                                newAttrs.add(existing);
                            }
                        }
                        // add the Version attribute with its new value
                        newAttrs.add(eventFactory.createAttribute(versionAttr.getName(), newVersion));

                        // we're writing our own event instead of the existing one
                        useExistingEvent = false;
                        writer.add(eventFactory.createStartElement(
                                elemEvent.getName(), newAttrs.iterator(), elemEvent.getNamespaces()));
                    }
                }

                // pass the unmodified event through
                if (useExistingEvent) {
                    writer.add(event);
                }
            }
            writer.close();
            eventReader.close();
        }
    }
}
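
If the goal is only to look at the current value rather than rewrite the file, a streaming parser can also stop as soon as it reaches the element of interest, which fits the case where the value sits near the top of the file. The following is a minimal sketch using the StAX cursor API (`XMLStreamReader`); the class name `PeekVersion`, the helper `readProjectVersion`, and the sample XML are made up for illustration, and the `Project` element and `Version` attribute names are taken from the code above.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class PeekVersion {

    /** Returns the Version attribute of the first Project element, or null if there is none. */
    public static String readProjectVersion(Reader in) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Project".equals(reader.getLocalName())) {
                    // Return right away; no further events are pulled from the document.
                    return reader.getAttributeValue(null, "Version");
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample; the real file's structure is an assumption.
        String sample =
            "<PremiereData Version=\"3\">"
            + "<Project ObjectID=\"1\" Version=\"25\"><Name>demo</Name></Project>"
            + "</PremiereData>";
        System.out.println(readProjectVersion(new StringReader(sample))); // prints 25
    }
}
```

Changing the value still requires copying the rest of the file to the output, as the event-based code above does, because the new value may differ in length from the old one.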