Java 8 - Splitting a huge XML file with StAX gives unexpected results

Time: 2019-04-26 17:51:33

Tags: java stax

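While looking for a way to split a huge XML file, I found a very nice solution that uses StAX and Transformer.transform(). It works, but some tags go missing. Why?

The XML file with Name ... gives the following result. On every even-numbered occasion, the enclosing element tag is omitted.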

Element: <?xml version="1.0" encoding="UTF-8"?><car><name>car1</name></car>
Element: <?xml version="1.0" encoding="UTF-8"?><name>car2</name>
Element: <?xml version="1.0" encoding="UTF-8"?><car><name>car3</name></car>
Element: <?xml version="1.0" encoding="UTF-8"?><name>car4</name>

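How do I get the correct elements? Does the call to transform(s, r) interfere with reading the input stream?

Here is my code (seen in many places, such as this one). Using a StringReader or a FileReader makes no difference.

I expected: loop { advance to the start tag; get access to that element }. What I see: first a complete element, then only part of the second element, and then the pattern repeats.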

String testCars = "<root><car><name>car1</name></car><car><name>car2</name></car><car><name>car3</name></car><car><name>car4</name></car></root>";
String element = "car";
try {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLStreamReader streamReader = factory.createXMLStreamReader(new StringReader(testCars));
    streamReader.nextTag();
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();
    while(streamReader.nextTag() == XMLStreamConstants.START_ELEMENT) {
        StringWriter writer = new StringWriter();
        StreamResult result = new StreamResult(writer);
        t.transform(new StAXSource(streamReader), result);
        System.out.println("Element: " + writer.toString());
    }
} catch (Exception e) { ... }

1 answer:

Answer 0: (score: 1)

Thanks to Andreas, here is the solution, with the test input inlined as a string:

String testCars = "<root><car><name>car1</name></car><other><something>Unknown</something></other><car><name>car2</name></car></root>";
XMLInputFactory factory = XMLInputFactory.newInstance();
try {
    XMLStreamReader streamReader = factory.createXMLStreamReader(new StringReader(testCars));
    streamReader.nextTag(); // move to <root>
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();
    streamReader.nextTag(); // move to the first child of <root>
    // transform() leaves the reader on the event that follows the element it copied,
    // so check the current event first instead of calling nextTag() unconditionally.
    while ( streamReader.isStartElement() ||
          ( ! streamReader.hasNext() && streamReader.nextTag() == XMLStreamConstants.START_ELEMENT)) {
        StringWriter writer = new StringWriter();
        StreamResult result = new StreamResult(writer);
        t.transform(new StAXSource(streamReader), result);
        System.out.println( "XmlElement: " + writer.toString());
    }
} catch (Exception e) { ... }
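
The loop above relies on the input having no whitespace between sibling elements. The following is a minimal sketch of a variant that also skips text and comments between children. It is not part of the original answer: the class name StaxSplitter and the method splitChildren are hypothetical, and it assumes the JDK's default StAX and TrAX implementations. It checks the current event after every transform() and steps forward with next() only when the reader is not already on a start tag.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical helper, for illustration only.
class StaxSplitter {

    // Returns each direct child element of the root as its own XML string.
    static List<String> splitChildren(String xml) {
        List<String> parts = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            Transformer t = TransformerFactory.newInstance().newTransformer();
            // Assumption: the fragments are wanted without an XML declaration.
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            reader.nextTag(); // move to the root start tag
            reader.next();    // step inside the root element

            while (true) {
                if (reader.isStartElement()) {
                    // Copy the current element. Afterwards the reader may already
                    // be on the next sibling's start tag, so do not advance here.
                    StringWriter writer = new StringWriter();
                    t.transform(new StAXSource(reader), new StreamResult(writer));
                    parts.add(writer.toString());
                } else if (reader.hasNext()) {
                    // Whitespace, comment, or the root end tag: skip one event.
                    reader.next();
                } else {
                    break; // END_DOCUMENT reached
                }
            }
            reader.close();
        } catch (XMLStreamException | TransformerException e) {
            throw new IllegalStateException("Failed to split XML", e);
        }
        return parts;
    }

    public static void main(String[] args) {
        String testCars = "<root>\n  <car><name>car1</name></car>\n"
                + "  <other><something>Unknown</something></other>\n"
                + "  <car><name>car2</name></car>\n</root>";
        for (String part : splitChildren(testCars)) {
            System.out.println("XmlElement: " + part);
        }
    }
}
```

For a genuinely huge file, the same loop could write each fragment to its own file through a StreamResult instead of collecting strings in memory.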

The input document, pretty-printed:

<root>
  <car>
    <name>car1</name>
  </car>
  <other>
    <something>Unknown</something>
  </other>
  <car>
    <name>car2</name>
  </car>
</root>