How to get embedded/nested XML from a large XML file using a SAX parser

Asked: 2013-10-24 07:39:44

Tags: java xml saxparser

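We are performing some operations on embedded/nested XML. I am using SAXParser to parse the whole XML file, and I want to get the entire nested XML, with both tags and values. For example, my XML looks like the one below.

I want the entire XML inside the <ANY-ELEMENT> ..... </ANY-ELEMENT> tags.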

<?xml version="1.0" encoding="UTF-8"?>
            <x:xMessage xmlns:x="http://www.connecture.com/integration/x" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                            xsi:schemaLocation="http://www.connecture.com/integration/x xMessageWrapper.xsd
                ">
                <x:xMessageHeader>
                    <Version>850</Version>
                    <Source>Source</Source>
                    <Target>target</Target>
                    <Timestamp>2013-12-31T12:00:00</Timestamp>
                    <RequestID>123456</RequestID>
                    <ResponseID>54321</ResponseID>
                    <Priority>3</Priority>
                    <Username>Deepak</Username>
                    <Password>Kumar</Password>
                </x:xMessageHeader>
                <x:xMessageBody>
                    <ANY-ELEMENT>
                        <xEnveloped_834A1 xsi:schemaLocation="....." xmlns="......."
                            ..........................
                    ..........................
                            some Complex XML
                        ..........................
                        ..........................
                        ..........................

                    </ANY-ELEMENT>

                 </x:xMessageBody>
        </x:xMessage>

Sample code for the handler class:

public class MessageWrapperHandler extends DefaultHandler {


    private boolean bActualMessage = false;
    private String actualMessage = null;
    private long lengthActualMessage=0;



    public void startElement(String uri, String localName, String qName, Attributes attributes) {

      if (qName.equalsIgnoreCase("ANY-ELEMENT")) {
            bActualMessage = true;
            //lengthActualMessage=How to know the length of Child XML
        }
    }
  public void characters(char ch[], int start, int length) {

         if (bActualMessage) {
            actualMessage = new String(ch, start, length);
            //trying to get embedded XML
            bActualMessage = false;
        }
    }

}

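But since the next element is XML content rather than text, this gives me nothing. So how can I achieve this?

Edit: you are free to modify the XML after <ANY-ELEMENT>, for example by wrapping the content in CDATA.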

1 Answer:

Answer 0 (score: 0)

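I'd recommend using StAX instead of SAX (a StAX implementation has been included in the JDK/JRE since Java SE 6). StAX is similar to SAX, except that instead of having events pushed to you, you pull (request) them.

In the code below, the XMLStreamReader is advanced to the ANY-ELEMENT element. Once it is positioned there, you can interact with it as needed.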

import javax.xml.stream.*;
import javax.xml.transform.stream.StreamSource;

public class Demo {

    public static void main(String[] args) throws Exception {
        XMLInputFactory xif = XMLInputFactory.newFactory();

        StreamSource xmlSource = new StreamSource("src/forum19559825/input.xml");
        XMLStreamReader xsr = xif.createXMLStreamReader(xmlSource);

        Demo demo = new Demo();
        demo.positionXMLStreamReaderAtAnyElement(xsr);
        demo.processAnyElement(xsr);
    }

    private void positionXMLStreamReaderAtAnyElement(XMLStreamReader xsr) throws Exception {
        while(xsr.hasNext()) {
            if(xsr.getEventType() == XMLStreamReader.START_ELEMENT && "ANY-ELEMENT".equals(xsr.getLocalName())) {
                break;
            }
            xsr.next();
        }
    }

    private void processAnyElement(XMLStreamReader xmlStreamReaderAtAnyElement) {
        // TODO: Stuff
        System.out.println("FOUND IT");
    }

}
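
One possible way to fill in the processing step is to hand the positioned XMLStreamReader to an identity Transformer through a StAXSource, which serializes the element subtree (tags and values) to a string. The following is only a minimal sketch under some assumptions: the class name NestedXmlExtractor and the method extractNested are made up for illustration, it only captures the first child element of ANY-ELEMENT, and I have not checked how namespace prefixes declared on ancestor elements (such as xsi in the question's XML) are carried over into the extracted fragment.

import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical helper, not part of the original answer.
public class NestedXmlExtractor {

    /**
     * Returns the first child element of ANY-ELEMENT serialized as XML,
     * or null if there is no ANY-ELEMENT or it has no child element.
     */
    public static String extractNested(Reader xml) throws Exception {
        XMLStreamReader xsr = XMLInputFactory.newFactory().createXMLStreamReader(xml);
        try {
            while (xsr.hasNext()) {
                if (xsr.next() == XMLStreamConstants.START_ELEMENT
                        && "ANY-ELEMENT".equals(xsr.getLocalName())) {
                    // nextTag() skips whitespace and moves to the next start or end tag;
                    // it throws if it meets non-whitespace text first.
                    if (xsr.nextTag() != XMLStreamConstants.START_ELEMENT) {
                        return null; // the wrapper element is empty
                    }
                    // An identity transform copies the subtree rooted at the
                    // current START_ELEMENT into the result.
                    Transformer t = TransformerFactory.newInstance().newTransformer();
                    t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                    StringWriter out = new StringWriter();
                    t.transform(new StAXSource(xsr), new StreamResult(out));
                    return out.toString();
                }
            }
            return null; // no ANY-ELEMENT found
        } finally {
            xsr.close();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><ANY-ELEMENT><order id=\"7\"><item>book</item></order>"
                + "</ANY-ELEMENT></root>";
        System.out.println(extractNested(new StringReader(xml)));
    }
}

Since the question is about large files, note that this sketch holds the whole nested document in memory as a String. If the embedded XML is itself large, passing a StreamResult that wraps a file or output stream instead of a StringWriter would avoid that.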