Parsing a large XML node in Java

Date: 2013-11-11 04:42:18

Tags: java xml actionscript-3 parsing

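Coming from ActionScript 3's XML parsing, I find the Java way a bit overwhelming. With E4X I can use dot notation, tag names, and conditions to reach the nodes I need. I don't see such an option in Java, and most of the examples I found online don't go beyond the basics.

Am I on the right track using the DOM parser, or should I try another XML parser?

I have a large XML file (I trimmed it down to fit here) that I have to parse.

How do I get the values at the XML node CANDELA -> ECUDOC -> ECU -> VAR -> DIAGCLASS -> DIAGINST? Is it possible to get a collection of nodes, like an XMLList in ActionScript?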

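import java.io.FileInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
try {
    DocumentBuilder builder = builderFactory.newDocumentBuilder();
    Document document = builder.parse(new FileInputStream("pathToXML"));

    // "*" matches every element in the document, not only the children of ECU
    NodeList nodes = document.getElementsByTagName("*");

    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i).getNodeName());
    }
    System.out.println("Total number of elements: " + nodes.getLength());
} catch (ParserConfigurationException | SAXException | IOException e) {
    // parse() can also fail with SAXException or IOException
    e.printStackTrace();
}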

<?xml version='1.0' encoding='iso-8859-1' standalone='no'?>
<!DOCTYPE CANDELA SYSTEM 'candela.dtd'>
<CANDELA dtdvers='2.0.5'>
    <ECUDOC doctype='inst' manufacturer='no' mid='323232' saveno='59' languages='(en-US,de-DE)' uptodateLanguages='(en-US)' jobfileext=''>
        <ATTRCATS>
            <ATTRCAT id='_0x01ded158'>
                <NAME>
                    <TUV xml:lang='en-US'>Time</TUV>
                    <TUV xml:lang='de-DE'>Zeit</TUV>
                </NAME>
                <QUAL>Zeit</QUAL>
            </ATTRCAT>
        </ATTRCATS>
        <AUTHORS>
            <AUTHOR id='_0x01dfae70' obs='0'>
                <LASTNAME>Rätz</LASTNAME>
                <FIRSTNAME>Christoph</FIRSTNAME>
            </AUTHOR>
        </AUTHORS>
        <ECU id='_0x01da31c8'>
            <NAME>
                <TUV xml:lang='en-US'>Any ECU example</TUV>
                <TUV xml:lang='de-DE'>Ein Beispiel-Steuergerät</TUV>
            </NAME>
            <DESC>
                <TUV xml:lang='en-US' struct='1'>
                   <PARA>
                        <FC>This is an manufacturer independent example to demonstrate the usage of CANdelaStudio.</FC>
                    </PARA>
                    <PARA>
                        <FC></FC>
                    </PARA>
                    <PARA>
                        <FC>This example is based on the Vector document template (manufacturer independent).</FC>
                    </PARA>
                    <PARA>
                        <FC>For a concrete project, we recommend to use a manufacturer specific document template, which must be generated by Vector at the time.</FC>
                    </PARA>
                </TUV>
                <TUV xml:lang='de-DE' struct='1'>
                    <PARA>
                        <FC>Dies ist ein herstellerunabhängiges Beispiel. Es zeigt die Verwendung von CANdelaStudio.</FC>
                    </PARA>
                    <PARA>
                        <FC></FC>
                    </PARA>
                    <PARA>
                        <FC>Das Beispiel basiert auf der Hersteller-unabhängigen Vector-Dokumentvorlage.</FC>
                    </PARA>
                    <PARA>
                        <FC>Für ein konkretes Projekt sollten Sie eine Hersteller-spezifische Dokumentvorlage verwenden. Diese wird (zur Zeit noch) von Vector erstellt.</FC>
                    </PARA>
                </TUV>
            </DESC>
            <QUAL>Any_ECU_example</QUAL>
            <UNS attrref='_0x01233bc8' v='513'/>
            <UNS attrref='_0x01235af8' v='1025'/>
            <ENUM attrref='_0x01ddc9c0' v='0'/>
            <VAR id='_0x01dae6f0' base='1'>
                <NAME>
                    <TUV xml:lang='en-US'>Common Diagnostics</TUV>
                    <TUV xml:lang='de-DE'>Grundumfang</TUV>
                </NAME>
                <DESC>
                    <TUV xml:lang='en-US' struct='1'>
                        <PARA>
                            <FC fs='0'>Base model which all variants of the ECU must support</FC>
                        </PARA>
                    </TUV>
                    <TUV xml:lang='de-DE' struct='1'>
                        <PARA>
                            <FC fs='0'>Grundumfang, den alle Varianten des Steuergerätes unterstützen</FC>
                        </PARA>
                    </TUV>
                </DESC>
                <QUAL>COMMON_DIAGNOSTICS</QUAL>
                <DIAGCLASS id='_0x01db0320' tmplref='_0x01dce558'>
                    <NAME>
                        <TUV xml:lang='en-US'>Start Session</TUV>
                        <TUV xml:lang='de-DE'>Sitzungen starten</TUV>
                    </NAME>
                    <QUAL>START_SESSION</QUAL>
                    <DIAGINST id='_0x01dd0598' tmplref='_0x01dce558' req='0'>
                        <NAME>
                            <TUV xml:lang='en-US'>Default Session (OBDII)</TUV>
                            <TUV xml:lang='de-DE'>Default Session (OBDII)</TUV>
                        </NAME>
                        <QUAL>DEFAULT_SESSION</QUAL>
                        <SERVICE id='_0x01dd0720' tmplref='_0x01dce630' func='0' phys='1' mresp='0' respOnPhys='1' respOnFunc='0' req='0'>
                            <NAME>
                                <TUV xml:lang='en-US'>Start</TUV>
                                <TUV xml:lang='de-DE'>Starten</TUV>
                            </NAME>
                            <QUAL>Start</QUAL>
                        </SERVICE>
                        <STATICVALUE shstaticref='_0x01dbebb0' v='129'/>
                        <SIMPLECOMPCONT shproxyref='_0x01dbec18'>
                            <SPECDATAOBJ id='_0x01dfd658' spec='rc'>
                                <NAME>
                                    <TUV xml:lang='en-US'>Negative response codes</TUV>
                                    <TUV xml:lang='de-DE'>Negative response codes</TUV>
                                </NAME>
                                <QUAL>NRC</QUAL>
                                <TEXTTBL id='_0x01dda9c8' bm='4294967295'>
                                    <NAME>
                                        <TUV xml:lang='en-US'>LocalTable</TUV>
                                        <TUV xml:lang='de-DE'>LocalTable</TUV>
                                    </NAME>
                                    <QUAL>LocalTable</QUAL>
                                    <CVALUETYPE bl='8' bo='21' enc='uns' sig='0' df='hex' qty='atom' sz='no' minsz='0' maxsz='255'/>
                                    <PVALUETYPE bl='8' bo='21' enc='asc' sig='0' df='text' qty='field' sz='no' minsz='0' maxsz='255'/>
                                    <TEXTMAP s='16' e='16'>
                                        <TEXT>
                                            <TUV xml:lang='en-US'>General reject</TUV>
                                            <TUV xml:lang='de-DE'>Allgemeine Verweigerung</TUV>
                                        </TEXT>
                                    </TEXTMAP>
                                    <TEXTMAP s='18' e='18'>
                                        <TEXT>
                                            <TUV xml:lang='en-US'>Subfunction not supported - invalid format</TUV>
                                            <TUV xml:lang='de-DE'>Unterfunktion nicht unterstützt oder ungültiges Format</TUV>
                                        </TEXT>
                                    </TEXTMAP>
                                    <TEXTMAP s='120' e='120'>
                                        <TEXT>
                                            <TUV xml:lang='en-US'>Request correctly received - response pending</TUV>
                                            <TUV xml:lang='de-DE'>Anforderung erhalten - Antwort steht aus</TUV>
                                        </TEXT>
                                    </TEXTMAP>
                                    <TEXTMAP s='128' e='128'>
                                        <TEXT>
                                            <TUV xml:lang='en-US'>Service not supported in active diagnostic mode</TUV>
                                            <TUV xml:lang='de-DE'>Service nicht unterstützt in aktiver Session</TUV>
                                        </TEXT>
                                    </TEXTMAP>
                                </TEXTTBL>
                            </SPECDATAOBJ>
                        </SIMPLECOMPCONT>
                    </DIAGINST>
                    <DIAGINST id='_0x01dd1520' tmplref='_0x01dce558' req='0'>
                        <NAME>
                            <TUV xml:lang='en-US'>Programming Session</TUV>
                            <TUV xml:lang='de-DE'>Programming Session</TUV>
                        </NAME>
                        <QUAL>ProgrammingSession</QUAL>
                        <SERVICE id='_0x01dd1660' tmplref='_0x01dce630' func='0' phys='1' mresp='0' respOnPhys='1' respOnFunc='0' req='0'>
                            <NAME>
                                <TUV xml:lang='en-US'>Start</TUV>
                                <TUV xml:lang='de-DE'>Starten</TUV>
                            </NAME>
                            <QUAL>Start</QUAL>
                        </SERVICE>
                        <STATICVALUE shstaticref='_0x01dbebb0' v='133'/>
                        <SIMPLECOMPCONT shproxyref='_0x01dbec18'>
                            <SPECDATAOBJ id='_0x01daf378' spec='rc'>
                                <NAME>
                                    <TUV xml:lang='en-US'>Negative response codes</TUV>
                                    <TUV xml:lang='de-DE'>Negative response codes</TUV>
                                </NAME>
                                <QUAL>NRC</QUAL>
                                <TEXTTBL id='_0x01dd9da8' bm='4294967295'>
                                    <NAME>
                                        <TUV xml:lang='en-US'>LocalTable</TUV>
                                        <TUV xml:lang='de-DE'>LocalTable</TUV>
                                    </NAME>
                                    <QUAL>LocalTable</QUAL>
                                    <CVALUETYPE bl='8' bo='21' enc='uns' sig='0' df='hex' qty='atom' sz='no' minsz='0' maxsz='255'/>
                                    <PVALUETYPE bl='8' bo='21' enc='asc' sig='0' df='text' qty='field' sz='no' minsz='0' maxsz='255'/>
                                    <TEXTMAP s='16' e='16'>
                                        <TEXT>
                                            <TUV xml:lang='en-US'>General reject</TUV>
                                            <TUV xml:lang='de-DE'>Allgemeine Verweigerung</TUV>
                                        </TEXT>
                                    </TEXTMAP>
                                    <TEXTMAP s='18' e='18'>
                                        <TEXT>
                                            <TUV xml:lang='en-US'>Subfunction not supported - invalid format</TUV>
                                            <TUV xml:lang='de-DE'>Unterfunktion nicht unterstützt oder ungültiges Format</TUV>
                                        </TEXT>
                                    </TEXTMAP>
                                    <TEXTMAP s='120' e='120'>
                                        <TEXT>
                                            <TUV xml:lang='en-US'>Request correctly received - response pending</TUV>
                                            <TUV xml:lang='de-DE'>Anforderung erhalten - Antwort steht aus</TUV>
                                        </TEXT>
                                    </TEXTMAP>
                                    <TEXTMAP s='128' e='128'>
                                        <TEXT>
                                            <TUV xml:lang='en-US'>Service not supported in active diagnostic mode</TUV>
                                            <TUV xml:lang='de-DE'>Service nicht unterstützt in aktiver Session</TUV>
                                        </TEXT>
                                    </TEXTMAP>
                                </TEXTTBL>
                            </SPECDATAOBJ>
                        </SIMPLECOMPCONT>
                    </DIAGINST>
                </DIAGCLASS>
                <DIAGINST id='_0x01dd2458' tmplref='_0x01dbec98' req='0'>
                    <NAME>
                        <TUV xml:lang='en-US'>Stop Session</TUV>
                        <TUV xml:lang='de-DE'>Sitzungen beenden</TUV>
                    </NAME>
                    <QUAL>STOP_SESSION</QUAL>
                    <SERVICE id='_0x01dd2598' tmplref='_0x01dbed70' func='0' phys='1' mresp='0' respOnPhys='1' respOnFunc='0' req='0'>
                        <NAME>
                            <TUV xml:lang='en-US'>Stop</TUV>
                            <TUV xml:lang='de-DE'>Beenden</TUV>
                        </NAME>
                        <QUAL>Stop</QUAL>
                    </SERVICE>
                    <SIMPLECOMPCONT shproxyref='_0x01dbee18'>
                        <SPECDATAOBJ id='_0x01dda908' spec='rc'>
                            <NAME>
                                <TUV xml:lang='en-US'>Negative response codes</TUV>
                                <TUV xml:lang='de-DE'>Negative response codes</TUV>
                            </NAME>
                            <QUAL>NRC</QUAL>
                            <TEXTTBL id='_0x01237e70' bm='4294967295'>
                                <NAME>
                                    <TUV xml:lang='en-US'>LocalTable</TUV>
                                    <TUV xml:lang='de-DE'>LocalTable</TUV>
                                </NAME>
                                <QUAL>LocalTable</QUAL>
                                <CVALUETYPE bl='8' bo='21' enc='uns' sig='0' df='hex' qty='atom' sz='no' minsz='0' maxsz='255'/>
                                <PVALUETYPE bl='8' bo='21' enc='asc' sig='0' df='text' qty='field' sz='no' minsz='0' maxsz='255'/>
                                <TEXTMAP s='16' e='16'>
                                    <TEXT>
                                        <TUV xml:lang='en-US'>General reject</TUV>
                                        <TUV xml:lang='de-DE'>Allgemeine Verweigerung</TUV>
                                    </TEXT>
                                </TEXTMAP>
                                <TEXTMAP s='18' e='18'>
                                    <TEXT>
                                        <TUV xml:lang='en-US'>Subfunction not supported - invalid format</TUV>
                                        <TUV xml:lang='de-DE'>Unterfunktion nicht unterstützt oder ungültiges Format</TUV>
                                    </TEXT>
                                </TEXTMAP>
                                <TEXTMAP s='120' e='120'>
                                    <TEXT>
                                        <TUV xml:lang='en-US'>Request correctly received - response pending</TUV>
                                        <TUV xml:lang='de-DE'>Anforderung erhalten - Antwort steht aus</TUV>
                                    </TEXT>
                                </TEXTMAP>
                                <TEXTMAP s='128' e='128'>
                                    <TEXT>
                                        <TUV xml:lang='en-US'>Service not supported in active diagnostic mode</TUV>
                                        <TUV xml:lang='de-DE'>Service nicht unterstützt in aktiver Session</TUV>
                                    </TEXT>
                                </TEXTMAP>
                            </TEXTTBL>
                        </SPECDATAOBJ>
                    </SIMPLECOMPCONT>
                </DIAGINST>
            </VAR>
        </ECU>
    </ECUDOC>
</CANDELA>

2 answers:

Answer 0 (score: 2)

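If DOM parsing works for data of this size in ActionScript, then DOM parsing will work in Java too; however, I would not recommend the old and clunky standard w3c API. A modern DOM library such as jdom2 offers more flexibility, for example retrieving child nodes by name. Here is an example using jdom2. All the getChild() calls could of course be replaced with XPath.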

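import java.io.File;
import java.util.List;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.input.SAXBuilder;

// build() throws JDOMException and IOException; handle or declare them
Document doc = new SAXBuilder().build(new File("CANDELA.xml"));

// getChild() returns the first child with that name; getChildren() returns all of them
List<Element> list = doc.getRootElement()
        .getChild("ECUDOC")
        .getChild("ECU")
        .getChild("VAR")
        .getChild("DIAGCLASS")
        .getChildren("DIAGINST");
System.out.println(list.size() + " DIAGINST nodes");

// Print the id attribute and the QUAL text of each DIAGINST
for (Element node : list) {
    System.out.println(node.getAttribute("id").getValue()
            + " = " + node.getChildText("QUAL"));
}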
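
For comparison, here is a minimal sketch of the XPath route mentioned above, using only the javax.xml.xpath API that ships with the JDK rather than jdom2. The class name XPathDiagInst and the method diagInstSummaries are made up for illustration, and the sample string is a cut-down stand-in for the real file. It leaves out the DOCTYPE line, since a parser will typically try to load candela.dtd when it sees one.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any library.
class XPathDiagInst {

    // Returns "id = QUAL" for every DIAGINST that sits directly under a DIAGCLASS.
    static List<String> diagInstSummaries(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // An absolute path selects the whole node set in one call.
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/CANDELA/ECUDOC/ECU/VAR/DIAGCLASS/DIAGINST",
                    doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element inst = (Element) nodes.item(i);
                // A relative expression is evaluated against the current DIAGINST.
                String qual = xpath.evaluate("QUAL", inst);
                result.add(inst.getAttribute("id") + " = " + qual);
            }
            return result;
        } catch (ParserConfigurationException | SAXException
                | IOException | XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<CANDELA><ECUDOC><ECU><VAR><DIAGCLASS>"
                + "<DIAGINST id='a'><QUAL>DEFAULT_SESSION</QUAL></DIAGINST>"
                + "<DIAGINST id='b'><QUAL>ProgrammingSession</QUAL></DIAGINST>"
                + "</DIAGCLASS></VAR></ECU></ECUDOC></CANDELA>";
        diagInstSummaries(xml).forEach(System.out::println);
    }
}
```

The returned NodeList plays roughly the role of an XMLList in E4X: one expression yields all matching nodes, and an expression such as //DIAGINST would pick them up at any depth instead.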

Answer 1 (score: 1)

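A DOM parser keeps the whole document in memory, which is usually not suitable for large files. It makes up for that by being very easy to use, so if you can fit the whole file in memory, I would recommend it.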
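
To illustrate what walking the in-memory tree looks like with the standard org.w3c.dom API, here is a rough sketch. The w3c API has no built-in "child by name" call, so the helpers children, select, and parseRoot (and the class name DomChildWalk) are invented for this example. The sample XML is a tiny stand-in for the real file and omits the DOCTYPE.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any library.
class DomChildWalk {

    // Direct child elements of parent with the given tag name (descendants are ignored).
    static List<Element> children(Element parent, String name) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Follows a path of tag names below root, keeping every match at each step.
    static List<Element> select(Element root, String... path) {
        List<Element> current = Collections.singletonList(root);
        for (String name : path) {
            List<Element> next = new ArrayList<>();
            for (Element e : current) {
                next.addAll(children(e, name));
            }
            current = next;
        }
        return current;
    }

    // Parses the XML string fully into memory and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot("<CANDELA><ECUDOC><ECU><VAR><DIAGCLASS>"
                + "<DIAGINST id='a'/><DIAGINST id='b'/>"
                + "</DIAGCLASS></VAR></ECU></ECUDOC></CANDELA>");
        for (Element inst : select(root, "ECUDOC", "ECU", "VAR", "DIAGCLASS", "DIAGINST")) {
            System.out.println(inst.getAttribute("id"));
        }
    }
}
```

Unlike getElementsByTagName, which searches all descendants, children only looks one level down, so the path has to be spelled out step by step.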

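Otherwise, another good option is SAX. http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/ is a good tutorial on how to use it. SAX does not load the whole document into memory; instead it uses callbacks to let you respond to the different tags.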
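
As a rough sketch of that callback style (separate from the linked tutorial), the handler below reacts to each opening DIAGINST tag and records its id attribute. The class name SaxDiagInst and the method diagInstIds are made up for illustration. Note that, as written, it matches DIAGINST at any depth, not only under DIAGCLASS; tracking the current path would need extra state in the handler.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class SaxDiagInst {

    // Streams through the XML and collects the id of every DIAGINST element.
    static List<String> diagInstIds(String xml) {
        List<String> ids = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            // Called by the parser once for every opening tag, in document order.
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("DIAGINST".equals(qName)) {
                    ids.add(attrs.getValue("id"));
                }
            }
        };

        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String xml = "<CANDELA><ECUDOC><ECU><VAR><DIAGCLASS>"
                + "<DIAGINST id='a'/><DIAGINST id='b'/>"
                + "</DIAGCLASS><DIAGINST id='c'/></VAR></ECU></ECUDOC></CANDELA>";
        System.out.println(diagInstIds(xml));
    }
}
```

Only the collected ids are held in memory here; the parser never builds a tree, which is the property that matters for very large files.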

Hope this is a useful starting point!