XML parsing with Java (w3.org)

Time: 2015-06-29 04:01:26

Tags: java xml

I need to parse an XML file. Part of the XML is shown below:

<?xml version="1.0" encoding="utf-8"?>
<Document>
    <Sitemap>
        <TreeMap>
            <RootNodes>
                <TreeMapNode>
                    <NodeType>PackageHandle</NodeType>
                    <NodeValue>Page</NodeValue>
                    <ChildNodes />
                </TreeMapNode>
            </RootNodes>
        </TreeMap>
    </Sitemap>

    <Mastermap>
        <TreeMap>
            <RootNodes>
                <TreeMapNode>
                    <NodeType>Folder</NodeType>
                    <NodeValue>Template</NodeValue>
                    <ChildNodes>
                        <TreeMapNode>
                            <NodeType>PackageHandle</NodeType>
                            <NodeValue>Master Page</NodeValue>
                            <ChildNodes />
                        </TreeMapNode>
                    </ChildNodes>
                </TreeMapNode>
            </RootNodes>
        </TreeMap>
    </Mastermap>

    <Pages>
        <Page>
            <Diagram>
                <Widgets>
                    <Image>
                        <Name/>
                        <Rectangle>
                            <Rectangle X="0" Y="4" Width="130" Height="28" />
                        </Rectangle>
                        <Bold>False</Bold>
                        <BorderColor>Color(argb) = (255, 0, 0, 0)</BorderColor>
                        <BorderWidth>-1</BorderWidth>
                        <FillColor>Color(argb) = (255, 255, 255, 255)</FillColor>
                        <FontName>Arial</FontName>
                        <FontSize>9.75</FontSize>
                        <ForeColor>Color(argb) = (255, 0, 0, 0)</ForeColor>
                        <HorizontalAlignment>Center</HorizontalAlignment>
                        <Italic>False</Italic>
                        <Underline>False</Underline>
                        <VerticalAlignment>Center</VerticalAlignment>
                        <Widgets>
                            <TextPanel>
                                <Html>&lt;p style="font-size:13px;text-align:center;line-height:normal;"&gt;&lt;span style="font-family:'Arial Regular', 'Arial';font-weight:400;font-style:normal;font-size:13px;color:#000000;text-align:center;line-height:normal;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;</Html>
                                <Name />
                                <Rectangle>
                                    <Rectangle X="2" Y="6" Width="126" Height="16" />
                                </Rectangle>
                                <Bold>False</Bold>
                                <BorderColor>Color(argb) = (255, 0, 0, 0)</BorderColor>
                                <BorderWidth>-1</BorderWidth>
                                <FillColor>Color(argb) = (255, 255, 255, 255)</FillColor>
                                <FontName>Arial</FontName>
                                <FontSize>9.75</FontSize>
                                <ForeColor>Color(argb) = (255, 0, 0, 0)</ForeColor>
                                <HorizontalAlignment>Center</HorizontalAlignment>
                                <Italic>False</Italic>
                                <Underline>False</Underline>
                                <VerticalAlignment>Center</VerticalAlignment>
                            </TextPanel>
                        </Widgets>
                    </Image>
                    <Shape>

I need to read it and write it back out in a required XML format. My code is as follows:

    public static void main(String[] args)
            throws SAXException, IOException, ParserConfigurationException, TransformerException {

        DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
        Document document = docBuilder.parse(new File("C:/Users/ve00p5199/Desktop/Axure.xml"));
        NodeList nodeList = document.getElementsByTagName("*");
        System.out.println("total nodes=" + nodeList.getLength());

        for (int i = 0; i < nodeList.getLength(); i++) {
            Node node = nodeList.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                System.out.print(node.getNodeName() + "= ");
                System.out.println(node.getTextContent());
            } else if (node.getNodeType() == Node.ELEMENT_NODE) {
                // do something with the current element
                System.out.print(node.getNodeName() + "= ");
                System.out.println(((Node) node.getChildNodes()).getNodeValue()); // prints null
                System.out.println(node.getNodeValue()); // prints null
            }
        }
    }
}

I want to print each tag together with its value. Please suggest the methods needed to save/print a tag name along with its value.

1 answer:

Answer 0: (score: 1)

You need the following two methods to get the tag name and its text content:

String tag = ((Element) node).getTagName(); // or you can also use node.getNodeName()
String textValue = node.getTextContent();
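To make the difference between these calls concrete, here is a minimal sketch that parses a small fragment modelled on the question's XML. The class name `TextContentDemo` and the helper `parseRoot` are made up for illustration, and the XML string is an abbreviated stand-in for the real file. It shows why `getNodeValue()` prints null for an element, while `getTextContent()` returns the text of all descendants concatenated.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class TextContentDemo {

    /** Hypothetical helper: parses an XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element node = parseRoot(
                "<TreeMapNode><NodeType>Folder</NodeType>"
                + "<NodeValue>Template</NodeValue></TreeMapNode>");

        System.out.println(node.getTagName());     // TreeMapNode
        System.out.println(node.getNodeValue());   // null: element nodes have no node value
        System.out.println(node.getTextContent()); // FolderTemplate: descendants' text, concatenated
    }
}
```

In the DOM, the text of an element lives in its child text nodes, not in the element itself, which is why the original code saw null.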

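If you do not want the text content of descendants, you have to get the child nodes of each node, filter for those of type Node.TEXT_NODE, and print the text content of only those text nodes.

Example: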

else if (node.getNodeType() == Node.ELEMENT_NODE) {
    // do something with the current element
    System.out.print(node.getNodeName()+"= ");
    NodeList cNodes = node.getChildNodes();
    for(int j = 0;j< cNodes.getLength();j++) {
        Node cN = cNodes.item(j);
        if(cN.getNodeType() == Node.TEXT_NODE) {
             System.out.println(cN.getTextContent());
        }
    }
}

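Note that this will also print a lot of text content that consists only of newlines and whitespace (the indentation between elements). You can add your own extra code to filter it out if needed.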
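One possible way to do that filtering is sketched below. It is an assumption about what is wanted, not part of the original answer: the class `DirectTextPrinter` and the methods `directText` and `tagLines` are hypothetical names. The sketch joins an element's direct text children, trims the result, and skips elements whose own text is blank.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DirectTextPrinter {

    /** Returns the element's own text: direct TEXT_NODE children only, joined and trimmed. */
    static String directText(Node element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int j = 0; j < children.getLength(); j++) {
            Node child = children.item(j);
            if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString().trim();
    }

    /** Builds one "tag= value" line for each element that has non-blank direct text. */
    static List<String> tagLines(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList all = doc.getElementsByTagName("*");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                Node node = all.item(i);
                String text = directText(node);
                if (!text.isEmpty()) { // skip elements that hold only indentation
                    lines.add(node.getNodeName() + "= " + text);
                }
            }
            return lines;
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Document>\n  <TreeMapNode>\n"
                + "    <NodeType>Folder</NodeType>\n"
                + "    <NodeValue>Template</NodeValue>\n"
                + "    <ChildNodes />\n  </TreeMapNode>\n</Document>";
        // Prints "NodeType= Folder" and "NodeValue= Template".
        tagLines(xml).forEach(System.out::println);
    }
}
```

Container elements such as `Document` and `TreeMapNode` are omitted from the output because their only direct text is whitespace. If they should still be listed with an empty value, drop the `isEmpty()` check.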