Flatten any XML record in Java

Time: 2013-09-27 18:16:42

Tags: java xml

I'm having trouble solving the following problem.

I can be handed any kind of XML, and I have to parse it.

I can already parse "flat" XML successfully. For example:

<emp>
<id>1</id>
<name>foo</name>
<age>22</age>
</emp>

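My simple parser works fine for this (note that the schema is variable: it handles any flat XML, with nothing hard-coded).

But it fails on nested XML content, such as: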

<emplist>
<emp>
   <manager>
   <id>1</id>
   <name>foo</name>
   </manager>
</emp>
<emp>
   <clerk>
   <cid>1</cid>
   <cname>foo</cname>
   </clerk>
</emp>
 </emplist>

For this input, the output I currently get is:

id,1
name,foo

But I want:

id, 1
name, foo
cid, 1
cname,foo

How do I flatten this? Thanks.

Current code:

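import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;

// Assumption: JDOM 1.x package names; with JDOM 2 these would be org.jdom2.*
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

public class XMLReader {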
    public static void main(String[] args) throws JDOMException, IOException {

        //String xmlString = "<employee >\n <firstname xml:space=\"preserve\" >John</firstname>\n <lastname>Watson</lastname>\n <age>30</age>\n <email>johnwatson@sh.com</email>\n</employee>";
        String xmlString = "<employee>\n" + 
                "       <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n" + 
                "       <name>Lareina</name>\n" + 
                "       <age>50</age>\n" + 
                "       </personal><contact><dept>Fusce</dept>\n" + 
                "       <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n" + 
                "   </employee>";
        System.out.println(xmlString);


        SAXBuilder builder = new SAXBuilder();
        Reader in = new StringReader(xmlString);

        Document doc = builder.build(in);
        Element root = doc.getRootElement();
        List children = root.getChildren();
        //System.out.println(children);
        String value = "";
        for (int i = 0; i < children.size(); i++) {

                Element dataNode = (Element) children.get(i);
               // Element dataNode = (Element) dataNodes.get(j);
                value += ", " +dataNode.getText().trim();
                System.out.println(dataNode.getName() + " : " + dataNode.getText());

                //context.write(new Text(rowKey.toString()), new Text(node.getName().trim() + " " + node.getText().trim()));

            }
        //System.out.println(in);



    }
}

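1 answer:

Answer 0: (score: 2)

Here is a simple implementation based on StAX rather than DOM. If you prefer, you can easily convert it to DOM (you would need to use recursion).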

import java.io.IOException;
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;


public class FlattenXmlExample
{
  private static XMLInputFactory inFactory = XMLInputFactory.newInstance();

  public static void main(String[] args) throws XMLStreamException, IOException
  {

    String xmlRecord =
        "<emplist>\n" +
        "<emp>\n" +
        "   <manager>\n" +
        "   <id>1</id>\n" +
        "   <name>foo</name>\n" +
        "   </manager>\n" +
        "</emp>\n" +
        "<emp>\n" +
        "   <clerk>\n" +
        "   <cid>1</cid>\n" +
        "   <cname>foo</cname>\n" +
        "   </clerk>\n" +
        "</emp>\n" +
        " </emplist>";

    String flatXmlRecord = flattenXmlRecord(xmlRecord);

    System.out.print(flatXmlRecord);
  }

  private static String flattenXmlRecord(final String xmlRecord) throws XMLStreamException
  {
    StringBuilder flatXmlRecord = new StringBuilder();

    XMLEventReader eventReader = inFactory.createXMLEventReader(new StringReader(xmlRecord));

    while (eventReader.hasNext())
    {
      XMLEvent event = eventReader.nextEvent();

      if (event.getEventType() == XMLEvent.START_ELEMENT )
      {
        String elementName = event.asStartElement().getName().getLocalPart();


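        // Peek instead of consuming: if the next event is a nested start tag
        // (e.g. <personal><id>...), the loop must still see it on its next pass.
        event = eventReader.peek();
        if(event != null && event.getEventType() == XMLEvent.CHARACTERS)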
        {
          if(!event.asCharacters().getData().trim().isEmpty())
          {
            flatXmlRecord.append(elementName + ", " + event.asCharacters().getData() + "\n");
          }
        }
      }
    }

    return flatXmlRecord.toString();
  }
}

Input:

<emplist>
<emp>
   <manager>
   <id>1</id>
   <name>foo</name>
   </manager>
</emp>
<emp>
   <clerk>
   <cid>1</cid>
   <cname>foo</cname>
   </clerk>
</emp>
 </emplist>

Output:

id, 1
name, foo
cid, 1
cname, foo
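
The answer mentions that the same idea can be done with DOM using recursion. A minimal sketch of that variant follows, assuming the JDK's built-in `javax.xml.parsers` DOM parser rather than JDOM. The class name `FlattenXmlDomExample` and the helper `collectLeaves` are made up for illustration. It treats any element with no child elements as a leaf and prints its name and trimmed text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FlattenXmlDomExample
{
  public static void main(String[] args) throws Exception
  {
    String xmlRecord =
        "<emplist>\n" +
        "<emp><manager><id>1</id><name>foo</name></manager></emp>\n" +
        "<emp><clerk><cid>1</cid><cname>foo</cname></clerk></emp>\n" +
        "</emplist>";

    System.out.print(flattenXmlRecord(xmlRecord));
  }

  public static String flattenXmlRecord(String xmlRecord) throws Exception
  {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(new InputSource(new StringReader(xmlRecord)));

    StringBuilder flat = new StringBuilder();
    collectLeaves(doc.getDocumentElement(), flat);
    return flat.toString();
  }

  // Hypothetical helper: recurses into child elements and emits "name, text"
  // for every element that has no child elements of its own.
  private static void collectLeaves(Element element, StringBuilder flat)
  {
    boolean hasChildElement = false;
    NodeList children = element.getChildNodes();

    for (int i = 0; i < children.getLength(); i++)
    {
      Node child = children.item(i);
      if (child.getNodeType() == Node.ELEMENT_NODE)
      {
        hasChildElement = true;
        collectLeaves((Element) child, flat);
      }
    }

    if (!hasChildElement)
    {
      String text = element.getTextContent().trim();
      if (!text.isEmpty())
      {
        flat.append(element.getTagName()).append(", ").append(text).append("\n");
      }
    }
  }
}
```

Because the recursion only looks at whether an element has child elements, the nesting depth and the tag names do not need to be known in advance, which matches the "any schema" requirement in the question. Unlike the StAX version, this sketch loads the whole document into memory, so it is probably better suited to small records.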