I'm stuck on the following problem.
I can be handed any kind of XML, and I have to parse it.
I can already parse "flat" XML successfully. For example:
<emp>
<id>1</id>
<name>foo</name>
<age>22</age>
</emp>
My simple parser handles this fine (note that the schema is variable: it can be any flat XML, nothing is hardcoded).
But it fails on nested XML content, such as:
<emplist>
<emp>
<manager>
<id>1</id>
<name>foo</name>
</manager>
</emp>
<emp>
<clerk>
<cid>1</cid>
<cname>foo</cname>
</clerk>
</emp>
</emplist>
The output I currently get for this case is:
id,1
name,foo
But I want:
id, 1
name, foo
cid, 1
cname, foo
How can I flatten this? Thanks.
Current code:
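import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;
// JDOM imports; assuming JDOM 2.x here (for JDOM 1.x the package is org.jdom instead of org.jdom2)
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
public class XMLReader {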
public static void main(String[] args) throws JDOMException, IOException {
//String xmlString = "<employee >\n <firstname xml:space=\"preserve\" >John</firstname>\n <lastname>Watson</lastname>\n <age>30</age>\n <email>johnwatson@sh.com</email>\n</employee>";
String xmlString = "<employee>\n" +
" <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n" +
" <name>Lareina</name>\n" +
" <age>50</age>\n" +
" </personal><contact><dept>Fusce</dept>\n" +
" <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n" +
" </employee>";
System.out.println(xmlString);
SAXBuilder builder = new SAXBuilder();
Reader in = new StringReader(xmlString);
Document doc = builder.build(in);
Element root = doc.getRootElement();
List children = root.getChildren();
//System.out.println(children);
String value = "";
for (int i = 0; i < children.size(); i++) {
Element dataNode = (Element) children.get(i);
// Element dataNode = (Element) dataNodes.get(j);
value += ", " +dataNode.getText().trim();
System.out.println(dataNode.getName() + " : " + dataNode.getText());
//context.write(new Text(rowKey.toString()), new Text(node.getName().trim() + " " + node.getText().trim()));
}
//System.out.println(in);
}
}
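Answer 0 (score: 2)
Here is a simple implementation based on StAX rather than DOM. If you prefer, you can easily convert it to DOM (you would need to use recursion).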
import java.io.IOException;
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
public class FlattenXmlExample
{
private static XMLInputFactory inFactory = XMLInputFactory.newInstance();
public static void main(String[] args) throws XMLStreamException, IOException
{
String xmlRecord =
"<emplist>\n" +
"<emp>\n" +
" <manager>\n" +
" <id>1</id>\n" +
" <name>foo</name>\n" +
" </manager>\n" +
"</emp>\n" +
"<emp>\n" +
" <clerk>\n" +
" <cid>1</cid>\n" +
" <cname>foo</cname>\n" +
" </clerk>\n" +
"</emp>\n" +
" </emplist>";
String flatXmlRecord = flattenXmlRecord(xmlRecord);
System.out.print(flatXmlRecord);
}
private static String flattenXmlRecord(final String xmlRecord) throws XMLStreamException
{
StringBuilder flatXmlRecord = new StringBuilder();
XMLEventReader eventReader = inFactory.createXMLEventReader(new StringReader(xmlRecord));
while (eventReader.hasNext())
{
XMLEvent event = eventReader.nextEvent();
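if (event.getEventType() == XMLEvent.START_ELEMENT )
{
String elementName = event.asStartElement().getName().getLocalPart();
// Peek instead of consuming the next event: if it is a nested start element
// (e.g. <a><b>x</b></a> with no whitespace in between), the loop must still see it.
XMLEvent next = eventReader.peek();
if(next != null && next.getEventType() == XMLEvent.CHARACTERS)
{
String text = next.asCharacters().getData().trim();
if(!text.isEmpty())
{
flatXmlRecord.append(elementName).append(", ").append(text).append("\n");
}
}
}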
}
return flatXmlRecord.toString();
}
}
Input:
<emplist>
<emp>
<manager>
<id>1</id>
<name>foo</name>
</manager>
</emp>
<emp>
<clerk>
<cid>1</cid>
<cname>foo</cname>
</clerk>
</emp>
</emplist>
Output:
id, 1
name, foo
cid, 1
cname, foo
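
For the recursive DOM variant mentioned above, here is a rough sketch using only the JDK's built-in `javax.xml.parsers` and `org.w3c.dom` APIs rather than JDOM. The class name `FlattenXmlDomSketch` and the helper `collectLeaves` are made up for illustration. It assumes a "leaf" is any element with no child elements, and that leaves with empty text should be skipped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch, not the answer author's code.
public class FlattenXmlDomSketch {

    // Parses the XML string and returns one "name, value" line per leaf element.
    public static List<String> flatten(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        List<String> lines = new ArrayList<>();
        collectLeaves(doc.getDocumentElement(), lines);
        return lines;
    }

    // Recurses into child elements; records an element only if it has no child elements.
    private static void collectLeaves(Element element, List<String> lines) {
        NodeList children = element.getChildNodes();
        boolean hasChildElements = false;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElements = true;
                collectLeaves((Element) child, lines);
            }
        }
        if (!hasChildElements) {
            // Assumption: leaves whose text is empty after trimming are skipped.
            String text = element.getTextContent().trim();
            if (!text.isEmpty()) {
                lines.add(element.getTagName() + ", " + text);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<emplist><emp><manager><id>1</id><name>foo</name></manager></emp>"
            + "<emp><clerk><cid>1</cid><cname>foo</cname></clerk></emp></emplist>";
        for (String line : flatten(xml)) {
            System.out.println(line);
        }
    }
}
```

Because the recursion descends to any depth, the nesting level of the input should not matter. The trade-off compared with StAX is that the whole document is held in memory, which may be a concern for very large inputs.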