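I'm new to XML parsing. I've read about the DOM and SAX parsers and tried a few sample implementations, but I can't work out how to parse the following XML data: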
<?xml version="1.0" ?>
<collection>
    <action value="submit"/>
    <protocol_version value="1"/>
    <reponse value="Success"/>
    <batch>
        <sample>
            <count value="1"/>
            <count2 value="2"/>
            <count3 value="3"/>
        </sample>
        <sample_2>
            <date value="10/10/2010"/>
            <page value="SampleData"/>
            <track value="123123123"/>
            <same value="1.00"/>
            <data>
                <first_name value="Jeffrey"/>
                <SSID value="1231231231"/>
                <last_name value="Chuckle"/>
                <field1 value="123123123"/>
                <field2 value="Sam E. Bonzella"/>
                <field3 value="SOME VALUE"/>
                <field4 value="SOME VALUE 2"/>
                <field5 value="TEXT"/>
                <field6 value="12312"/>
            </data>
        </sample_2>
    </batch>
</collection>
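Below is the sample code I tried. It works only by repeating the same loop for every tag, and the resulting data is not organized in any way. I also tried a JAXB parser, but I could not get at the value attribute.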
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class test {
    public static void main(String[] args) {
        try {
            File inputFile = new File("staff.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(inputFile);
            doc.getDocumentElement().normalize();
            System.out.println("Base :" + doc.getDocumentElement().getNodeName());

            // Look up one tag by name and print its "value" attribute.
            NodeList nList = doc.getElementsByTagName("action");
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                System.out.println("Element :" + nNode.getNodeName());
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Action : " + eElement.getAttribute("value"));
                }
            }

            // The same loop again for another tag; every further tag needs another copy.
            // (This tag does not occur in the sample above, so nothing is printed for it.)
            nList = doc.getElementsByTagName("transaction_count");
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                System.out.println("Element :" + nNode.getNodeName());
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("transaction_count : " + eElement.getAttribute("value"));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Ideally, I would like to parse the data into an array, or perhaps a Map.
Answer 0 (score: 3)
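getElementsByTagName(String name) is of little use here, because you would have to supply every tag name yourself.
The XML above contains elements that fall into two categories:
- Elements with a value. If I understand you correctly, the tag name and the value should be stored in a Map.
- Elements without a value. They only contain other elements, and their tag names should not be stored.
The elements can be processed recursively. If an element has a "value" attribute, store it in the Map; otherwise, examine that element's child nodes.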
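The "has a value attribute" test depends on one DOM detail worth spelling out: getAttribute returns an empty string when the attribute is absent, so an isEmpty() check cannot tell a missing attribute from value="", whereas hasAttribute can. A minimal sketch of that behaviour follows; the class name ValueAttributeCheck, the helper parseRoot, and the tiny inline documents are made up for illustration and are not part of the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class ValueAttributeCheck {

    // Hypothetical helper: parses an XML snippet and returns its root element.
    public static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        Element leaf = parseRoot("<count value=\"1\"/>");
        Element container = parseRoot("<sample><count value=\"1\"/></sample>");
        Element blank = parseRoot("<count value=\"\"/>");

        // A leaf element reports its attribute value.
        System.out.println(leaf.getAttribute("value"));                 // 1
        // A container has no "value" attribute: getAttribute yields "".
        System.out.println(container.getAttribute("value").isEmpty());  // true
        System.out.println(container.hasAttribute("value"));            // false
        // An explicitly empty attribute also yields "", but hasAttribute is true.
        System.out.println(blank.getAttribute("value").isEmpty());      // true
        System.out.println(blank.hasAttribute("value"));                // true
    }
}
```

For the sample document the two checks agree, because no element carries an empty value attribute; the distinction only matters if value="" is a legitimate entry in your data.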
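import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class XmlValueMap {

    public static void main(String[] argv) {
        // LinkedHashMap keeps the entries in document order.
        Map<String, String> map = new LinkedHashMap<>();
        try {
            File fXmlFile = new File("staff.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(fXmlFile);
            doc.getDocumentElement().normalize();
            NodeList collectionNodeList = doc.getElementsByTagName("collection");
            Element collectionElement = (Element) collectionNodeList.item(0);
            // item(0) is null when the document has no <collection> element.
            if (collectionElement != null) {
                findElementsWithValues(map, collectionElement);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("Found values: " + map.size());
        System.out.println(map);
    }

    // Stores tagName -> value for every descendant that has a non-empty "value"
    // attribute, and recurses into the elements that do not.
    private static void findElementsWithValues(Map<String, String> map, Element rootElement) {
        NodeList childNodes = rootElement.getChildNodes();
        for (int i = 0; i < childNodes.getLength(); i++) {
            Node node = childNodes.item(i);
            // Skip whitespace text nodes and comments.
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                Element element = (Element) node;
                // getAttribute returns "" when the attribute is absent.
                String value = element.getAttribute("value");
                if (!value.isEmpty()) {
                    // A repeated tag name overwrites the earlier entry.
                    map.put(element.getTagName(), value);
                } else {
                    findElementsWithValues(map, element);
                }
            }
        }
    }
}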
Output (after correcting the XML file above so that it can be parsed):
Found values: 19
{action=submit, protocol_version=1, reponse=Success, count=1, count2=2, count3=3, date=10/10/2010, page=SampleData, track=123123123, same=1.00, first_name=Jeffrey, SSID=1231231231, last_name=Chuckle, field1=123123123, field2=Sam E. Bonzella, field3=SOME VALUE, field4=SOME VALUE 2, field5=TEXT, field6=12312}
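Because the Map is keyed by tag name alone, two elements with the same name in different branches would overwrite each other. That does not happen in the sample, but if it could in your data, one option is to key each entry by its path instead. The sketch below is only an illustration under that assumption: the class ValuePathCollector, its methods collect and pathOf, and the shortened inline XML are invented here. It also uses the DOM wildcard getElementsByTagName("*"), which matches every element in document order, so no recursion is needed. Unlike the recursive version, it does not skip the children of an element that itself has a value.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ValuePathCollector {

    // Returns path -> value for every element that has a "value" attribute,
    // in document order.
    public static Map<String, String> collect(String xml) {
        Map<String, String> result = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // "*" is the DOM wildcard: it matches all elements in the document.
            NodeList all = doc.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                Element element = (Element) all.item(i);
                if (element.hasAttribute("value")) {
                    result.put(pathOf(element), element.getAttribute("value"));
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return result;
    }

    // Builds a slash-separated path by walking up through the parent elements.
    static String pathOf(Element element) {
        StringBuilder path = new StringBuilder(element.getTagName());
        for (Node parent = element.getParentNode();
             parent != null && parent.getNodeType() == Node.ELEMENT_NODE;
             parent = parent.getParentNode()) {
            path.insert(0, parent.getNodeName() + "/");
        }
        return path.toString();
    }

    public static void main(String[] args) {
        // Shortened, made-up document with the tag "count" in two branches.
        String xml = "<collection><action value=\"submit\"/><batch>"
                + "<sample><count value=\"1\"/></sample>"
                + "<sample_2><count value=\"9\"/></sample_2>"
                + "</batch></collection>";
        collect(xml).forEach((path, value) -> System.out.println(path + " = " + value));
    }
}
```

With the shortened document in main, both count elements survive under the keys collection/batch/sample/count and collection/batch/sample_2/count, where a Map keyed by tag name would keep only the last one.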