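Until recently, the tag structure of my XML files was fairly simple. But now I have an additional level of tags that themselves contain tags, and parsing the XML has become more complicated.
Here is an example of my new XML file (I changed the tag names to make it easier to understand):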
<SchoolRoster>
<Student>
<name>John</name>
<age>14</age>
<course>
<math>A</math>
<english>B</english>
</course>
<course>
<government>A+</government>
</course>
</Student>
<Student>
<name>Tom</name>
<age>13</age>
<course>
<gym>A</gym>
<geography>incomplete</geography>
</course>
</Student>
</SchoolRoster>
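The important feature of the XML above is that I can have multiple "course" elements, and inside each one I can have arbitrarily named tags as children. There can be any number of these children, and I want to read them into a HashMap of "name", "value" pairs.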
public static TreeMap getAllSchoolRosterInformation(String fileName) {
TreeMap SchoolRoster = new TreeMap();
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
File file = new File(fileName);
if (file.exists()) {
Document doc = db.parse(file);
Element docEle = doc.getDocumentElement();
NodeList studentList = docEle.getElementsByTagName("Student");
if (studentList != null && studentList.getLength() > 0) {
for (int i = 0; i < studentList.getLength(); i++) {
Student aStudent = new Student();
Node node = studentList.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE) {
Element e = (Element) node;
NodeList nodeList = e.getElementsByTagName("name");
aStudent.setName(nodeList.item(0).getChildNodes().item(0).getNodeValue());
nodeList = e.getElementsByTagName("age");
aStudent.setAge(Integer.parseInt(nodeList.item(0).getChildNodes().item(0).getNodeValue()));
nodeList = e.getElementsByTagName("course");
if (nodeList != null && nodeList.getLength() > 0) {
Course[] courses = new Course[nodeList.getLength()];
for (int j = 0; j < nodeList.getLength(); j++) {
Course singleCourse = new Course();
HashMap classGrades = new HashMap();
NodeList CourseNodeList = nodeList.item(j).getChildNodes();
for (int k = 0; k < CourseNodeList.getLength(); k++) {
if (CourseNodeList.item(k).getNodeType() == Node.ELEMENT_NODE && CourseNodeList != null) {
classGrades.put(CourseNodeList.item(k).getNodeName(), CourseNodeList.item(k).getNodeValue());
}
}
singleCourse.setRewards(classGrades);
courses[j] = singleCourse;
}
aStudent.setCourses(courses);
}
}
SchoolRoster.put(aStudent.getName(), aStudent);
}
}
} else {
System.exit(1);
}
} catch (Exception e) {
System.out.println(e);
}
return SchoolRoster;
}
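The problem I am running into is that instead of getting an "A" in "math", the students get null in "math". (If this post is too long, I can try to find some ways to shorten it.)
Answer 0 (score: 5)
If this were my project, I would avoid trying to dissect the XML data by hand and would instead let Java do it for me by using JAXB. The more I use this tool, the more I like it. I urge you to consider trying it, because if you do, all you need in order to turn XML into Java objects is the right annotations in your Java classes, followed by unmarshalling the XML. The resulting code is much simpler and therefore much less error-prone.
For example, the following code marshals the information into XML quite easily and cleanly: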
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;
@XmlRootElement
public class SchoolRoster {
@XmlElement(name = "student")
private List<Student> students = new ArrayList<Student>();
public SchoolRoster() {
}
public List<Student> getStudents() {
return students;
}
public void addStudent(Student student) {
students.add(student);
}
public static void main(String[] args) {
Student john = new Student("John", 14);
john.addCourse(new Course("math", "A"));
john.addCourse(new Course("english", "B"));
Student tom = new Student("Tom", 13);
tom.addCourse(new Course("gym", "A"));
tom.addCourse(new Course("geography", "incomplete"));
SchoolRoster roster = new SchoolRoster();
roster.addStudent(tom);
roster.addStudent(john);
try {
JAXBContext context = JAXBContext.newInstance(SchoolRoster.class);
Marshaller marshaller = context.createMarshaller();
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
String pathname = "MySchoolRoster.xml";
File rosterFile = new File(pathname );
marshaller.marshal(roster, rosterFile);
marshaller.marshal(roster, System.out);
} catch (JAXBException e) {
e.printStackTrace();
}
}
}
@XmlRootElement
@XmlType(propOrder = { "name", "age", "courses" })
class Student {
// TODO: completion left as an exercise for the original poster
}
@XmlRootElement
@XmlType(propOrder = { "name", "grade" })
class Course {
// TODO: completion left as an exercise for the original poster
}
This produces the following XML:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<schoolRoster>
<student>
<name>Tom</name>
<age>13</age>
<courses>
<course>
<name>gym</name>
<grade>A</grade>
</course>
<course>
<name>geography</name>
<grade>incomplete</grade>
</course>
</courses>
</student>
<student>
<name>John</name>
<age>14</age>
<courses>
<course>
<name>math</name>
<grade>A</grade>
</course>
<course>
<name>english</name>
<grade>B</grade>
</course>
</courses>
</student>
</schoolRoster>
Unmarshalling it into a SchoolRoster object populated with the data takes only a few lines of code.
private static void unmarshallTest() {
try {
JAXBContext context = JAXBContext.newInstance(SchoolRoster.class);
Unmarshaller unmarshaller = context.createUnmarshaller();
String pathname = "MySchoolRoster.xml"; // whatever the file name should be
File rosterFile = new File(pathname );
SchoolRoster roster = (SchoolRoster) unmarshaller.unmarshal(rosterFile);
System.out.println(roster);
} catch (JAXBException e) {
e.printStackTrace();
}
}
After adding toString() methods to my classes, the result is:
SchoolRoster
[students=
[Student [name=Tom, age=13, courses=[Course [name=gym, grade=A], Course [name=geography, grade=incomplete]]],
Student [name=John, age=14, courses=[Course [name=math, grade=A], Course [name=english, grade=B]]]]]
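The toString() methods themselves are not shown above. As a rough sketch only, plain classes like the following would print a student in that shape. The field names are taken from the propOrder values and the constructors from the calls in main(); everything else is an assumption, and the JAXB annotations and the no-argument constructors that JAXB requires are deliberately left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shape of Course; not the answerer's actual class.
class Course {
    private final String name;
    private final String grade;

    Course(String name, String grade) {
        this.name = name;
        this.grade = grade;
    }

    @Override
    public String toString() {
        return "Course [name=" + name + ", grade=" + grade + "]";
    }
}

// Assumed shape of Student; not the answerer's actual class.
class Student {
    private final String name;
    private final int age;
    private final List<Course> courses = new ArrayList<>();

    Student(String name, int age) {
        this.name = name;
        this.age = age;
    }

    void addCourse(Course course) {
        courses.add(course);
    }

    @Override
    public String toString() {
        // List.toString() supplies the surrounding [ ] and the ", " separators.
        return "Student [name=" + name + ", age=" + age + ", courses=" + courses + "]";
    }
}

// Hypothetical driver class for trying the sketch out.
class ToStringDemo {
    public static void main(String[] args) {
        Student tom = new Student("Tom", 13);
        tom.addCourse(new Course("gym", "A"));
        tom.addCourse(new Course("geography", "incomplete"));
        System.out.println(tom);
    }
}
```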
Answer 1 (score: 4)
for (int k = 0; k < CourseNodeList.getLength(); k++) {
if (CourseNodeList.item(k).getNodeType() == Node.ELEMENT_NODE && CourseNodeList != null) {
classGrades.put(CourseNodeList.item(k).getNodeName(),
CourseNodeList.item(k).getNodeValue());
}
}
You are calling getNodeValue() on an Element. According to the JDK API documentation, that returns null.
http://docs.oracle.com/javase/6/docs/api/org/w3c/dom/Node.html
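To see the behaviour in isolation, here is a small self-contained sketch. The class name, the helper method, and the inline XML string are made up for illustration; only the org.w3c.dom and javax.xml.parsers calls are standard JDK API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical demo class; the name is illustrative only.
class ElementValueDemo {

    // Parses an XML string into a DOM document, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // No whitespace between tags, so the first child of <course> is <math>.
        Document doc = parse("<course><math>A</math></course>");
        Element math = (Element) doc.getDocumentElement().getFirstChild();

        // An Element node has no value of its own.
        System.out.println(math.getNodeValue()); // prints null

        // The grade is stored in the Text node underneath the element.
        Node text = math.getFirstChild();
        System.out.println(text.getNodeType() == Node.TEXT_NODE); // prints true
        System.out.println(text.getNodeValue()); // prints A
    }
}
```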
You need to get the child Text node and call getNodeValue() on that. Here is a very quick and dirty way to do it:
classGrades.put(CourseNodeList.item(k).getNodeName(),
CourseNodeList.item(k).getChildNodes().item(0).getNodeValue());
Please don't use this in production code. It's ugly. But it should point you in the right direction.
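One possibly tidier alternative, offered as a sketch rather than as the definitive fix, is Node.getTextContent(), which returns the concatenated text of an element and so avoids indexing into the child list. The class and method names below are hypothetical; the sketch keeps one map per course element, as the code in the question does.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper; class and method names are not from the question.
class CourseGradeReader {

    // Parses an XML string into a DOM document, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns one tag-name -> text map per <course> element of the given student.
    static List<Map<String, String>> readCourses(Element student) {
        List<Map<String, String>> result = new ArrayList<>();
        NodeList courses = student.getElementsByTagName("course");
        for (int j = 0; j < courses.getLength(); j++) {
            Map<String, String> grades = new LinkedHashMap<>();
            NodeList children = courses.item(j).getChildNodes();
            for (int k = 0; k < children.getLength(); k++) {
                Node child = children.item(k);
                // Skip the whitespace-only Text nodes between the tags.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    grades.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            result.add(grades);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<SchoolRoster>\n"
                + "  <Student>\n"
                + "    <name>John</name>\n"
                + "    <age>14</age>\n"
                + "    <course>\n"
                + "      <math>A</math>\n"
                + "      <english>B</english>\n"
                + "    </course>\n"
                + "    <course>\n"
                + "      <government>A+</government>\n"
                + "    </course>\n"
                + "  </Student>\n"
                + "</SchoolRoster>";
        NodeList students = parse(xml).getElementsByTagName("Student");
        for (int i = 0; i < students.getLength(); i++) {
            Element student = (Element) students.item(i);
            String name = student.getElementsByTagName("name").item(0).getTextContent();
            System.out.println(name + " " + readCourses(student));
        }
    }
}
```

Note that getTextContent() also gathers text from any nested descendants, so this fits only while each grade tag holds plain text.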
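Answer 2 (score: 1)
Like @Hovercraft, I suggest using a library to handle the XML serialization. I have found XStream to perform very well and to be easy to use. http://x-stream.github.io/
For example: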
public static void saveStudentsXML(FileOutputStream file) throws Exception {
if (xstream == null)
initXstream();
xstream.toXML(proctorDAO.studentList, file);
file.close();
}
public static void initXstream() {
xstream = new XStream();
xstream.alias("student", Student.class);
xstream.useAttributeFor(Student.class, "lastName");
xstream.useAttributeFor(Student.class, "firstName");
xstream.useAttributeFor(Student.class, "id");
xstream.useAttributeFor(Student.class, "gradYear");
xstream.aliasAttribute(Student.class, "lastName", "last");
xstream.aliasAttribute(Student.class, "gradYear", "gc");
xstream.aliasAttribute(Student.class, "firstName", "first");
}
Example XML to demonstrate nested attributes:
<list>
<student first="Ralf" last="Adams" gc="2014" id="100">
<testingMods value="1" boolMod="2"/>
</student>
<student first="Mick" last="Agosti" gc="2014" id="102">
<testingMods value="1" boolMod="2"/>
</student>
<student first="Edmund" last="Baggio" gc="2013" id="302">
<testingMods value="1" boolMod="6"/>
</student>
</list>