I'm trying to parse an XML file similar to the following (it represents a TV guide)...
<?xml version="1.0" encoding="utf-8"?>
<channels>
<channel>
<name>BBC ONE</name>
<oid>10029</oid>
...
<programmes>
<programme>
<description>Blah blah blah</description>
<end_time>2013-02-04 01:40:00</end_time>
<episode>9</episode>
<genres>Entertainment</genres>
<oid>10583734</oid>
<season>8</season>
<start_time>2013-02-04 00:15:00</start_time>
<title>The Celebrity Apprentice USA</title>
</programme>
<programme>
..
</programme>
</programmes>
</channel>
<channel>
...
</channel>
</channels>
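I'm using two parsers - one for channels and another for programmes - but obviously this means I need to retrieve the entire <programmes>...</programmes> block in order to pass it to the 'programme' parser.
I tried the following in the 'channel' parser...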
public List<XMLTVChannel> parse() {
RootElement rootElement = new RootElement("channels");
final List<XMLTVChannel> channelsList = new ArrayList<XMLTVChannel>();
Element channelElement = rootElement.getChild("channel");
...
// Set the EndTextElementListeners for the <channel> child elements
channelElement.getChild(CHANNEL_OID).setEndTextElementListener(new EndTextElementListener() {
public void end(String body) {
currentChannel.setOid(body);
}
});
...
// HERE'S THE PROBLEM
channelElement.getChild("programmes").setEndTextElementListener(new EndTextElementListener() {
public void end(String body) {
// NEED TO INVOKE XMLTVProgrammeParser HERE
}
});
try {
Xml.parse(getInputStream(), Xml.Encoding.UTF_8, rootElement.getContentHandler());
} catch (Exception e) {
throw new RuntimeException(e);
}
return channelsList;
}
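OK, so I googled it and I know exactly what the problem is - the String body parameter passed to the end(...) method should contain only text, whereas here it is a mix of elements and text.
I've read a few similar Stack Overflow questions and articles which suggest I need to define my own ContentHandler, but I haven't found anything resembling what I'm trying to do. Is a custom ContentHandler my only option, or is there another way?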
Answer 0 (score: 3)
Do you mean you want this output:
BBC ONE
10029
------------------------
The Celebrity Apprentice USA
2013-02-04 00:15:00 - 2013-02-04 01:40:00
Entertainment
Season : 8 / Episode : 9
Description:
Blah blah blah
10583734
**********************
The Celebrity Apprentice USA
2013-02-04 01:45:00 - 2013-02-04 02:25:00
Entertainment
Season : 8 / Episode : 10
Description:
Blah blah blah
10583735
**********************
//////////////////////////
BBC TWO
10030
------------------------
American Dad
2013-02-04 00:30:00 - 2013-02-04 01:25:00
Cartoon
Season : 14 / Episode : 1
Description:
Blah blah blah
10583734
**********************
American Dad
2013-02-04 01:30:00 - 2013-02-04 02:15:00
Cartoon
Season : 14 / Episode : 2
Description:
Blah blah blah
10583735
**********************
//////////////////////////
I have modified your XML file:
<?xml version="1.0" encoding="utf-8"?>
<channels>
<channel>
<name>BBC ONE</name>
<oid>10029</oid>
<programmes>
<programme>
<description>Blah blah blah</description>
<end_time>2013-02-04 01:40:00</end_time>
<episode>9</episode>
<genres>Entertainment</genres>
<oid>10583734</oid>
<season>8</season>
<start_time>2013-02-04 00:15:00</start_time>
<title>The Celebrity Apprentice USA</title>
</programme>
<programme>
<description>Blah blah blah</description>
<end_time>2013-02-04 02:25:00</end_time>
<episode>10</episode>
<genres>Entertainment</genres>
<oid>10583735</oid>
<season>8</season>
<start_time>2013-02-04 01:45:00</start_time>
<title>The Celebrity Apprentice USA</title>
</programme>
</programmes>
</channel>
<channel>
<name>BBC TWO</name>
<oid>10030</oid>
<programmes>
<programme>
<description>Blah blah blah</description>
<end_time>2013-02-04 01:25:00</end_time>
<episode>1</episode>
<genres>Cartoon</genres>
<oid>10583734</oid>
<season>14</season>
<start_time>2013-02-04 00:30:00</start_time>
<title>American Dad</title>
</programme>
<programme>
<description>Blah blah blah</description>
<end_time>2013-02-04 02:15:00</end_time>
<episode>2</episode>
<genres>Cartoon</genres>
<oid>10583735</oid>
<season>14</season>
<start_time>2013-02-04 01:30:00</start_time>
<title>American Dad</title>
</programme>
</programmes>
</channel>
</channels>
Java classes:
Channel
import java.util.ArrayList;

public class Channel {
private String name;
private String oid;
private ArrayList<Programme> alProgrammes;
public Channel(){}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getOid() {
return oid;
}
public void setOid(String oid) {
this.oid = oid;
}
public ArrayList<Programme> getAlProgrammes() {
return alProgrammes;
}
public void setAlProgrammes(ArrayList<Programme> alProgrammes) {
this.alProgrammes = alProgrammes;
}
}
Programme
public class Programme {
private String description;
private String end_time;
private String episode;
private String genres;
private String oid;
private String season;
private String start_time;
private String title;
public Programme() {
}
//Getters / Setters
public String getDescription() {
return description;
}
public void setDescription(String description) {
this.description = description;
}
public String getEnd_time() {
return end_time;
}
public void setEnd_time(String end_time) {
this.end_time = end_time;
}
public String getEpisode() {
return episode;
}
public void setEpisode(String episode) {
this.episode = episode;
}
public String getGenres() {
return genres;
}
public void setGenres(String genres) {
this.genres = genres;
}
public String getOid() {
return oid;
}
public void setOid(String oid) {
this.oid = oid;
}
public String getSeason() {
return season;
}
public void setSeason(String season) {
this.season = season;
}
public String getStart_time() {
return start_time;
}
public void setStart_time(String start_time) {
this.start_time = start_time;
}
public String getTitle() {
return title;
}
public void setTitle(String title) {
this.title = title;
}
}
XMLManager
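import java.io.File;
import java.io.IOException;
import java.util.ArrayList;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public final class XMLManager {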
public static ArrayList<Channel> getAlChannels(){
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = null;
Document doc = null;
ArrayList<Channel> alChannels = new ArrayList<>();
try {
db = dbf.newDocumentBuilder();
doc = db.parse(new File("D:\\Loic_Workspace\\Test2\\res\\test.xml"));
NodeList ndListChannels = doc.getElementsByTagName("channel");
Integer channelsCount = ndListChannels.getLength();
NodeList ndListChannel = null;
Integer ndListChannelLength = null;
Channel channel = null;
NodeList ndListProgrammes = null;
for(int i=0;i<channelsCount;i++){
ndListChannel = ndListChannels.item(i).getChildNodes();
ndListChannelLength = ndListChannel.getLength();
channel = new Channel();
for(int j=0;j<ndListChannelLength;j++){
Node currentNode = ndListChannel.item(j);
String currentNodeName = currentNode.getNodeName();
String value = currentNode.getTextContent();
if(currentNodeName.equals("name")){
channel.setName(value);
}
if(currentNodeName.equals("oid")){
channel.setOid(value);
}
if(currentNodeName.equals("programmes")){
ndListProgrammes = currentNode.getChildNodes();
ArrayList<Programme> alProgrammes = new ArrayList<>();
for(int k=0;k<ndListProgrammes.getLength();k++){
Node ndProgrammes = ndListProgrammes.item(k);
if(ndProgrammes.hasChildNodes()){
NodeList ndListProgramme = ndProgrammes.getChildNodes();
Integer ndListProgrammeLength = ndListProgramme.getLength();
Programme programme = new Programme();
for(int l=0;l<ndListProgrammeLength;l++){
Node ndProgramme = ndListProgramme.item(l);
String nodeProgrameName = ndProgramme.getNodeName();
String nodeProgrameValue = ndProgramme.getTextContent();
if(nodeProgrameName.equals("description")){
programme.setDescription(nodeProgrameValue);
}
if(nodeProgrameName.equals("end_time")){
programme.setEnd_time(nodeProgrameValue);
}
if(nodeProgrameName.equals("episode")){
programme.setEpisode(nodeProgrameValue);
}
if(nodeProgrameName.equals("genres")){
programme.setGenres(nodeProgrameValue);
}
if(nodeProgrameName.equals("oid")){
programme.setOid(nodeProgrameValue);
}
if(nodeProgrameName.equals("season")){
programme.setSeason(nodeProgrameValue);
}
if(nodeProgrameName.equals("start_time")){
programme.setStart_time(nodeProgrameValue);
}
if(nodeProgrameName.equals("title")){
programme.setTitle(nodeProgrameValue);
}
}
alProgrammes.add(programme);
}
}
channel.setAlProgrammes(alProgrammes);
}
}
alChannels.add(channel);
}
} catch (ParserConfigurationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (SAXException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return alChannels;
}
}
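The nested loops in XMLManager can be flattened somewhat. The following is only a minimal sketch, assuming the same XML layout as above; the class name DomGuideSketch and the helpers childText and summarize are hypothetical, and the XML is read from an inline string rather than a file so that the example is self-contained. It also shows one way to keep a channel's oid apart from a programme's oid: only direct children of the element are inspected.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomGuideSketch {

    // Hypothetical helper: text of the first DIRECT child element named `tag`, or null.
    static String childText(Element parent, String tag) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals(tag)) {
                return n.getTextContent();
            }
        }
        return null;
    }

    // One line per programme: "channel name | channel oid | title | programme oid".
    static List<String> summarize(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        List<String> lines = new ArrayList<>();
        NodeList channels = doc.getElementsByTagName("channel");
        for (int i = 0; i < channels.getLength(); i++) {
            Element channel = (Element) channels.item(i);
            // getElementsByTagName searches all descendants, so this finds
            // every <programme> below this <channel>.
            NodeList programmes = channel.getElementsByTagName("programme");
            for (int j = 0; j < programmes.getLength(); j++) {
                Element p = (Element) programmes.item(j);
                lines.add(childText(channel, "name") + " | " + childText(channel, "oid")
                        + " | " + childText(p, "title") + " | " + childText(p, "oid"));
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<channels><channel><name>BBC ONE</name><oid>10029</oid>"
                + "<programmes><programme><oid>10583734</oid>"
                + "<title>The Celebrity Apprentice USA</title></programme></programmes>"
                + "</channel></channels>";
        for (String line : summarize(xml)) {
            System.out.println(line);
        }
    }
}
```

Because childText only looks at direct children, childText(channel, "oid") never picks up an oid that belongs to a nested programme.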
Main
import java.util.ArrayList;

public class MyMain {
/**
* @param args
*/
public static void main(String[] args) {
ArrayList<Channel> alChannels = XMLManager.getAlChannels();
for(Channel c:alChannels){
System.out.println(c.getName());
System.out.println(c.getOid());
System.out.println("------------------------");
for(Programme p:c.getAlProgrammes()){
System.out.println(p.getTitle());
System.out.println(p.getStart_time()+" - "+p.getEnd_time());
System.out.println(p.getGenres());
System.out.println("Season : "+p.getSeason()+" / Episode : "+p.getEpisode());
System.out.println("Description:\n"+p.getDescription());
System.out.println(p.getOid());
System.out.println("**********************");
}
System.out.println("//////////////////////////");
}
}
}
Here is an example of how I would do this with SAX.
Important: I kept the Programme and Channel classes.
ChannelsHandler
import java.util.ArrayList;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ChannelsHandler extends DefaultHandler {
    private ArrayList<Channel> tvGuide;
    private Channel channel;
    private ArrayList<Programme> alProgrammes;
    private Programme programme;
    // Text of the element currently being read. characters() may be called
    // several times for one element, so the text is accumulated here.
    private final StringBuilder reading = new StringBuilder();

    public ChannelsHandler() {
        super();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        reading.setLength(0); // start collecting text for the new element
        if (qName.equals("channels")) {
            tvGuide = new ArrayList<>();
        } else if (qName.equals("channel")) {
            channel = new Channel();
        } else if (qName.equals("programmes")) {
            alProgrammes = new ArrayList<>();
        } else if (qName.equals("programme")) {
            programme = new Programme();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        reading.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        String text = reading.toString();
        if (qName.equals("channel")) {
            tvGuide.add(channel);
            channel = null;
        } else if (qName.equals("name")) {
            channel.setName(text);
        } else if (qName.equals("programmes")) {
            channel.setAlProgrammes(alProgrammes);
            alProgrammes = new ArrayList<>();
        } else if (qName.equals("programme")) {
            alProgrammes.add(programme);
            programme = null;
        } else if (qName.equals("description")) {
            programme.setDescription(text);
        } else if (qName.equals("end_time")) {
            programme.setEnd_time(text);
        } else if (qName.equals("episode")) {
            programme.setEpisode(text);
        } else if (qName.equals("genres")) {
            programme.setGenres(text);
        } else if (qName.equals("season")) {
            programme.setSeason(text);
        } else if (qName.equals("start_time")) {
            programme.setStart_time(text);
        } else if (qName.equals("title")) {
            programme.setTitle(text);
        }
    }

    public ArrayList<Channel> getTVGuide() {
        return tvGuide;
    }
}
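The handler above does not read the oid elements, which occur under both channel and programme. One possible way to tell them apart, sketched here under assumptions rather than taken from the original answer, is to keep a stack of the currently open elements and look at the parent when an oid ends. The class name OidAwareHandler, its two lists, and the parse helper are hypothetical, and the XML comes from an inline string to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class OidAwareHandler extends DefaultHandler {
    final List<String> channelOids = new ArrayList<>();
    final List<String> programmeOids = new ArrayList<>();

    // Names of the elements that are currently open, innermost on top.
    private final Deque<String> open = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        open.push(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();                  // remove the element that just ended
        String parent = open.peek(); // its parent, or null for the root
        if (qName.equals("oid")) {
            if ("channel".equals(parent)) {
                channelOids.add(text.toString());
            } else if ("programme".equals(parent)) {
                programmeOids.add(text.toString());
            }
        }
    }

    // Hypothetical convenience method: parse an XML string with this handler.
    static OidAwareHandler parse(String xml) throws Exception {
        OidAwareHandler handler = new OidAwareHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        OidAwareHandler h = parse("<channels><channel><oid>10029</oid><programmes>"
                + "<programme><oid>10583734</oid></programme>"
                + "</programmes></channel></channels>");
        System.out.println("channel oids: " + h.channelOids);
        System.out.println("programme oids: " + h.programmeOids);
    }
}
```

The same parent check could be folded into ChannelsHandler to call channel.setOid(...) or programme.setOid(...) as appropriate.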
My new main
public static void main(String[] args) {
SAXParserFactory factory = SAXParserFactory.newInstance();
try {
SAXParser parser = factory.newSAXParser();
File file = new File("D:\\Loic_Workspace\\TestSAX\\res\\test.xml");
ChannelsHandler handler = new ChannelsHandler();
parser.parse(file,handler);
List<Channel> tvGuide = handler.getTVGuide();
for(Channel c:tvGuide){
System.out.println(c.getName());
System.out.println("------------------------");
for(Programme p:c.getAlProgrammes()){
System.out.println(p.getTitle());
System.out.println(p.getStart_time()+" - "+p.getEnd_time());
System.out.println(p.getGenres());
System.out.println("Season : "+p.getSeason()+" / Episode : "+p.getEpisode());
System.out.println("Description:\n"+p.getDescription());
System.out.println("**********************");
}
System.out.println("//////////////////////////");
}
} catch (ParserConfigurationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (SAXException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
My console output:
BBC ONE
------------------------
The Celebrity Apprentice USA
2013-02-04 00:15:00 - 2013-02-04 01:40:00
Entertainment
Season : 8 / Episode : 9
Description:
Blah blah blah
**********************
The Celebrity Apprentice USA
2013-02-04 01:45:00 - 2013-02-04 02:25:00
Entertainment
Season : 8 / Episode : 10
Description:
Blah blah blah
**********************
//////////////////////////
BBC TWO
------------------------
American Dad
2013-02-04 00:30:00 - 2013-02-04 01:25:00
Cartoon
Season : 14 / Episode : 1
Description:
Blah blah blah
**********************
American Dad
2013-02-04 01:30:00 - 2013-02-04 02:15:00
Cartoon
Season : 14 / Episode : 2
Description:
Blah blah blah
**********************
//////////////////////////
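This is my first time using SAX. Maybe you can find something more efficient, but it works :-) In my update I did not handle the duplicate oid tags of programmes and channels.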
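As for the question's original idea of a channel parser that hands the programmes over to a second parser: with a pull parser the two can simply share one reader, so there is no need to extract the <programmes> block as text first. The following is a rough sketch using StAX (javax.xml.stream) from Java SE, not something from the original answer; it targets desktop Java, whereas Android offers a comparable pull API in XmlPullParser. The class and method names are hypothetical, and only name and title are read to keep it short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxGuideSketch {

    // "Channel parser": reads channel data and delegates each <programmes> block.
    static List<String> parseChannels(String xml) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> out = new ArrayList<>();
        String name = null;
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT) {
                String tag = r.getLocalName();
                if (tag.equals("name")) {
                    name = r.getElementText();
                } else if (tag.equals("programmes")) {
                    // Hand the same reader to the programme parser.
                    for (String title : parseProgrammes(r)) {
                        out.add(name + ": " + title);
                    }
                }
            }
        }
        r.close();
        return out;
    }

    // "Programme parser": consumes events up to and including </programmes>.
    static List<String> parseProgrammes(XMLStreamReader r) throws XMLStreamException {
        List<String> titles = new ArrayList<>();
        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT
                    && r.getLocalName().equals("title")) {
                titles.add(r.getElementText());
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && r.getLocalName().equals("programmes")) {
                break;
            }
        }
        return titles;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<channels><channel><name>BBC ONE</name><programmes>"
                + "<programme><title>The Celebrity Apprentice USA</title></programme>"
                + "</programmes></channel></channels>";
        for (String line : parseChannels(xml)) {
            System.out.println(line);
        }
    }
}
```

When parseProgrammes returns, the reader sits on the closing </programmes> tag, so the channel loop continues with the next event as if nothing had happened.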