SAXParser fails to retrieve particular data

Time: 2012-05-22 11:27:11

Tags: java android xml-parsing saxparser

I am trying to parse an XML file that looks like this:

<?xml version="1.0" encoding="utf-8"?>
<downloaddata>
    <downloaditem itemid="1">
    <title>Abdul kalaam Inspirational Talk</title>
    <downloadlink>http://o-o.preferred.spectranet-blr1.v8.lscache4.c.youtube.com/videoplayback?upn=Rxb-DvFeBTE&sparams=cp%2Cid%2Cip%2Cipbits%2Citag%2Cratebypass%2Csource%2Cupn%2Cexpire&fexp=906512%2C907217%2C907335%2C921602%2C919306%2C919316%2C904455%2C919324%2C904452&itag=18&ip=203.0.0.0&signature=96D7FA17DF684B4C2CD30F12251F3263C83EC443.05F62E98E1059BB44459ABF319F50DC4B7E6D90E&sver=3&ratebypass=yes&source=youtube&expire=1337691481&key=yt1&ipbits=8&cp=U0hSTFZUT19NS0NOMl9OTlNFOmlwaTFSSGFfd3NK&id=67ffa1d50864f57d&title=Abdul%20Kalam%20inspirational%20Speech%20on%20Leadership%20and%20Motivation</downloadlink>
    </downloaditem>
</downloaddata>

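With the data in the downloadlink tag as shown above, parsing seems to fail. I tried replacing the data with something else of the same length, and then it works.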

Below is the Android code I am using.

import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import android.os.Environment;

public class Wilxmlparser extends DefaultHandler {

    List<VideoDetails> downloadList;
    private String tempVal;
    private VideoDetails tempVidDet;

    public Wilxmlparser() {

    }

    public void parseXML() {

        //get a factory
        SAXParserFactory spf = SAXParserFactory.newInstance();
        try {

            //get a new instance of parser
            SAXParser sp = spf.newSAXParser();

            File downloadInfo = new File(Environment.getExternalStorageDirectory() + "/watchitlater/config/downloadinfo1.xml");
            //parse the file and also register this class for call backs
            sp.parse(downloadInfo, this);

        } catch (SAXException se) {
            se.printStackTrace();
        } catch (ParserConfigurationException pce) {
            pce.printStackTrace();
        } catch (IOException ie) {
            ie.printStackTrace();
        }
    }


    //Event Handlers
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        //reset
        tempVal = "";
        if (qName.equalsIgnoreCase("downloaditem")) {
            tempVidDet = new VideoDetails();
            tempVidDet.setItemId(Integer.parseInt(attributes.getValue("itemid")));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        tempVal = new String(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {

        if (qName.equalsIgnoreCase("downloaditem")) {
            downloadList.add(tempVidDet);
        } else if (qName.equalsIgnoreCase("title")) {
            tempVidDet.setTitle(tempVal);
        } else if (qName.equalsIgnoreCase("downloadlink")) {
            tempVidDet.setDownloadLink(tempVal);
        }
    }
}

The code above never gets an endElement callback for the XML file above. But if the XML looks like either of the following:

<?xml version="1.0" encoding="utf-8"?>
<downloaddata>
    <downloaditem itemid="1">
        <title>Abdul kalaam Inspirational Talk</title>
        <downloadlink>http://www.gmail.com/hello/world/sdfsdf%20.@@%!@#    ($dwe</downloadlink>
    </downloaditem>
</downloaddata>

<?xml version="1.0" encoding="utf-8"?>
<downloaddata>
    <downloaditem itemid="1">
        <title>Abdul kalaam Inspirational Talk</title>
            <downloadlink>httphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttpa</downloadlink>
    </downloaditem>
</downloaddata>

then it works fine. What am I doing wrong?

2 answers:

Answer 0: (score: 1)

Answer 1: (score: 1)

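The reason your parser fails on the problematic XML is that it is not well-formed XML. The data causing the problem contains characters that must be escaped (here, the unescaped & characters in the URL's query string). For details, see the "Characters and escaping" section of the Wikipedia article on XML.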

It would be best to correct this in whatever generates the XML. The simplest fix is to wrap the offending text in a CDATA section.
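As a rough illustration of both repairs on the generating side, the sketch below checks well-formedness with the standard SAX parser and shows a CDATA wrapper and a plain text escaper. The class name EscapeDemo, its helper methods, and the shortened example.com URL are all made up for this illustration; none of them come from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class EscapeDemo {

    // Returns true if the SAX parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Wraps text in a CDATA section. A CDATA section cannot contain "]]>",
    // so any occurrence is split across two adjacent sections.
    static String wrapInCdata(String text) {
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    // Escapes the characters that are special in element content.
    // '&' must be replaced first so the other entities are not re-escaped.
    static String escapeText(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Shortened stand-in for the long URL in the question (assumption).
        String url = "http://example.com/videoplayback?upn=abc&itag=18&sver=3";

        // Raw '&' in element content: not well-formed.
        System.out.println(isWellFormed("<downloadlink>" + url + "</downloadlink>"));
        // Same text inside a CDATA section: well-formed.
        System.out.println(isWellFormed("<downloadlink>" + wrapInCdata(url) + "</downloadlink>"));
        // Same text with '&' written as an entity: well-formed.
        System.out.println(isWellFormed("<downloadlink>" + escapeText(url) + "</downloadlink>"));
    }
}
```

Either form yields the original URL text once parsed. If the file is written by an XML library rather than by string concatenation, the library would normally do this escaping itself.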

Once the data is fixed, you may also run into problems caused by a misunderstanding in the parsing code.

@Override
public void characters(char[] ch, int start, int length) throws SAXException {
   tempVal = new String(ch,start,length);
}

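does not always capture all the characters between the start and end tags, because the contract of this method allows it to be called more than once for a single run of text. Instead of simply copying into a String, you need to append to a string buffer that is initialized in the startElement method and consumed in the endElement method.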
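A minimal sketch of that accumulation pattern follows, assuming a stripped-down handler that only collects downloadlink values. The class name LinkHandler, its links list, and the parse helper are hypothetical; the VideoDetails class from the question is left out so the example stands on its own.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, for illustration only.
class LinkHandler extends DefaultHandler {

    final List<String> links = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Start collecting text afresh for each element.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element: append, never overwrite.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only now is the element's text complete.
        if (qName.equalsIgnoreCase("downloadlink")) {
            links.add(text.toString().trim());
        }
    }

    // Parses an XML string and returns the collected downloadlink values.
    static List<String> parse(String xml) {
        try {
            LinkHandler handler = new LinkHandler();
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.links;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<downloaddata><downloaditem itemid=\"1\">"
                + "<title>Abdul kalaam Inspirational Talk</title>"
                + "<downloadlink>http://example.com/v?a=1&amp;b=2</downloadlink>"
                + "</downloaditem></downloaddata>";
        System.out.println(parse(xml)); // the full link, with &amp; decoded to &
    }
}
```

Parsers often deliver the text on either side of an entity reference such as &amp; in separate characters calls, which is the kind of input where overwriting instead of appending would keep only the last fragment.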

For more on this parsing problem with the characters method, see my answer to another SO question.