Parsing an XML file in Java when a node's inner text is HTML

Time: 2013-08-19 21:32:47

Tags: java html xml saxparser

I am currently using SAXParser with my own handler. It parses every node value correctly except for nodes with type="html".

My characters method looks like this:

public void characters(char ch[], int start, int length) throws SAXException {
    if (content) {
        String tmp = new String(ch, start, length);
        System.out.println("Content : " + tmp);
        content = false;
    }
}

The node in question has the following format, and my output is always just a bunch of \n characters and nothing else.

   <content type="html">

    &lt;img alt="" src="http://cdn2.sbnation.com/entry_photo_images/8767829/stranger-bad-robot-screencap_large.png" /&gt;


     &lt;p&gt;Bad Robot, the production company founded by geek culture hitmaker J.J. Abrams (&lt;i&gt;Lost&lt;/i&gt;, &lt;i&gt;Fringe&lt;/i&gt;, &lt;i&gt;Star Trek: Into Darkness&lt;/i&gt;, &lt;i&gt;Alias&lt;/i&gt;,&amp;nbsp;etc.), has released a&amp;nbsp;&lt;a href="http://youtu.be/FWaAZCaQXdo" target="_blank"&gt;mysterious new trailer&lt;/a&gt; titled "Stranger." The creepy and inscrutable video spot, posted by the official Bad Robot Twitter account this afternoon, features a starry sky; a long-haired, rope-bound man wandering along a desolate monochromatic shore line; and your garden variety, horrifying stitched-mouth person coming into focus. "Men are erased and reborn," intones a narrator that sounds a little like Leonard Nimoy.&lt;/p&gt;
     &lt;p&gt;&lt;/p&gt;



    </content>

2 answers:

Answer 0 (score: 1)

You should use a StringBuffer to accumulate the content, as described in these threads:

SAX parsing and special characters

Unable to read special characters from xml using java
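To make the buffering suggestion concrete, here is a minimal sketch that adapts the question's flag-based handler. The class name `ContentBufferSketch`, the helper `extractHtml`, and the sample `<entry>` document are hypothetical and only serve as illustration. It uses a `StringBuilder` (a `StringBuffer` would work the same way) and assumes that the target element is named `content`, carries `type="html"`, and is parsed without namespace awareness.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ContentBufferSketch {

    /** Hypothetical handler: collects the text of <content type="html"> only. */
    static class ContentHandler extends DefaultHandler {
        private final StringBuilder buffer = new StringBuilder();
        private boolean inContent = false; // plays the role of the question's "content" flag
        private String result = null;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("content".equals(qName) && "html".equals(attributes.getValue("type"))) {
                inContent = true;
                buffer.setLength(0); // start with an empty buffer
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Append every chunk; the flag is NOT reset here.
            if (inContent) {
                buffer.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (inContent && "content".equals(qName)) {
                inContent = false; // reset the flag only once the element is closed
                result = buffer.toString().trim();
            }
        }

        String getResult() {
            return result;
        }
    }

    /** Returns the trimmed text of the last <content type="html"> element, or null if none. */
    static String extractHtml(String xml) {
        ContentHandler handler = new ContentHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getResult();
    }

    public static void main(String[] args) {
        String xml = "<entry><title>Stranger</title><content type=\"html\">\n"
                + "  &lt;p&gt;Hello&amp;nbsp;world&lt;/p&gt;\n"
                + "</content></entry>";
        System.out.println(extractHtml(xml));
    }
}
```

The key difference from the handler in the question is that the flag is cleared in `endElement` rather than in `characters`, so every chunk of text belonging to the element reaches the buffer.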

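Answer 1 (score: 1)

You are probably assuming that the characters callback fires exactly once between the startElement and endElement callbacks. In fact, it can be called several times.

Because you use the content boolean member to decide whether to print anything, and you set that same member to false inside the characters callback, your condition can only be satisfied once until you reset content (it is not clear where you do that). The first chunk the parser delivers for this node is most likely just the leading whitespace, which is why you only see \n.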
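The chunking behaviour can be observed with a small experiment. The following is a rough sketch (the class name `CharactersChunkDemo` and the helper `collectChunks` are made up for illustration) that records every chunk passed to `characters`. How many chunks arrive depends on the parser implementation, so the sketch makes no claim about the count; the only thing that can be relied on is that the chunks, concatenated in order, form the element's complete text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CharactersChunkDemo {

    /** Parses the given XML and returns each chunk delivered to characters(), in order. */
    static List<String> collectChunks(String xml) {
        List<String> chunks = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Copy the chunk: the parser may reuse the char array afterwards.
                chunks.add(new String(ch, start, length));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> chunks = collectChunks(
                "<content type=\"html\">\n&lt;p&gt;Hello&lt;/p&gt;\n</content>");
        System.out.println("characters() calls: " + chunks.size());
        for (String chunk : chunks) {
            // Make newlines visible so whitespace-only chunks are easy to spot.
            System.out.println("[" + chunk.replace("\n", "\\n") + "]");
        }
        System.out.println("joined: " + String.join("", chunks).trim());
    }
}
```

If the first chunk printed is only `\n`, that matches the symptom in the question: a handler that stops listening after the first call never sees the rest of the text.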

Here is an example that handles your XML correctly (assuming non-mixed content and the Java programming language):

import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class TestSaxParser {

    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
        String xml = 
            "<content type=\"html\">\n" +
            "\n" +
            "    &lt;img alt=\"\" src=\"http://cdn2.sbnation.com/entry_photo_images/8767829/stranger-bad-robot-screencap_large.png\" /&gt;\n" +
            "\n" +
            "\n" +
            "     &lt;p&gt;Bad Robot, the production company founded by geek culture hitmaker J.J. Abrams (&lt;i&gt;Lost&lt;/i&gt;, &lt;i&gt;Fringe&lt;/i&gt;, &lt;i&gt;Star Trek: Into Darkness&lt;/i&gt;, &lt;i&gt;Alias&lt;/i&gt;,&amp;nbsp;etc.), has released a&amp;nbsp;&lt;a href=\"http://youtu.be/FWaAZCaQXdo\" target=\"_blank\"&gt;mysterious new trailer&lt;/a&gt; titled \"Stranger.\" The creepy and inscrutable video spot, posted by the official Bad Robot Twitter account this afternoon, features a starry sky; a long-haired, rope-bound man wandering along a desolate monochromatic shore line; and your garden variety, horrifying stitched-mouth person coming into focus. \"Men are erased and reborn,\" intones a narrator that sounds a little like Leonard Nimoy.&lt;/p&gt;\n" +
            "     &lt;p&gt;&lt;/p&gt;\n" +
            "\n" +
            "\n" +
            "\n" +
            "    </content>";

        MySaxHandler handler = new MySaxHandler();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();        
        InputSource source = new InputSource(new StringReader(xml));
        parser.parse(source, handler);
    }

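    private static class MySaxHandler extends DefaultHandler {
        // Accumulates the text of the current element across characters() calls.
        private StringBuilder content = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            // A new element starts: discard any text collected so far.
            content.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            // May be called several times per element, so append rather than print.
            content.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            // The element is complete: the buffer now holds its full text.
            System.out.println(content.toString());
        }

    }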
}

Output:

    <img alt="" src="http://cdn2.sbnation.com/entry_photo_images/8767829/stranger-bad-robot-screencap_large.png" />


     <p>Bad Robot, the production company founded by geek culture hitmaker J.J. Abrams (<i>Lost</i>, <i>Fringe</i>, <i>Star Trek: Into Darkness</i>, <i>Alias</i>,&nbsp;etc.), has released a&nbsp;<a href="http://youtu.be/FWaAZCaQXdo" target="_blank">mysterious new trailer</a> titled "Stranger." The creepy and inscrutable video spot, posted by the official Bad Robot Twitter account this afternoon, features a starry sky; a long-haired, rope-bound man wandering along a desolate monochromatic shore line; and your garden variety, horrifying stitched-mouth person coming into focus. "Men are erased and reborn," intones a narrator that sounds a little like Leonard Nimoy.</p>
     <p></p>