Question

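I am trying to parse XML from an input stream using a SAX parser. The input stream delivers XML continuously from a socket, and '\n' is used as the separator between the XML chunks. This is what the XML looks like: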

<?xml version="1.0" encoding="UTF-8"?>
<response processor="header" callback="comheader">
    <properties>
        <timezone>Asia%2FBeirut</timezone>
        <rawoffset>7200000</rawoffset>
        <to_date>1319256000000</to_date>
        <dstrawoffset>10800000</dstrawoffset>
    </properties>
</response>
\n
<event type="progress" time="1317788744214">
    <param key="callback">todayactions</param>
    <param key="percent">10</param>
    <param key="msg">MAPPING</param>
</event>
<event type="progress" time="1317788744216">
    <param key="callback">todayactions</param>
    <param key="percent">20</param><param key="msg">MAPPING</param>
</event>
\n
<?xml version="1.0" encoding="UTF-8"?>
<response processor="header" callback="comheader">
    <properties>
        <timezone>Asia%2FBeirut</timezone>
        <rawoffset>7200000</rawoffset>
        <to_date>1319256000000</to_date>
        <dstrawoffset>10800000</dstrawoffset>
    </properties>
</response>

This works great for our iPhone project, because there we read characters up to the \n, store them in a string, and use a DOM parser.

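But when I tried to do the same for Android, a string was not an option because it gave us an OutOfMemory exception. So we passed the input stream directly to the SAX parser. That works up to the \n character, after which it gives us this exception: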

org.apache.harmony.xml.ExpatParser$ParseException: At line 2, column 0: junk after document element

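So I tried to filter the input stream to skip the '\n' characters. I created a FilterStreamReader, but I had no success; it seems my read function does not do the job. Here is my code.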

public class FilterStreamReader extends InputStreamReader {
    public FilterStreamReader(InputStream in, String enc)
            throws UnsupportedEncodingException {
        super(in, enc);
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int read = super.read(cbuf, off, len);
        Log.e("Reader",Character.toString((char)read));
        if (read == -1) {
            return -1;
        }

        int pos = off - 1;
        for (int readPos = off; readPos < off + read; readPos++) {
            if (read == '\n') {
                pos++;
            } else {                
                continue;
            }
            if (pos < readPos) {
                cbuf[pos] = cbuf[readPos];
            }
        }
        return pos - off + 1;
    }
}

Can someone help me filter the \n out of the input stream?

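Edit: Based on what Graham said, I was able to parse the whole data by removing all the doc type declarations and adding my own start and end tags. So I am not sure my problem was only about filtering '\n'. How do you parse XML like this?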

Answer 1

The problem is not the \n. It is that after the first </response> tag, the parser considers the document complete.
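As a side note, the read override in the question tests `read == '\n'`, which compares the returned character count with a newline instead of inspecting each buffered character, so it would not strip anything reliably. Below is a minimal sketch of a corrected filter. The class name `NewlineStrippingReader` is made up for illustration, and it extends `FilterReader` rather than `InputStreamReader` so that it can wrap any `Reader`. On its own it would still not make the parse error go away, for the reason given above.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical corrected variant of the question's filter: drops every '\n'. */
class NewlineStrippingReader extends FilterReader {

    NewlineStrippingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();          // skip over any run of newlines
        } while (c == '\n');
        return c;                      // a real character, or -1 at end of stream
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int count = super.read(cbuf, off, len);
            if (count == -1) {
                return -1;
            }
            // Compact the buffer in place, keeping everything except '\n'.
            int pos = off;
            for (int readPos = off; readPos < off + count; readPos++) {
                if (cbuf[readPos] != '\n') {
                    cbuf[pos++] = cbuf[readPos];
                }
            }
            if (pos > off) {
                return pos - off;
            }
            // The chunk held only newlines: read again instead of returning 0.
        }
    }
}
```

Note that this also removes newlines inside element text, which may or may not matter for your data.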

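This data is not valid XML. You should wrap everything in a single top-level node. Also, you cannot have a second <?xml version="1.0" encoding="UTF-8"?> declaration partway through the document.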
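If you cannot change what the server sends, one way to do both things on the client without buffering the data in a string is a small wrapping reader. The following is only a sketch under assumptions: `SingleRootReader`, the `<stream>` wrapper element, and the `elementNames` helper are names invented for this example, and it assumes the only processing instructions to drop are `<?xml ...?>` declarations. It emits an opening tag, passes the socket data through while skipping every declaration, and emits the closing tag at end of stream.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: presents concatenated XML fragments as one document. */
class SingleRootReader extends Reader {
    private static final String PREFIX = "<stream>";   // made-up wrapper element
    private static final String SUFFIX = "</stream>";
    private static final String DECL = "?xml";

    private final PushbackReader in;
    private int prefixPos = 0;
    private int suffixPos = 0;
    private boolean bodyDone = false;

    SingleRootReader(Reader source) {
        this.in = new PushbackReader(source, 8);
    }

    /** Next char of: PREFIX, then the body without declarations, then SUFFIX. */
    private int next() throws IOException {
        if (prefixPos < PREFIX.length()) return PREFIX.charAt(prefixPos++);
        if (!bodyDone) {
            int c = nextBodyChar();
            if (c != -1) return c;
            bodyDone = true;
        }
        return suffixPos < SUFFIX.length() ? SUFFIX.charAt(suffixPos++) : -1;
    }

    private int nextBodyChar() throws IOException {
        while (true) {
            int c = in.read();
            if (c != '<' || !atDeclaration()) return c;
            // Skip everything up to and including the closing "?>".
            int prev = 0;
            int r;
            while ((r = in.read()) != -1 && !(prev == '?' && r == '>')) prev = r;
        }
    }

    /** Called just after '<': consumes "?xml" plus one whitespace char if present. */
    private boolean atDeclaration() throws IOException {
        char[] seen = new char[DECL.length() + 1];
        int n = 0;
        boolean matched = false;
        while (n < seen.length) {
            int r = in.read();
            if (r == -1) break;
            seen[n++] = (char) r;
            boolean ok = n <= DECL.length()
                    ? r == DECL.charAt(n - 1)
                    : Character.isWhitespace(r);
            if (!ok) break;
            matched = (n == seen.length);
        }
        if (!matched) in.unread(seen, 0, n);   // not a declaration: put it back
        return matched;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        int n = 0;
        while (n < len) {
            int c = next();
            if (c == -1) break;
            buf[off + n++] = (char) c;
            if (!in.ready()) break;   // on a live socket, hand over what we have
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Demo: collect the names of all elements a SAX parser sees. */
    static List<String> elementNames(Reader source) throws Exception {
        List<String> names = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new SingleRootReader(source)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String local,
                                             String qName, Attributes atts) {
                        names.add(qName);
                    }
                });
        return names;
    }
}
```

You would wrap the socket's InputStreamReader in this reader and pass it to the parser in an InputSource. The SAX callbacks fire as data arrives, and the closing tag is only produced once the socket reaches end of stream. Your handler then has to ignore (or key off) the extra wrapper element.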

Filter \n characters from an input stream