XML - System.Xml.XmlException - hexadecimal value 0x06

Asked: 2015-02-23 14:11:55

Tags: c# .net xml xml-parsing xmlreader

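I am getting this error. After searching, I found that it is caused by illegal characters in my XML, and I found the usual fix for it. But I do not have permission to edit these files. My job is only to read them and fetch tag values, attribute values and similar things, so I cannot replace binary characters such as '\x01' with escapes such as &#01;. I also tried setting CheckCharacters = false in the XmlReader settings. It has no effect: the same error is still thrown.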

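Is there no way to fix this with XmlReader? I have read about XmlTextReader, which can skip past the exception. But I have already written all my functionality using XmlReader. It would be good to find a solution that keeps it; otherwise I will have to change all my code.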

My code:

private void button1_Click(object sender, EventArgs e)
{
    int i = 0;
    var filenames = System.IO.Directory
        .EnumerateFiles(textBox1.Text, "*.xml", System.IO.SearchOption.AllDirectories)
        .Select(System.IO.Path.GetFullPath);

    foreach (var f in filenames)
    {
        var resolver = new XmlUrlOverrideResolver();
        resolver.DtdFileMap[@"X1.DTD"] = @"\\location\X1.DTD";
        resolver.DtdFileMap[@"R2.DTD"] = @"\\location\X2.DTD";
        resolver.DtdFileMap[@"R5.DTD"] = @"\\location\R5.DTD";
        XmlReaderSettings settings = new XmlReaderSettings();

        settings.DtdProcessing = DtdProcessing.Parse;
        settings.XmlResolver = resolver;
        XmlReader doc = XmlReader.Create(f, settings);
        while (doc.Read())
        {
            if ((doc.NodeType == XmlNodeType.Element) && (doc.Name == "ap"))
            {
                if (doc.HasAttributes)
                {
                    String fin = doc.GetAttribute("ap");
                    if (fin == "no")
                    {
                        String[] array = new String[10000];
                        array[i] = (f);

                        File.AppendAllText(@"\\location\NAPP.txt", array[i] + Environment.NewLine);
                        i++;
                    }
                    else
                    {
                        String[] abs = new String[10000];
                        abs[i] = (f);
                        File.AppendAllText(@"\\location\APP.txt", abs[i] + Environment.NewLine);
                        i++;
                    }
                }
            }
        }
    }

    MessageBox.Show("Done");
}

2 answers:

Answer 0 (score: 2)

This is a very simple example of a character "filter" that replaces the 0x06 character with a space:

public class MyStreamReader : StreamReader {
    public MyStreamReader(string path)
        : base(path) {
    }

    public override int Read(char[] buffer, int index, int count) {
        int res = base.Read(buffer, index, count);

        // The characters just read start at buffer[index], not buffer[0]
        for (int i = index; i < index + res; i++) {
            if (buffer[i] == 0x06) {
                buffer[i] = ' ';
            }
        }

        return res;
    }
}

You use it like this:

using (var sr = new MyStreamReader(f)) {
    var doc = XmlReader.Create(sr, settings);
    // ... read from doc as before
}

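Note that this is very simple because it replaces one character (0x06) with another of the same "length" (a space). If you wanted to replace a character with a "sequence" of characters (to escape it), it would get more complex (not impossible, about 30 minutes of work).

(I have checked, and it seems that XmlTextReader only uses that method, Read(char[], int, int), and not the parameterless Read() method.)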

As always, when a programmer tells you 30 minutes, it means 0 minutes or 2 hours :-)

Here is the "more complex" ReplacingStreamReader:

/// <summary>
/// Only the Read methods are supported!
/// </summary>
public class ReplacingStreamReader : StreamReader
{
    public ReplacingStreamReader(string path)
        : base(path)
    {
    }

    public Func<char, string> ReplaceWith { get; set; }

    protected char[] RemainingChars { get; set; }
    protected int RemainingCharsIndex { get; set; }


    public override int Read()
    {
        int ch;

        if (RemainingChars != null)
        {
            ch = RemainingChars[RemainingCharsIndex];
            RemainingCharsIndex++;

            if (RemainingCharsIndex == RemainingChars.Length)
            {
                RemainingCharsIndex = 0;
                RemainingChars = null;
            }
        }
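        else
        {
            // Loop so that a character replaced by string.Empty is skipped
            while ((ch = base.Read()) != -1)
            {
                // With no filter installed, characters pass through unchanged
                string replace = ReplaceWith != null ? ReplaceWith((char)ch) : null;

                if (replace == null)
                {
                    // Keep the current character
                    break;
                }

                if (replace.Length == 0)
                {
                    // Character removed: read the next one
                    continue;
                }

                ch = replace[0];

                if (replace.Length > 1)
                {
                    // Queue the rest of the replacement for later reads
                    RemainingChars = replace.ToCharArray(1, replace.Length - 1);
                    RemainingCharsIndex = 0;
                }

                break;
            }
        }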

        return ch;
    }

    public override int Read(char[] buffer, int index, int count)
    {
        int res = 0;

        // We leave error handling to the StreamReader :-)
        // We handle only "working" parameters
        if (RemainingChars != null && buffer != null && index >= 0 && count > 0 && index + count <= buffer.Length)
        {
            int remainingCharsCount = RemainingChars.Length - RemainingCharsIndex;
            res = Math.Min(remainingCharsCount, count);

            Array.Copy(RemainingChars, RemainingCharsIndex, buffer, index, res);

            RemainingCharsIndex += res;

            if (RemainingCharsIndex == RemainingChars.Length)
            {
                RemainingCharsIndex = 0;
                RemainingChars = null;
            }

            if (res == count)
            {
                return res;
            }

            index += res;
            count -= res;
        }

        while (true)
        {
            List<char> sb = null;

            int res2 = base.Read(buffer, index, count);

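            if (res2 == 0)
            {
                // End of stream: return whatever was copied from RemainingChars
                return res;
            }

            if (ReplaceWith == null)
            {
                // No filter installed: the characters just read count as-is
                return res + res2;
            }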

            int j = 0;

            for (int i = 0; i < res2; i++)
            {
                char ch = buffer[index + i];
                string replace = ReplaceWith(ch);

                if (sb != null)
                {
                    if (replace == null)
                    {
                        sb.Add(ch);
                    }
                    else
                    {
                        sb.AddRange(replace);
                    }
                }
                else if (replace == null)
                {
                    // Write positions are relative to index, like the read positions
                    buffer[index + j] = ch;
                    j++;
                }
                else if (replace.Length == 1)
                {
                    buffer[index + j] = replace[0];
                    j++;
                }
                else if (replace.Length == 0)
                {
                    // We do not advance
                }
                else
                {
                    sb = new List<char>();
                    sb.AddRange(replace);
                }
            }

            res2 = j;

            if (sb != null)
            {
                int res3 = Math.Min(sb.Count, count - res2);
                sb.CopyTo(0, buffer, index + res2, res3);

                if (res3 < sb.Count)
                {
                    RemainingChars = new char[sb.Count - res3];
                    RemainingCharsIndex = 0;
                    sb.CopyTo(res3, RemainingChars, 0, RemainingChars.Length);
                }

                res += res3;
            }
            else
            {
                res2 = j;

                // Can't happen if sb != null (at least a character must
                // have been added)
                if (res2 == 0)
                {
                    continue;
                }
            }

            res += res2;
            return res;
        }
    }
}

Use it like this:

using (var sr = new ReplacingStreamReader(f))
{
    sr.ReplaceWith = x =>
    {
        return x == 0x6 ? " " : null;
        // return x == '.' ? "&#160;" : null; // Replace all . with &nbsp;
    };

    var doc = XmlReader.Create(sr, settings);
    // ... read from doc as before
}

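Note that the ReplacingStreamReader doesn't "know" which part of the XML it is modifying, so a "blind" replacement is rarely OK :-) Other than this limitation, you can replace any character with any string (returning null from ReplaceWith means "keep the current character", which is equivalent to returning x.ToString() in the example given. Returning string.Empty is valid and means "remove the current character").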
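If the files may contain control characters other than 0x06, the ReplaceWith function can be generalized. The following is only a sketch, assuming you want every character outside the XML 1.0 `Char` range replaced by a space. The names `XmlCharFilter`, `IsAllowed` and `ReplaceIllegal` are hypothetical and not part of the answer above; surrogate halves are passed through unchecked, so unpaired surrogates are not caught.

```csharp
using System;

static class XmlCharFilter
{
    // XML 1.0 Char production, restricted to single UTF-16 code units:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
    public static bool IsAllowed(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= 0x20 && c <= 0xD7FF)
            || (c >= 0xE000 && c <= 0xFFFD)
            || char.IsSurrogate(c); // halves of characters above U+FFFF pass through
    }

    // Same shape as ReplaceWith: null keeps the character,
    // any other string replaces it.
    public static string ReplaceIllegal(char c)
    {
        return IsAllowed(c) ? null : " ";
    }
}
```

It would be plugged in with `sr.ReplaceWith = XmlCharFilter.ReplaceIllegal;`. Replacing with a same-length string (a single space) keeps line and column positions in later error messages aligned with the original file.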

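The class was quite interesting to write: it keeps a char[] RemainingChars holding the characters that have already been read (and filtered by ReplaceWith) but not yet returned by a Read() method, because the buffer passed in was too small (the ReplaceWith method can "enlarge" the string that was read, making it too big for the buffer!). Note that sb is a List<char> rather than a StringBuilder. Code-wise, using one or the other would probably be nearly equivalent.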

Answer 1 (score: 0)

You can first read the content into a string, replace (escape) the offending characters, and then load the result into an XmlReader:

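foreach (var f in filenames) {
    string text;
    // Assumes the files are UTF-8; the encoding declared in the XML prolog is not consulted
    using (StreamReader s = new StreamReader(f, Encoding.UTF8)) {
        text = s.ReadToEnd();
    }
    // Escape the control character as a character reference (note the trailing ';')
    text = text.Replace("\u0001", "&#x01;");

    //load some settings
    var resolver = new XmlUrlOverrideResolver();
    resolver.DtdFileMap[@"X1.DTD"] = @"\\location\X1.DTD";
    resolver.DtdFileMap[@"R2.DTD"] = @"\\location\X2.DTD";
    resolver.DtdFileMap[@"R5.DTD"] = @"\\location\R5.DTD";
    XmlReaderSettings settings = new XmlReaderSettings();

    settings.DtdProcessing = DtdProcessing.Parse;
    settings.XmlResolver = resolver;
    // References to control characters are rejected unless character checking is off
    settings.CheckCharacters = false;

    // Create(string, ...) would treat the text as a URI, so wrap it in a StringReader;
    // passing f as the base URI keeps relative DTD references resolvable
    using (XmlReader doc = XmlReader.Create(new StringReader(text), settings, f)) {
        //perform processing task
        //...
    }
}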
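One detail worth checking with this approach is how it interacts with CheckCharacters. As far as I understand the documentation, CheckCharacters = false relaxes checking of character references such as `&#x6;`, while raw illegal characters in the text are still rejected. That would explain why the setting alone did not help in the question. Below is a minimal sketch of the escape-then-parse round trip on an in-memory string; `EscapedControlCharDemo` and `ReadTextContent` are hypothetical names, and the blind `Replace` assumes the control character only occurs in text content or attribute values.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class EscapedControlCharDemo
{
    // Escapes U+0006 as a character reference, parses the result, and
    // returns the concatenated content of all text nodes.
    public static string ReadTextContent(string rawXml)
    {
        string escaped = rawXml.Replace("\u0006", "&#x6;");

        var settings = new XmlReaderSettings
        {
            // Allows character references to characters that XML 1.0 forbids
            CheckCharacters = false
        };

        var text = new StringBuilder();
        using (var reader = XmlReader.Create(new StringReader(escaped), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                {
                    text.Append(reader.Value);
                }
            }
        }

        return text.ToString();
    }
}
```

Calling `EscapedControlCharDemo.ReadTextContent("<root>a\u0006b</root>")` should return the text with the control character restored, so downstream code sees the original value rather than a space.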