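A way to deserialize a string

Time: 2015-12-31 09:44:08

Tags: c#

I have a custom-formatted string read from a text file. The file contains multiple occurrences of a template instance.

Clarification

I have a string template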

--------------------
Id : {0}
Value : {1}
--------------------

I read a text file whose contents look like this

--------------------
Id : 21
Value : Some Value 1
--------------------
--------------------
Id : 200
Value : Some Value 2
--------------------
--------------------
Id : 1
Value : Some Value 3
--------------------
--------------------
Id : 54
Value : Some Value 4
--------------------

I have a class A with two public properties, Id and Value

class A
{
    public string Id { get; set; }
    public string Value { get; set; }
}

Is it possible to deserialize the whole text read from the text file into a List<A>?

A way without a "for", "foreach", or "while" loop would be better.

2 answers:

Answer 0: (score: 2)

I have been parsing text files like this for 40 years. Here is the best method

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        enum State
        {
            FIND_ID,
            FIND_VALUE
        }
        const string FILENAME = @"c:\temp\test.txt";
        static void Main(string[] args)
        {
            List<A> a_s = new List<A>();
            string inputLine = "";
            StreamReader reader = new StreamReader(FILENAME);
            State state = State.FIND_ID;
            A a = null;
            while ((inputLine = reader.ReadLine()) != null)
            {
                inputLine = inputLine.Trim();
                if (!inputLine.StartsWith("-") && inputLine.Length > 0)
                {
                    switch (state)
                    {
                        case State.FIND_ID :
                            if (inputLine.StartsWith("Id"))
                            {
                                // Split on the first ':' only, so an Id containing ':' is kept whole
                                string[] inputArray = inputLine.Split(new char[] { ':' }, 2);
                                a = new A();
                                a_s.Add(a);
                                a.Id = inputArray[1].Trim();
                                state = State.FIND_VALUE;
                            }
                            break;
                        case State.FIND_VALUE:
                            if (inputLine.StartsWith("Value"))
                            {
                                // Split on the first ':' only, so a Value containing ':' is kept whole
                                string[] inputArray = inputLine.Split(new char[] { ':' }, 2);
                                a.Value = inputArray[1].Trim();
                                state = State.FIND_ID;
                            }
                            break;
                    }
                }
            }
        }
    }
    class A
    {
        public string Id { get; set; }
        public string Value { get; set; }
    }
}

Answer 1: (score: 1)

If you can modify your class A to have a constructor like the following:

class A
{
    public string Id { get; set; }
    public string Value { get; set; }

    public A() { }

    public A(string s)
    {
        string[] vals = s.Split((new string[] { "\r\n" }), StringSplitOptions.RemoveEmptyEntries);
        this.Id = vals[0].Replace("Id : ", string.Empty).Trim();
        this.Value = vals[1].Replace("Value : ", string.Empty).Trim();
    }

    // only overridden here for printing
    public override string ToString()
    {
        return string.Format("Id : {0}\r\nValue : {1}\r\n", this.Id, this.Value);
    }
}

You could then implement the following:

public static List<A> GetValues(string file)
{
    List<string> vals = new List<string>(Regex.Split(System.IO.File.ReadAllText(file), "--------------------"));
    vals.RemoveAll(delegate(string s) { return string.IsNullOrEmpty(s.Trim()); });
    List<A> ret = new List<A>();
    vals.ForEach(delegate(string s) { ret.Add(new A(s)); });
    return ret;
}

public static void Main()
{
    foreach (A a in GetValues(@"C:\somefile.txt")) {
        Console.WriteLine(a);
    }
}

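Your original question asked to avoid loops. This has no explicit loop constructs (for, foreach, do/while), but the underlying code does loop (Regex.Split, vals.RemoveAll and vals.ForEach are all loops). As the comments pointed out, you can't really avoid loops in this scenario.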
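For what it's worth, the same "no explicit loop keyword" idea can also be written with LINQ. This is only a minimal sketch, assuming the file always has an Id line immediately followed by a Value line, exactly as in the question. TemplateParser, Parse and AfterColon are hypothetical names of mine, not part of the question, and LINQ still iterates internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A
{
    public string Id { get; set; }
    public string Value { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class TemplateParser
{
    // Returns the text after the first ':' on a line, trimmed.
    private static string AfterColon(string line)
    {
        return line.Substring(line.IndexOf(':') + 1).Trim();
    }

    public static List<A> Parse(string text)
    {
        // Keep only the "Id : ..." and "Value : ..." lines; separators are dropped.
        string[] lines = text
            .Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(l => l.Trim())
            .Where(l => l.StartsWith("Id") || l.StartsWith("Value"))
            .ToArray();

        // Assumption: lines strictly alternate Id, Value, Id, Value, ...
        return Enumerable.Range(0, lines.Length / 2)
            .Select(i => new A
            {
                Id = AfterColon(lines[2 * i]),
                Value = AfterColon(lines[2 * i + 1])
            })
            .ToList();
    }
}
```

Usage would be something like TemplateParser.Parse(System.IO.File.ReadAllText(path)). Taking everything after the first ':' means a value that itself contains a colon is kept whole.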

It should be noted that, after some benchmarking, this approach is quite fast if the file being read is in the exact format you specified. For comparison, I created a file and copy/pasted your sample template (the 4 results you posted) for a total of 1032 results and a file size of about 75k. The equivalent XML file came to about 65k (due to less text from the --- lines). I wrote the following benchmark to run:

public class A
{
    public string Id { get; set; }
    public string Value { get; set; }

    public A() { }

    public A(string s)
    {
        string[] vals = s.Split((new string[] { "\r\n" }), StringSplitOptions.RemoveEmptyEntries);
        this.Id = vals[0].Replace("Id : ", string.Empty).Trim();
        this.Value = vals[1].Replace("Value : ", string.Empty).Trim();
    }

    public A(string id, string val)
    {
        this.Id = id;
        this.Value = val;
    }

    // only overridden here for printing
    public override string ToString()
    {
        return string.Format("Id : {0}\r\nValue : {1}\r\n", this.Id, this.Value);
    }
}

public static List<A> GetValuesRegEx(string file)
{
    List<string> vals = new List<string>(Regex.Split(System.IO.File.ReadAllText(file), "--------------------"));
    vals.RemoveAll(delegate(string s) { return string.IsNullOrEmpty(s.Trim()); });
    List<A> ret = new List<A>();
    vals.ForEach(delegate(string s) { ret.Add(new A(s)); });
    return ret;
}

public static List<A> GetValuesXml(string file)
{
    List<A> ret = new List<A>();
    System.Xml.Serialization.XmlSerializer srl = new System.Xml.Serialization.XmlSerializer(ret.GetType());
    System.IO.FileStream f = new System.IO.FileStream(file, System.IO.FileMode.OpenOrCreate, System.IO.FileAccess.ReadWrite, System.IO.FileShare.ReadWrite);
    ret = ((List<A>)srl.Deserialize(f));
    f.Close();
    return ret;
}

public static List<A> GetValues(string file)
{
    List<A> ret = new List<A>();
    List<string> vals = new List<string>(System.IO.File.ReadAllLines(file));
    for (int i = 0; i < vals.Count; ++i)
    {
        if (vals[i].StartsWith("---") && ((i + 3) < vals.Count) && (vals[i + 3].StartsWith("---")))
        {
            ret.Add(new A(vals[i + 1].Replace("Id : ", string.Empty), vals[i + 2].Replace("Value : ", string.Empty)));
            i += 3;
        }
    }
    return ret;
}

public static List<A> GetValuesStream(string file)
{
    List<A> ret = new List<A>();
    string line = "";
    System.IO.StreamReader reader = new System.IO.StreamReader(file);
    int state = 0;
    A a = null;
    while ((line = reader.ReadLine()) != null)
    {
        line = line.Trim();
        // Note: with || this check is always true; the Id/Value prefix tests below do the real filtering.
        if (!line.StartsWith("-") || line.Length > 0)
        {
            switch (state)
            {
                case 0:
                    if (line.StartsWith("Id"))
                    {
                        string[] inputArray = line.Split(new char[] { ':' });
                        a = new A();
                        ret.Add(a);
                        a.Id = inputArray[1].Trim();
                        state = 1;
                    }
                    break;
                case 1:
                    if (line.StartsWith("Value"))
                    {
                        string[] inputArray = line.Split(new char[] { ':' });
                        a.Value = inputArray[1].Trim();
                        state = 0;
                    }
                    break;
            }
        }
    }
    return ret;
}

public static void Main()
{
    System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
    for (int x = 0; x < 5; ++x)
    {
        double avg = 0d;

        for (int i = 0; i < 100; ++i)
        {
            sw.Restart();
            List<A> txt = GetValuesRegEx(@"C:\somefile.txt");
            sw.Stop();
            avg += sw.Elapsed.TotalSeconds;
        }
        Console.WriteLine(string.Format("avg: {0} s", (avg / 100))); // best out of 5: 0.002380452 s
        avg = 0d;
        sw.Stop();

        for (int i = 0; i < 100; ++i)
        {
            sw.Restart();
            List<A> txt = GetValuesXml(@"C:\somefile.xml");
            sw.Stop();
            avg += sw.Elapsed.TotalSeconds;
        }
        Console.WriteLine(string.Format("avg: {0} s", (avg / 100))); // best out of 5: 0.002042312 s
        avg = 0d;
        sw.Stop();

        for (int i = 0; i < 100; ++i)
        {
            sw.Restart();
            // As posted, this passes the .xml file to the line-based parser; the .txt file may have been intended.
            List<A> xml = GetValues(@"C:\somefile.xml");
            sw.Stop();
            avg += sw.Elapsed.TotalSeconds;
        }
        Console.WriteLine(string.Format("avg: {0} s", (avg / 100))); // best out of 5: 0.001148025 s
        avg = 0d;
        sw.Stop();

        for (int i = 0; i < 100; ++i)
        {
            sw.Restart();
            List<A> txt = GetValuesStream(@"C:\somefile.txt");
            sw.Stop();
            avg += sw.Elapsed.TotalSeconds;
        }
        Console.WriteLine(string.Format("avg: {0} s", (avg / 100))); // best out of 5: 0.002459861 s
        avg = 0d;
        sw.Stop();
    }
    sw.Stop();
}

For clarity, here are the results when run on an Intel i7 @ 2.2 GHz with a 5400 RPM HDD (about 0.1% fragmented):

GetValuesRegEx run time, best average of 5 runs: 0.002380452 s

GetValuesXml run time, best average of 5 runs: 0.002042312 s

GetValues (ReadAllLines / loop) run time, best average of 5 runs: 0.001148025 s

GetValuesStream (StreamReader / loop) run time, best average of 5 runs: 0.002459861 s

Your results may vary, and none of this accounts for any error handling, so you will need to take that into consideration when using the code.

Hope that helps.
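As one possible way to add some tolerance for malformed input, a single regular expression can match whole records and silently skip anything that doesn't fit. This is only a sketch under my own assumptions: each record is a dash line, an Id line, a Value line and a closing dash line, and skipping bad blocks is acceptable. SafeTemplateParser and RecordPattern are hypothetical names, not from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class A
{
    public string Id { get; set; }
    public string Value { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class SafeTemplateParser
{
    // One record: dash line, "Id : x" line, "Value : y" line, dash line.
    // In .NET, '$' with Multiline matches before '\n' only, hence the "\r?$".
    private static readonly Regex RecordPattern = new Regex(
        @"^-+[ \t]*\r?\n" +
        @"[ \t]*Id[ \t]*:[ \t]*(?<id>[^\r\n]*?)[ \t]*\r?\n" +
        @"[ \t]*Value[ \t]*:[ \t]*(?<value>[^\r\n]*?)[ \t]*\r?\n" +
        @"-+[ \t]*\r?$",
        RegexOptions.Multiline);

    public static List<A> Parse(string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        // Blocks that do not match the pattern (e.g. a missing Value line)
        // are skipped instead of causing an index-out-of-range error.
        return RecordPattern.Matches(text)
            .Cast<Match>()
            .Select(m => new A
            {
                Id = m.Groups["id"].Value,
                Value = m.Groups["value"].Value
            })
            .ToList();
    }
}
```

Whether skipping is the right policy depends on your data. If a broken block should be reported instead, you could compare the number of matches with the number of "Id" lines and raise an error when they differ.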