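Serializing data to XML and CSV

Date: 2017-03-11 16:07:43

Tags: c# xml csv serialization

I have two problems. I need to serialize data to CSV and to XML, but the results are giving me trouble.

As XML, I would like to get something like this: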

<sentence>
 <word>example1</word>
 <word>example2</word>
 <word>example3</word>
</sentence>
<sentence>
 <word>example1</word>
 <word>example2</word>
 <word>example3</word>
</sentence>

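My data is a SentencedModel, which contains an inner collection of WordsModel. So it is like List<ICollection<string>>: each position in the list (a sentence) holds a collection of strings (words). The classes look like this: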

[Serializable]
public class WordsModel : IEnumerable<string>
{
    [XmlRoot("Word")]
    public ICollection<string> Words { get; set;}

    public IEnumerator<string> GetEnumerator()
    {
        return this.Words.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.Words.GetEnumerator();
    }
}

[Serializable]
public class SentencedModel : IEnumerable<WordsModel>
{
    [XmlArray("Sentence"), XmlArrayItem(typeof(WordsModel), ElementName = "Words")]
    public ICollection<WordsModel> Sentences { get; set; }

    public SentencedModel()
    {
        this.Sentences = new List<WordsModel>();
    }

    public void Add(WordsModel words)
    {
        this.Sentences?.Add(words);
    }

    public IEnumerator<WordsModel> GetEnumerator()
    {
        return this.Sentences.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.Sentences.GetEnumerator();
    }
}

My repository class:

public class WordsSeperapedBySentence
{
    public SentencedModel WordsSeperatedBySentence { get; }

    public WordsSeperapedBySentence()
    {
        this.WordsSeperatedBySentence = new SentencedModel();
    }

    public bool AddSentence(ICollection<string> words)
    {
        if (words == null) return false;
        WordsModel wordsModel = new WordsModel();
        wordsModel.Words = words;
        this.WordsSeperatedBySentence.Add(wordsModel);
        return true;
    }
}

This is my serializer class:

public class SerializeData
{
    public string SerializeToXml(SentencedModel data)
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(SentencedModel));
        using (StringWriter textWriter = new StringWriter())
        {
            xmlSerializer.Serialize(textWriter, data);
            return textWriter.ToString();
        }
    }

    public ToCsv(WordsSeperapedBySentence data)
    {
        //??
    }
}

But after using it like this:

List<string> example1 = new List<string>();
example1.Add("Chris"); 
example1.Add("call");
example1.Add("Anna");

List<string> example2 = new List<string>();
example2.Add("Somebody");
example2.Add("call");
example2.Add("Wolf");

WordsModel words1 = new WordsModel();
WordsModel words2 = new WordsModel();
words1.Words = example1;
words2.Words = example2;

SentencedModel sentenced = new SentencedModel();
sentenced.Add(words1);
sentenced.Add(words2);

SerializeData serialize = new SerializeData();
var stringAsResult = serialize.SerializeToXml(sentenced);
Console.WriteLine(stringAsResult);

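I get an error. Also, I don't know how to store the data as CSV. Can you help me? Thanks in advance.

2 answers:

Answer 0 (score: 2)

To save the data as CSV, you can use the method below, which produces this output: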

Chris,call,Anna
Somebody,call,Wolf

Each line is one sentence, and the words within it are separated by commas.

public string ToCsv(SentencedModel data)
{
    var csvLines = data.Select(x => String.Join(",", x));
    var csv = String.Join(Environment.NewLine, csvLines);
    return csv;
}

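I'm still missing the XML part; if I work it out, I will edit the answer. At least you have one part.

Edit: following the comments below, here is a version of ToCsv that escapes the fields.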

public string ToCsv(SentencedModel data)
{
    var csvLines = data.Sentences.Select(x => String.Join(",", x.Words.Select(w => EscapeForCsv(w))));
    var csv = String.Join(Environment.NewLine, csvLines);
    return csv;
}

private string EscapeForCsv(string input)
{
    // Wrap the field in quotes and double any embedded quote character.
    return String.Format("\"{0}\"", input.Replace("\"", "\"\""));
}
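
To see what the escaping does on fields that contain a comma or a quote, here is a small self-contained sketch. The names `CsvDemo` and `Escape` are hypothetical and only for illustration; it assumes the common CSV convention of quoting each field and doubling embedded quotes, and it works on plain lists rather than on SentencedModel.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvDemo
{
    // Wrap the field in quotes and double any embedded quote character.
    public static string Escape(string field)
    {
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // One line per sentence, escaped fields separated by commas.
    public static string ToCsv(IEnumerable<IEnumerable<string>> sentences)
    {
        var lines = sentences.Select(words => string.Join(",", words.Select(w => Escape(w))));
        return string.Join(Environment.NewLine, lines);
    }

    public static void Main()
    {
        var sentences = new List<List<string>>
        {
            new List<string> { "Chris", "call", "Anna" },
            new List<string> { "say \"hi\"", "a,b" }
        };
        Console.WriteLine(ToCsv(sentences));
    }
}
```

The second line comes out as `"say ""hi""","a,b"`, so a reader that follows the same convention can recover both the embedded quotes and the comma.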

Answer 1 (score: 0)

First: if you want to tokenize text, I suggest:

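  1. Use arrays instead of lists, for example string[][]. The reason: a List can take 10%-20% more memory. You can convert a List to an array with .ToArray() (e.g. example1.ToArray()), or build the jagged array directly with an array initializer:

       string[][] sentence = new[] { new[] {"Chris","called","Anna"}, new[] {"Somebody","called","Wolf"} };

  2. If possible, use primitive data types; classes can complicate and slow down text processing.

Second: if you want to implement your own serializer, try this approach: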

      public abstract class AbstractSerializer
      {
        public abstract void Serialize(string[][] model, string path);
      }
      
      public class XmlSerializer : AbstractSerializer
      {
        public override void Serialize(string[][] model, string path)
        {
          // your stuff
        }
      }
      
      public class CsvSerializer : AbstractSerializer
      {
        public string LineSeparator { get; set; } = "\r\n";
        public string ValueSeparator { get; set; } = ";";
      
        public override void Serialize(string[][] model, string path)
        {
          var stb = new System.Text.StringBuilder();
          for (int i = 0; i < model.Length; i++)
          {
            for (int j = 0; j < model[i].Length; j++)
            {
              // Example output:
              // 0;0;Chris
              // 0;1;call
              // 0;2;Anna
              // 1;0;Somebody
              // 1;1;call
              // 1;2;Wolf
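              stb.Append(string.Join(ValueSeparator, i, j, model[i][j])).Append(LineSeparator);
            }
          }
          // Write the accumulated rows to the target file.
          System.IO.File.WriteAllText(path, stb.ToString());
        }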
      }
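
The XML serializer body left as "your stuff" above could be filled in along these lines. This is only a sketch using LINQ to XML (System.Xml.Linq); the class name `XmlSketch` and the root element name `sentences` are assumptions, the latter added because an XML document needs exactly one root element, which the output asked for in the question does not have.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlSketch
{
    // Builds <sentences><sentence><word>..</word></sentence>..</sentences>.
    // The root name "sentences" is an assumption, not part of the question.
    public static XElement ToXml(string[][] model)
    {
        return new XElement("sentences",
            model.Select(sentence =>
                new XElement("sentence",
                    sentence.Select(word => new XElement("word", word)))));
    }

    public static void Main()
    {
        var model = new[]
        {
            new[] { "Chris", "call", "Anna" },
            new[] { "Somebody", "call", "Wolf" }
        };
        // XElement.ToString() returns the indented markup.
        Console.WriteLine(ToXml(model));
    }
}
```

To write the result to a file instead of the console, `ToXml(model).Save(path)` should be enough, which would fit the `Serialize(string[][] model, string path)` signature above.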