How to parse this text in C#

Asked: 2010-08-05 10:06:02

Tags: c# parsing

abc  = tamaz feeo maa roo key gaera porla
Xyz = gippaza eka jaguar ammaz te sanna.

I want to create a struct:

public struct word
{
 public string Word;
 public string Definition;
}

How can I parse these lines into a List<word> in C#?

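Thanks for the help, but the input is a single block of text, and I don't know whether it contains one line or several. How should I handle the line breaks?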

5 Answers:

Answer 0 (score: 4)

Read the input line by line and split each line at the equals sign.

class Entry
{
    private string term;
    private string definition;

    // public, so that code outside the class can construct entries
    public Entry(string term, string definition)
    {
        this.term = term;
        this.definition = definition;
    }
}

// ...

string[] data = line.Split('=');
string word = data[0].Trim();
string definition = data[1].Trim();

Entry entry = new Entry(word, definition);
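For the follow-up case where the whole input arrives as one string with an unknown number of lines, the same idea can be sketched with a `StringReader`. This is only a minimal sketch: the `WordParser` class and its `Parse` method are hypothetical names, the `word` struct is the one from the question, and skipping lines that contain no `=` is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public struct word
{
    public string Word;
    public string Definition;
}

// Hypothetical helper, not part of the original answer.
public static class WordParser
{
    // Parses "term = definition" lines from a single block of text.
    public static List<word> Parse(string text)
    {
        var result = new List<word>();
        using (var reader = new StringReader(text))
        {
            string line;
            // ReadLine() treats "\n", "\r" and "\r\n" as line endings,
            // so it does not matter how the text was produced.
            while ((line = reader.ReadLine()) != null)
            {
                int eq = line.IndexOf('=');
                if (eq < 0)
                    continue; // assumption: lines without '=' are skipped

                word w;
                w.Word = line.Substring(0, eq).Trim();
                // Everything after the first '=' belongs to the definition.
                w.Definition = line.Substring(eq + 1).Trim();
                result.Add(w);
            }
        }
        return result;
    }
}
```

Calling `WordParser.Parse(text)` on the two sample lines from the question yields two entries, `abc` and `Xyz`, whether the text has one line or many.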

Answer 1 (score: 2)

This can also be done with a very simple LINQ query:

var definitions =
    from line in File.ReadAllLines(file)
    let parts = line.Split('=')
    select new word
        {
            Word = parts[0].Trim(),
            Definition = parts[1].Trim()
        };
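If the text is already in memory rather than in a file, a similar query can run over the string itself. The following is a sketch under assumptions: `LinqWordParser` is a hypothetical name, blank lines are dropped, and lines without an `=` are filtered out instead of causing an `IndexOutOfRangeException` on `parts[1]`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct word
{
    public string Word;
    public string Definition;
}

// Hypothetical helper, not part of the original answer.
public static class LinqWordParser
{
    public static List<word> Parse(string text)
    {
        var query =
            // Split on Windows and Unix line endings, dropping empty lines.
            from line in text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries)
            // At most two parts, so '=' inside the definition is preserved.
            let parts = line.Split(new[] { '=' }, 2)
            // Assumption: lines without '=' are ignored.
            where parts.Length == 2
            select new word
            {
                Word = parts[0].Trim(),
                Definition = parts[1].Trim()
            };

        return query.ToList();
    }
}
```

The `ToList()` call forces the query to run once, so the result can be enumerated repeatedly without splitting the text again.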

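Answer 2 (score: 1)

With a regular expression, you can proceed in two ways, depending on your source input.

Example 1

Suppose you have already read the source and stored each line in an array or list: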

string[] input = { "abc  = tamaz feeo maa roo key gaera porla", "Xyz = gippaza eka jaguar ammaz te sanna." };

 Regex mySplit = new Regex("(\\w+)\\s*=\\s*((\\w+).*)");

 List<word> mylist = new List<word>();

 foreach (string wordDef in input)
 {
     Match myMatch = mySplit.Match(wordDef);

     // Skip lines that do not have the "word = definition" shape;
     // otherwise Captures[0] below would throw.
     if (!myMatch.Success)
         continue;

     word myWord;

     myWord.Word = myMatch.Groups[1].Captures[0].Value;
     myWord.Definition = myMatch.Groups[2].Captures[0].Value;

     mylist.Add(myWord);
 }

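Example 2

Suppose you have read the source into a single variable (with each line terminated by a newline character '\n'). You can use the same regular expression "(\w+)\s*=\s*((\w+).*)", but like this: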

string inputs = "abc  = tamaz feeo maa roo, key gaera porla\r\nXyz = gippaza eka jaguar; ammaz: te sanna.";

MatchCollection myMatches = mySplit.Matches(inputs);

foreach (Match singleMatch in myMatches)
{

    word myWord;

    myWord.Word = singleMatch.Groups[1].Captures[0].Value;
    // '.' matches '\r', so trim the carriage return left over from "\r\n" line endings
    myWord.Definition = singleMatch.Groups[2].Captures[0].Value.Trim();

    mylist.Add(myWord);
}

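Lines that do or do not match the regular expression "(\w+)\s*=\s*((\w+).*)":

  • "abc = tamaz feeo maa roo key gaera porla,qsdsdsqdqsd\n" -> Match!
  • "Xyz = gippaza eka jaguar ammaz te sanna. sdq = sqds\n" -> Match! You can also include a definition that contains spaces.
  • "qsdqsd = \nsdsdsd\n" -> Match across a pair of lines (the \s* after the equals sign also consumes the newline)
  • "sdqsd = \n" -> No match! (the definition is missing)
  • "= sdq sqdqsd.\n" -> No match! (the word is missing)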
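The list above can be checked with a few lines of code. This is a small sketch rather than part of the original answer: `RegexCheck` and `TryMatch` are hypothetical names, and trimming the captured definition is an added assumption to drop a trailing '\r'.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for trying the pattern against sample inputs.
public static class RegexCheck
{
    // Same pattern as in the answer, written as a verbatim string.
    private static readonly Regex Pattern = new Regex(@"(\w+)\s*=\s*((\w+).*)");

    // Returns true and fills in both parts when the input contains a
    // "word = definition" pair; otherwise returns false with empty strings.
    public static bool TryMatch(string input, out string term, out string definition)
    {
        Match m = Pattern.Match(input);
        term = m.Success ? m.Groups[1].Value : string.Empty;
        // Trim() removes a trailing '\r', which '.' matches but '\n' ends.
        definition = m.Success ? m.Groups[2].Value.Trim() : string.Empty;
        return m.Success;
    }
}
```

For example, `RegexCheck.TryMatch("qsdqsd = \nsdsdsd\n", out t, out d)` returns true with `t == "qsdqsd"` and `d == "sdsdsd"`, while `"sdqsd = \n"` and `"= sdq sqdqsd.\n"` both return false.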

Answer 3 (score: 0)

Use a regular expression.

Answer 4 (score: 0)

// Split at an = sign. Take at most two parts (word and definition); 
//    ignore any = signs in the definition
string[] parts = line.Split(new[] { '=' }, 2);

word w = new word();
w.Word = parts[0].Trim();

// If the definition is missing then parts.Length == 1
if (parts.Length == 1)
    w.Definition = string.Empty;
else
    w.Definition = parts[1].Trim();

words.Add(w);