C# Regex - Get values from a repeatable group

Time: 2017-09-14 08:38:47

Tags: c# regex-group getvalue

I have this regex pattern, and I am trying to find out whether a sentence (string) matches it.

My pattern:

@"^A\s(?<TERM1>[A-Z][a-z]{1,})\sconsists\sof\s((?<MINIMUM1>(\d+))\sto\s(?<MAXIMUM1>(\d+|many){1})|(?<MINMAX1>(\d+|many{1}){1}){1})\s(?<TERM2>[A-Z][a-z]{1,})(\sand\s((?#********RepeatablePart********)(?<MINIMUM2>(\d+))\sto\s(?<MAXIMUM2>(\d+|many){1})|(?<MINMAX2>(\d+|many{1}){1}){1})\s(?<TERM3>([A-Z][a-z]{1,})))+\.$"

How to read my pattern:

A (TERM1) consists of (MINIMUM1 to (MAXIMUM1|many)|(MINMAX1|many)) (TERM2) ((?#********RepeatablePart********)and (MINIMUM2 to (MAXIMUM2|many)|(MINMAX2|many)) (TERM3))+.

MINMAX1 / MINMAX2 can be a number or just the word 'many'. MINIMUM1 / MINIMUM2 is a number, and MAXIMUM1 / MAXIMUM2 can be a number or the word 'many'.

Example sentences:

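  1. A Car consists of 2 to 5 Seats and 1 Breakpedal and 1 Gaspedal and 4 to 6 Windows.
  2. A Tree consists of many Apples and 2 to many Colors and 0 to 1 Squirrel and many Leaves.
  3. A Book consists of 1 to many Authors and 1 Title and 3 Bookmarks.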

    1. will contain: TERM1 = Car, MINIMUM1 = 2, MAXIMUM1 = 5, MINMAX1 = null, TERM2 = Seats, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 1, TERM3 = Breakpedal, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 1, TERM3 = Gaspedal, MINIMUM2 = 4, MAXIMUM2 = 6, MINMAX2 = null, TERM3 = Windows
    2. will contain: TERM1 = Tree, MINIMUM1 = null, MAXIMUM1 = null, MINMAX1 = many, TERM2 = Apples, MINIMUM2 = 2, MAXIMUM2 = many, MINMAX2 = null, TERM3 = Colors, MINIMUM2 = 0, MAXIMUM2 = 1, MINMAX2 = null, TERM3 = Squirrel, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = many, TERM3 = Leaves
    3. will contain: TERM1 = Book, MINIMUM1 = 1, MAXIMUM1 = many, MINMAX1 = null, TERM2 = Authors, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 1, TERM3 = Title, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 3, TERM3 = Bookmarks
  4. I created a class that I want to fill with the values from the repeatable part of my string (that is, MINIMUM2, MAXIMUM2, MINMAX2 and TERM3):

    // MyObject contains the values of one expression from the repeatable part.
    public class MyObject
    {   
        public string term { get; set; }
        public string min { get; set; }
        public string max { get; set; }
        public string minmax { get; set; }
    }
    

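    Since my pattern has a repeatable part (+), I want to create a List and add one new object (MyObject) per repetition, filled with the values of that repetition's groups.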

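    My problem is that I am not sure how to fill my objects with the values from the repeatable part. The way I tried to code it is wrong, because my lists do not all have the same number of values: a sentence (e.g. 'A Book consists of 1 to many Authors and 1 Title and 3 Bookmarks.') never has a MINIMUM2, a MAXIMUM2 and a MINMAX2 in every repetition.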

    Is there an easier way to fill my objects, or how can I get the values out of the quantified part?

    My code (in C#):

    var match = Regex.Match(exampleText, pattern);
    if (match.Success)
    {
    
        string term1 = match.Groups["TERM1"].Value;
        string minimum1 = match.Groups["MINIMUM1"].Value;
        string maximum1 = match.Groups["MAXIMUM1"].Value;
        string minmax1 = match.Groups["MINMAX1"].Value;
        string term2 = match.Groups["TERM2"].Value;
    
        // --> Groups[].Captures...ToList() might be wrong. Maybe there is a better way to get the values of the repeatable part.
        List<string> minimums2 = match.Groups["MINIMUM2"].Captures.Cast<Capture>().Select(x => x.Value).ToList<string>();
        List<string> maximums2 = match.Groups["MAXIMUM2"].Captures.Cast<Capture>().Select(x => x.Value).ToList<string>();
        List<string> minmaxs2 = match.Groups["MINMAX2"].Captures.Cast<Capture>().Select(x => x.Value).ToList<string>();
        List<string> terms3 = match.Groups["TERM3"].Captures.Cast<Capture>().Select(x => x.Value).ToList<string>();
    
        List<MyObject> myList = new List<MyObject>();
    
        for (int i = 0; i<terms3.Count; i++)
        {
           myList.Add(new MyObject()
              {
                 term = terms3[i],
                 min = minimums2[i],   // --> ERROR may occur when minimums2 does not have as many values as terms3
                 max = maximums2[i],   // --> ERROR...
                 minmax = minmaxs2[i]  // --> ERROR...
               });
         }
    }
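
The mismatch described above can be reproduced with a minimal sketch. It assumes a simplified form of the pattern (redundant `{1}` quantifiers and inner unnamed groups dropped, group names kept); `CaptureCountDemo` and `CountCaptures` are illustrative names that do not appear in the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureCountDemo
{
    // Simplified form of the question's pattern (an assumption, see above).
    public const string Pattern =
        @"^A\s(?<TERM1>[A-Z][a-z]+)\sconsists\sof\s" +
        @"((?<MINIMUM1>\d+)\sto\s(?<MAXIMUM1>\d+|many)|(?<MINMAX1>\d+|many))\s(?<TERM2>[A-Z][a-z]+)" +
        @"(\sand\s((?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]+))+\.$";

    // Returns how many captures the named group collected, or -1 if the text does not match.
    public static int CountCaptures(string text, string groupName)
    {
        Match match = Regex.Match(text, Pattern);
        return match.Success ? match.Groups[groupName].Captures.Count : -1;
    }

    public static void Main()
    {
        string text = "A Car consists of 2 to 5 Seats and 1 Breakpedal and 1 Gaspedal and 4 to 6 Windows.";
        foreach (string name in new[] { "TERM3", "MINIMUM2", "MAXIMUM2", "MINMAX2" })
        {
            Console.WriteLine(name + ": " + CountCaptures(text, name));
        }
    }
}
```

For the Car sentence, TERM3 collects 3 captures, MINIMUM2 and MAXIMUM2 collect 1 each, and MINMAX2 collects 2, so indexing all four lists with the same `i` cannot line up.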
    

1 Answer:

Answer 0: (score: 0)

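I was able to solve my problem by splitting my exampleText at every 'and', so that I have a string array 'splittedText' containing each phrase of the repeatable part of my pattern.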

string[] splittedText = Regex.Split(exampleText, @"\sand\s");
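
As a small illustration of what this split yields, here is a sketch run on the first example sentence. `SplitDemo` and `SplitPhrases` are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Splits a sentence at every " and " (whitespace, "and", whitespace).
    public static string[] SplitPhrases(string exampleText)
    {
        return Regex.Split(exampleText, @"\sand\s");
    }

    public static void Main()
    {
        string exampleText = "A Car consists of 2 to 5 Seats and 1 Breakpedal and 1 Gaspedal and 4 to 6 Windows.";
        foreach (string phrase in SplitPhrases(exampleText))
        {
            Console.WriteLine(phrase);
        }
        // Prints:
        // A Car consists of 2 to 5 Seats
        // 1 Breakpedal
        // 1 Gaspedal
        // 4 to 6 Windows.
    }
}
```

Note that only the last phrase still carries the trailing period, so any pattern applied to the individual phrases has to treat that period as optional.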

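After splitting my exampleText, I loop over the phrases and insert the values of each single phrase into a MyObject; inside the loop I run another Regex.Match to get the values needed for that phrase.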

string pattern2 = @"^((?#********RepeatablePart********)(?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]{1,})\.?$";
List<MyObject> myList = new List<MyObject>();

// Start at i = 1, since splittedText[0] contains the beginning of the sentence (e.g. 'A Car consists of 2 to 5 Seats').
for (int i = 1; i < splittedText.Length; i++)
{
    // The trailing period is optional because only the last phrase ends with '.'.
    var match2 = Regex.Match(splittedText[i], pattern2);
    if (match2.Success)
    {
        myList.Add(new MyObject()
        {
            term = match2.Groups["TERM3"].Value,
            min = match2.Groups["MINIMUM2"].Value,
            max = match2.Groups["MAXIMUM2"].Value,
            minmax = match2.Groups["MINMAX2"].Value
        });
    }
}
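
An alternative that avoids the split is sketched below. It is not the approach from this answer: it relies on .NET keeping every capture of a repeated group in `Group.Captures`, and pairs the captures by their position in the string. The wrapping group `PART`, the simplified pattern, and the names `RepeatedGroupReader`, `ValueInside` and `Parse` are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class MyObject
{
    public string term { get; set; }
    public string min { get; set; }
    public string max { get; set; }
    public string minmax { get; set; }
}

public static class RepeatedGroupReader
{
    // Simplified pattern; PART (hypothetical) wraps one repetition of the repeatable part.
    private const string Pattern =
        @"^A\s(?<TERM1>[A-Z][a-z]+)\sconsists\sof\s" +
        @"((?<MINIMUM1>\d+)\sto\s(?<MAXIMUM1>\d+|many)|(?<MINMAX1>\d+|many))\s(?<TERM2>[A-Z][a-z]+)" +
        @"(?<PART>\sand\s((?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]+))+\.$";

    // Returns the value of the capture of 'group' located inside 'part', or null if there is none.
    private static string ValueInside(Group group, Capture part)
    {
        return group.Captures.Cast<Capture>()
            .Where(c => c.Index >= part.Index && c.Index + c.Length <= part.Index + part.Length)
            .Select(c => c.Value)
            .FirstOrDefault();
    }

    // Builds one MyObject per repetition; returns an empty list if the text does not match.
    public static List<MyObject> Parse(string text)
    {
        var result = new List<MyObject>();
        Match match = Regex.Match(text, Pattern);
        if (!match.Success) return result;

        foreach (Capture part in match.Groups["PART"].Captures)
        {
            result.Add(new MyObject
            {
                term = ValueInside(match.Groups["TERM3"], part),
                min = ValueInside(match.Groups["MINIMUM2"], part),
                max = ValueInside(match.Groups["MAXIMUM2"], part),
                minmax = ValueInside(match.Groups["MINMAX2"], part)
            });
        }
        return result;
    }

    public static void Main()
    {
        string text = "A Tree consists of many Apples and 2 to many Colors and 0 to 1 Squirrel and many Leaves.";
        foreach (MyObject o in Parse(text))
        {
            Console.WriteLine(o.term + ": min=" + o.min + ", max=" + o.max + ", minmax=" + o.minmax);
        }
    }
}
```

Groups that did not take part in a given repetition come back as null here, which mirrors the null entries in the expected values listed in the question.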