C#: How to combine multiple regex patterns into a single one, regardless of the order in which they occur

Date: 2015-09-01 21:13:19

Tags: c# .net regex

Here is a sample of the text that needs to be parsed:

============================================================================================================================================================
line table (detailed)
============================================================================================================================================================

------------------------------------------------------------------------------------------------------------------------------------------------------------
line
------------------------------------------------------------------------------------------------------------------------------------------------------------
              if-index : 1/1/4/1                            rel-cap-occ-up : 60                                noise-margin-up : 165
     output-power-down : 92                             sig-attenuation-up : 32                            loop-attenuation-up : 30
         actual-opmode : g993-2-8d                            xtu-c-opmode : 00:00:00:00:00:00:00:00:48:00:00:00:00:00:00:00
            ansi-t1413 : dis-ansi-t1413                           etsi-dts : dis-etsi-dts                             g992-1-a : dis-g992-1-a
              g992-1-b : dis-g992-1-b                             g992-2-a : dis-g992-2-a                             g992-3-a : dis-g992-3-a
              g992-3-b : dis-g992-3-b                            g992-3-aj : dis-g992-3-aj                           g992-3-l1 : dis-g992-3-l1
             g992-3-l2 : dis-g992-3-l2                           g992-3-am : dis-g992-3-am                            g992-5-a : dis-g992-5-a
              g992-5-b : dis-g992-5-b                          ansi-t1.424 : dis-ansi-t1.424                           etsi-ts : dis-etsi-ts
            itu-g993-1 : dis-itu-g993-1                       ieee-802.3ah : dis-ieee-802.3ah                        g992-5-aj : dis-g992-5-aj
             g992-5-am : dis-g992-5-am                           g993-2-8a : dis-g993-2-8a                           g993-2-8b : dis-g993-2-8b
             g993-2-8c : dis-g993-2-8c                           g993-2-8d : g993-2-8d                              g993-2-12a : dis-g993-2-12a
            g993-2-12b : dis-g993-2-12b                         g993-2-17a : g993-2-17a                             g993-2-30a : dis-g993-2-30a
       actual-psd-down : -586                             power-mgnt-state : l0
     per-bnd-lp-att-up : 00:08:00:21:04:f6:04:f6:04:f6
     pr-bnd-sgn-att-up : 04:f6:00:21:04:f6:04:f6:04:f6
     pr-bnd-nois-mg-up : 02:76:00:a5:02:76:02:76:02:76                                                            high-freq-up : 5197
          elect-length : 3                                   time-adv-corr : -902                           actual-tps-tc-mode : ptm
     actual-ra-mode-up : automatic                           vect-cpe-type : legacy
============================================================================================================================================================

As an example, I have the following patterns to get the values of three variables:

const string pattern1 = @"itu-g993-1[\s]{0,1}:[\s]{0,1}(?<itu_g993_1>.*?(?=\s))";
const string pattern2 = @"time-adv-corr[\s]{0,1}:[\s]{0,1}(?<time_adv_corr>.*?(?=\s))";
const string pattern3 = @"xtu-c-opmode[\s]{0,1}:[\s]{0,1}(?<xtu_c_opmode>.*?(?=\s))";

Each of them works fine on its own.

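My questions are:

  1. How can I combine these three patterns in a single call to Regex.Match so that I get all three results?
  2. Is there a performance advantage or disadvantage to doing this with one call rather than multiple calls to Regex.Match?

I am asking because our requirements are still vague, and we do not know exactly which variables, or how many of them, we will need to extract.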

2 answers:

Answer 0: (score: 1)

Because the regex engine simply reports matches as it scans from left to right,
you can join the patterns together with alternation and the order in the text no longer matters.

Edit: If the set of patterns you need varies from case to case,
you could write a function that builds the full regex by
joining the individual ones (with alternation) based on a passed-in bitmask.
That gives you one central place to store and manage all the individual patterns.

string Lines =
@"
              if-index : 1/1/4/1                            rel-cap-occ-up : 60                                noise-margin-up : 165
     output-power-down : 92                             sig-attenuation-up : 32                            loop-attenuation-up : 30
         actual-opmode : g993-2-8d                            xtu-c-opmode : 00:00:00:00:00:00:00:00:48:00:00:00:00:00:00:00
            ansi-t1413 : dis-ansi-t1413                           etsi-dts : dis-etsi-dts                             g992-1-a : dis-g992-1-a
              g992-1-b : dis-g992-1-b                             g992-2-a : dis-g992-2-a                             g992-3-a : dis-g992-3-a
              g992-3-b : dis-g992-3-b                            g992-3-aj : dis-g992-3-aj                           g992-3-l1 : dis-g992-3-l1
             g992-3-l2 : dis-g992-3-l2                           g992-3-am : dis-g992-3-am                            g992-5-a : dis-g992-5-a
              g992-5-b : dis-g992-5-b                          ansi-t1.424 : dis-ansi-t1.424                           etsi-ts : dis-etsi-ts
            itu-g993-1 : dis-itu-g993-1                       ieee-802.3ah : dis-ieee-802.3ah                        g992-5-aj : dis-g992-5-aj
             g992-5-am : dis-g992-5-am                           g993-2-8a : dis-g993-2-8a                           g993-2-8b : dis-g993-2-8b
             g993-2-8c : dis-g993-2-8c                           g993-2-8d : g993-2-8d                              g993-2-12a : dis-g993-2-12a
            g993-2-12b : dis-g993-2-12b                         g993-2-17a : g993-2-17a                             g993-2-30a : dis-g993-2-30a
       actual-psd-down : -586                             power-mgnt-state : l0
     per-bnd-lp-att-up : 00:08:00:21:04:f6:04:f6:04:f6
     pr-bnd-sgn-att-up : 04:f6:00:21:04:f6:04:f6:04:f6
     pr-bnd-nois-mg-up : 02:76:00:a5:02:76:02:76:02:76                                                            high-freq-up : 5197
          elect-length : 3                                   time-adv-corr : -902                           actual-tps-tc-mode : ptm
     actual-ra-mode-up : automatic                           vect-cpe-type : legacy
";
Regex RxData = new Regex(
              @"
                  itu-g993-1[\s]{0,1}:[\s]{0,1}(?<itu_g993_1>.*?(?=\s))
                | time-adv-corr[\s]{0,1}:[\s]{0,1}(?<time_adv_corr>.*?(?=\s))
                | xtu-c-opmode[\s]{0,1}:[\s]{0,1}(?<xtu_c_opmode>.*?(?=\s))
              ", RegexOptions.IgnorePatternWhitespace );

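// Each match satisfies exactly one alternative, so walk every match
// and report whichever named group succeeded.
Match _mData = RxData.Match( Lines );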
while (_mData.Success)
{
    if (_mData.Groups["itu_g993_1"].Success )
        Console.WriteLine("itu_g993_1 =  {0} \r\n", _mData.Groups["itu_g993_1"].Value);
    if (_mData.Groups["time_adv_corr"].Success)
        Console.WriteLine("time_adv_corr =  {0} \r\n", _mData.Groups["time_adv_corr"].Value);
    if (_mData.Groups["xtu_c_opmode"].Success)
        Console.WriteLine("xtu_c_opmode =  {0} \r\n", _mData.Groups["xtu_c_opmode"].Value);

    _mData = _mData.NextMatch();
}

Output

xtu_c_opmode =  00:00:00:00:00:00:00:00:48:00:00:00:00:00:00:00

itu_g993_1 =  dis-itu-g993-1

time_adv_corr =  -902
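
The bitmask idea from the edit above could look roughly like the sketch below. The `Fields` enum, the `LineTableRegex` class and its `Build` and `Extract` methods are hypothetical names invented for illustration, not part of the original code. The sketch also swaps the original `.*?(?=\s)` capture for the simpler `\S+`, which assumes values never contain whitespace and are never empty.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical flags enum: one bit per field that may be requested.
[Flags]
public enum Fields
{
    None        = 0,
    ItuG9931    = 1,
    TimeAdvCorr = 2,
    XtuCOpmode  = 4,
}

public static class LineTableRegex
{
    // Central table: field flag -> label as it appears in the text.
    private static readonly Dictionary<Fields, string> Labels =
        new Dictionary<Fields, string>
        {
            { Fields.ItuG9931,    "itu-g993-1"    },
            { Fields.TimeAdvCorr, "time-adv-corr" },
            { Fields.XtuCOpmode,  "xtu-c-opmode"  },
        };

    // Joins the patterns of the requested fields with alternation.
    public static Regex Build(Fields wanted)
    {
        var parts = new List<string>();
        foreach (var pair in Labels)
        {
            if (!wanted.HasFlag(pair.Key)) continue;
            // Group names cannot contain '-', so use '_' instead.
            string group = pair.Value.Replace('-', '_');
            parts.Add(Regex.Escape(pair.Value) + @"\s?:\s?(?<" + group + @">\S+)");
        }
        if (parts.Count == 0)
            throw new ArgumentException("No fields selected.", "wanted");
        return new Regex(string.Join("|", parts));
    }

    // Walks all matches and collects every named group that succeeded.
    public static Dictionary<string, string> Extract(Regex rx, string text)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in rx.Matches(text))
        {
            foreach (string name in rx.GetGroupNames())
            {
                if (name == "0") continue; // group 0 is the whole match
                Group g = m.Groups[name];
                if (g.Success) result[name] = g.Value;
            }
        }
        return result;
    }
}
```

Calling `LineTableRegex.Extract(LineTableRegex.Build(Fields.ItuG9931 | Fields.XtuCOpmode), Lines)` would then return only the two requested values, keyed by group name. Because `Extract` discovers the group names through `GetGroupNames`, adding a field only means adding one enum member and one table entry.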

Answer 1: (score: 1)

If these values are always present, you can use capturing groups inside positive lookaheads, as in the following regex:

(?s)^(?=.*itu-g993-1\s?:\s?(?<itu>\S*))(?=.*time-adv-corr\s?:\s?(?<time>\S*))(?=.*xtu-c-opmode\s?:\s?(?<xtu>\S*))

You can test it at regexstorm.net.

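Even though lookarounds do not consume text, the text they inspect can still be captured into groups. That is useful in situations like this one, where we do not need the overall match, only a few pieces of the text.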

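Note that a positive lookahead requires its pattern to match a substring. So if xtu-c-opmode is absent while itu-g993-1 and time-adv-corr are present, there will be no match at all and no captured groups.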
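
As a minimal sketch of how this regex might be used from C#, the helper below runs a single `Regex.Match` call and reads the three named groups. The `LookaheadDemo` class and its `Extract` method are hypothetical names chosen for illustration; the pattern itself is the one given in this answer, only split across string literals for readability.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // The regex from the answer, unchanged apart from being split over lines.
    private static readonly Regex Rx = new Regex(
        @"(?s)^(?=.*itu-g993-1\s?:\s?(?<itu>\S*))" +
        @"(?=.*time-adv-corr\s?:\s?(?<time>\S*))" +
        @"(?=.*xtu-c-opmode\s?:\s?(?<xtu>\S*))");

    // Returns the three captured values, or null when any label is missing,
    // because every lookahead must succeed for the match to succeed.
    public static Dictionary<string, string> Extract(string text)
    {
        Match m = Rx.Match(text);
        if (!m.Success) return null;

        return new Dictionary<string, string>
        {
            { "itu",  m.Groups["itu"].Value  },
            { "time", m.Groups["time"].Value },
            { "xtu",  m.Groups["xtu"].Value  },
        };
    }
}
```

Since all three lookaheads are anchored at the start of the input, the order of the labels in the text is irrelevant. The all-or-nothing behaviour described above shows up as a `null` return value.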