Get second occurrence from second row

Time: 2019-03-19 15:15:02

Tags: regex find-occurrences

I have the following tables from the output of a dotnet test command, and what I am trying to achieve is getting the second occurrence of a percentage (the ones below Branch) on the lines that start with Average.

+---------+-----------+-----------+-----------+
|         | Line      | Branch    | Method    |
+---------+-----------+-----------+-----------+
| Total   | 100%      | 100%      | 100%      |
+---------+-----------+-----------+-----------+
| Average | 100%      | 100%      | 100%      | 
+---------+-----------+-----------+-----------+

+---------+-----------+-----------+-----------+
|         | Line      | Branch    | Method    |
+---------+-----------+-----------+-----------+
| Total   | 100%      | 100%      | 100%      |
+---------+-----------+-----------+-----------+
| Average | 100%      | 100%      | 100%      | 
+---------+-----------+-----------+-----------+

I have managed to write the following regex, ^\| Average *\| (\d+.\d+\%).*$, but adding {2} anywhere inside the expression still doesn't return the second occurrence. I've also tried https://regex101.com/, and the match information it shows is the following:

[Screenshot: Regex101.com match information]

From my understanding I need to get the second group but I think I need a hint or a little bit of help to reach my goal.

Any help? Thanks in advance!

3 answers:

Answer 0 (score: 1)

How about this:

string table =
    "+---------+-----------+-----------+-----------+" + Environment.NewLine +
    "|         | Line      | Branch    | Method    |" + Environment.NewLine +
    "+---------+-----------+-----------+-----------+" + Environment.NewLine +
    "| Total   | 100%      | 100%      | 100%      |" + Environment.NewLine +
    "+---------+-----------+-----------+-----------+" + Environment.NewLine +
    "| Average | 100%      |  89%      | 100%      |" + Environment.NewLine +
    "+---------+-----------+-----------+-----------+" + Environment.NewLine +
    "" + Environment.NewLine +
    "+---------+-----------+-----------+-----------+" + Environment.NewLine +
    "|         | Line      | Branch    | Method    |" + Environment.NewLine +
    "+---------+-----------+-----------+-----------+" + Environment.NewLine +
    "| Total   | 100%      | 100%      | 100%      |" + Environment.NewLine +
    "+---------+-----------+-----------+-----------+" + Environment.NewLine +
    "| Average | 100%      | 99%       | 100%      |" + Environment.NewLine +
    "+---------+-----------+-----------+-----------+";

MatchCollection matches = Regex.Matches(table, @"(?<=\| Average *\| \d+\% +\| *)\d+\%(?=.*)");

foreach (Match m in matches)
{
    Console.WriteLine(m.Value);
}

Output:

89%
99%
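To pick out only the second occurrence rather than printing every match, the MatchCollection can be indexed directly. The following is a minimal sketch that reuses the pattern from the code above; the class name SecondOccurrence and the helper NthBranch are made up for illustration, and the sample text contains only the two Average rows.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SecondOccurrence
{
    // Same pattern as above: the lookbehind anchors on the Average row and
    // its Line cell, so the match itself is the Branch cell.
    private static readonly Regex BranchOfAverage =
        new Regex(@"(?<=\| Average *\| \d+\% +\| *)\d+\%(?=.*)");

    // Hypothetical helper: returns the n-th (1-based) match,
    // or null when there are fewer than n matches.
    public static string NthBranch(string table, int n)
    {
        MatchCollection matches = BranchOfAverage.Matches(table);
        return n >= 1 && matches.Count >= n ? matches[n - 1].Value : null;
    }

    public static void Main()
    {
        string table = string.Join("\n",
            "| Average | 100%      |  89%      | 100%      |",
            "",
            "| Average | 100%      | 99%       | 100%      |");

        Console.WriteLine(NthBranch(table, 2)); // 99%
    }
}
```

Returning null for a missing match keeps the caller in charge of deciding whether a single table is an error.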

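Update:

I had to find out that .NET (where I built the regex) supports quantifiers inside lookaround expressions, whereas other regex implementations lack this support.

As a result, the regex from my solution does not run in those implementations.

To work around this, I removed the quantifiers and replaced them with fixed character specifications. This works for a table with a fixed layout, but it will not work if the width of the table is dynamic: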

(?<=\| Average \| ..\d\%      \| )\d+\%(?=.*)
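If the table width can vary and the target flavor has no variable-length lookbehind, one option is to drop the lookbehind and read the value from a capturing group instead. The sketch below is not the pattern from this answer: the pattern, the class name BranchByGroup and the helper BranchValues are assumptions for illustration. It is shown in C#, but it uses only constructs that do not depend on lookbehind support.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BranchByGroup
{
    // Hypothetical pattern: skip the Line cell with [^|]* and capture the
    // Branch cell in group 1. \s* absorbs any amount of cell padding.
    private static readonly Regex AverageRow = new Regex(
        @"^\|\s*Average\s*\|[^|]*\|\s*(\d+(?:\.\d+)?%)",
        RegexOptions.Multiline);

    // Returns the Branch value of every Average row, in order of appearance.
    public static string[] BranchValues(string text)
    {
        MatchCollection matches = AverageRow.Matches(text);
        var values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            values[i] = matches[i].Groups[1].Value;
        }
        return values;
    }

    public static void Main()
    {
        // Two rows with different column widths.
        string text = string.Join("\n",
            "| Average | 100% | 89% | 100% |",
            "",
            "|   Average   |   100%   |   99%   |   100%   |");

        foreach (string value in BranchValues(text))
        {
            Console.WriteLine(value);
        }
    }
}
```

The trade-off is that the whole match now includes the row prefix, so the value has to be read from Groups[1] rather than from Value.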

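Answer 1 (score: 0)

One solution I see is a single regex that captures multiple lines, starting at the first "Average" and ending at the second. Since all of the logic then lives inside the regex, you need to know how to specify search options for it, which is usually done with /sm. In the end, your regex would look like this: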

/^\| Average *\| \d*.\d+\%.*$.*^\| Average *\| (\d*.\d+\%).*$/sm

The captured group then contains only the percentage from the second occurrence of Average.
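In .NET there is no /sm suffix; the corresponding options are RegexOptions.Singleline and RegexOptions.Multiline. The following is a rough sketch of how the regex above could be run from C#, with a made-up class name and sample rows. Because the middle .* is greedy, the group ends up on the last Average row, which is the second one when there are exactly two tables.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SecondAverageLine
{
    // Returns group 1 of the multi-line regex, or null if it does not match.
    public static string Capture(string text)
    {
        // Singleline corresponds to /s: '.' also matches '\n'.
        // Multiline corresponds to /m: ^ and $ also match at line breaks.
        Match m = Regex.Match(
            text,
            @"^\| Average *\| \d*.\d+\%.*$.*^\| Average *\| (\d*.\d+\%).*$",
            RegexOptions.Singleline | RegexOptions.Multiline);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Sample rows with distinct values so the captured cell is visible.
        string text = string.Join("\n",
            "| Average | 95%       | 89%       | 100%      |",
            "",
            "| Average | 97%       | 99%       | 100%      |");

        Console.WriteLine(Capture(text)); // 97%
    }
}
```

Note that the group sits directly after the Average cell, so it captures the first percentage of that row (97% here), not the one below Branch.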

Answer 2 (score: 0)

In the end, I got to the answer through trial and error.

\| Average \| .*\d+\% +\| *(\d*.\d\%) +\| +\d

This matches the column below "Branch". Thanks everyone for the help!
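As a quick check, here is a small sketch that runs this regex from C# against the two Average rows used in Answer 0 and reads group 1 of each match. The class and method names are invented for the example, and the regex is copied unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BranchFromFinalRegex
{
    // The regex from this answer, unchanged. Group 1 is the Branch cell.
    private const string Pattern =
        @"\| Average \| .*\d+\% +\| *(\d*.\d\%) +\| +\d";

    // Collects group 1 from every match, in order of appearance.
    public static string[] Captures(string text)
    {
        MatchCollection matches = Regex.Matches(text, Pattern);
        var values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            values[i] = matches[i].Groups[1].Value;
        }
        return values;
    }

    public static void Main()
    {
        string text = string.Join("\n",
            "| Average | 100%      |  89%      | 100%      |",
            "",
            "| Average | 100%      | 99%       | 100%      |");

        string[] values = Captures(text);
        Console.WriteLine(values[0]); // 89%
        Console.WriteLine(values[1]); // 99%
    }
}
```

The full match also covers the cells around the Branch value, so only Groups[1] holds the percentage on its own.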