Question

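I have the following data in a log file, and I want to extract the lines between the two phrases "Process Started" and "Process Completed", including the start of the first line and the end of the last line.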

2016-11-28 12:18:59.5286 | 14 | Info | Process Started -ABC *****
....
..
2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***



2016-11-28 13:18:59.5286 | 14 | Info | Process Started -DEF
....
..
2016-11-28 13:18:59.5286 | 14 | Info | Process Completed -DEF Status: Passed***

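Using the RegEx below I can extract the lines, but the beginning of the first matched line and the end of the last matched line are missing.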

Regex r = new Regex("^*?Process Started -"+process.Name+"(.*?)Process Completed: "+process.Name+".*?", RegexOptions.Singleline);

The regex above returns:

Process Started -ABC *****
....
..
2016-11-28 12:18:59.5286 | 14 | Info | Process Completed

But I need this:

2016-11-28 12:18:59.5286 | 14 | Info | Process Started -ABC *****
....
..
2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***

Answer 1

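You're close, but the lazy quantifier at the end is the problem: it matches as little as possible, which in this case is nothing.

Here is a revised version of the regex: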
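To make the point about the trailing lazy quantifier concrete, here is a small sketch. The class and method names (LazyTailDemo, MatchText) are made up for illustration, and the input line is copied from the log in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyTailDemo
{
    // Returns the text matched by the pattern, or an empty string if there is no match.
    public static string MatchText(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.Singleline);
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        const string line = "Info | Process Completed -ABC, Status: Failed***";

        // Trailing lazy quantifier: .*? is satisfied with zero characters,
        // so the match stops right after the literal text.
        Console.WriteLine(MatchText(line, "Process Completed -ABC.*?"));
        // prints: Process Completed -ABC

        // Greedy tail that excludes newlines: runs to the end of the line.
        Console.WriteLine(MatchText(line, "Process Completed -ABC[^\n]*"));
        // prints: Process Completed -ABC, Status: Failed***
    }
}
```

Nothing follows the final .*? in the pattern, so the engine has no reason to extend it past zero characters.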

Regex r = new Regex("[^\n]*?Process Started -"
        + process.Name + "(.*?)Process Completed -"
        + process.Name + "[^\n]*", RegexOptions.Singleline);

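Changes I made:

After "Process Completed", replaced the ":" with " -" so the pattern matches the text in the log.
Most importantly: [^\n]* at both the beginning and the end stops the match from crossing a newline, but still picks up the rest of that line.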
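As a rough end-to-end check, the revised pattern can be run against the sample log from the question. This is only a sketch: ProcessBlockExtractor and ExtractBlock are hypothetical names, the process name is passed in as a plain string instead of process.Name, and the Regex.Escape call is my own addition, not part of the pattern above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProcessBlockExtractor
{
    // Returns the block for one process, from the start of its "Process Started"
    // line to the end of its "Process Completed" line, or "" if none is found.
    public static string ExtractBlock(string log, string processName)
    {
        // Assumption: escaping is wanted so that a name such as "A.B"
        // is matched literally and does not alter the pattern.
        string name = Regex.Escape(processName);
        var r = new Regex(
            "[^\n]*?Process Started -" + name + "(.*?)Process Completed -" + name + "[^\n]*",
            RegexOptions.Singleline);
        Match m = r.Match(log);
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        string log =
            "2016-11-28 12:18:59.5286 | 14 | Info | Process Started -ABC *****\n" +
            "....\n" +
            "..\n" +
            "2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***\n" +
            "\n" +
            "2016-11-28 13:18:59.5286 | 14 | Info | Process Started -DEF\n" +
            "....\n" +
            "..\n" +
            "2016-11-28 13:18:59.5286 | 14 | Info | Process Completed -DEF Status: Passed***\n";

        // Prints the four lines of the DEF block, timestamps included.
        Console.WriteLine(ExtractBlock(log, "DEF"));
    }
}
```

One caveat: if the log uses Windows line endings (\r\n), [^\n]* will also pick up the trailing \r, so you may want to trim the result.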

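Additional info:

I'm not sure how you intend to use this in the context of your code, but if you need to extract all of these sections, rather than the one for a specific process name, you can use this variation to get them all in one go: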

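// Verbatim string (@"...") is required here: \w and \1 are not valid escapes in a regular C# string.
Regex r = new Regex(@"[^\n]*?Process Started -(\w+)(.*?)Process Completed -\1[^\n]*", RegexOptions.Singleline);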

\1 is a backreference to whatever process name (\w+) matched. You end up with a collection of matches, one per process name.
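Here is one way the backreference variant might be consumed, as a sketch only. AllBlocksExtractor and ExtractAll are hypothetical names, and collecting the results into a dictionary keyed by process name is my assumption about what you want to do with them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AllBlocksExtractor
{
    // Verbatim string so that \w, \1 and \n reach the regex engine unchanged.
    private static readonly Regex BlockRegex = new Regex(
        @"[^\n]*?Process Started -(\w+)(.*?)Process Completed -\1[^\n]*",
        RegexOptions.Singleline);

    // Maps each process name (group 1) to the full text of its block.
    public static Dictionary<string, string> ExtractAll(string log)
    {
        var blocks = new Dictionary<string, string>();
        foreach (Match m in BlockRegex.Matches(log))
        {
            // If a process name appears more than once, the last block wins.
            blocks[m.Groups[1].Value] = m.Value;
        }
        return blocks;
    }

    public static void Main()
    {
        string log =
            "2016-11-28 12:18:59.5286 | 14 | Info | Process Started -ABC *****\n" +
            "....\n" +
            "2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***\n" +
            "\n" +
            "2016-11-28 13:18:59.5286 | 14 | Info | Process Started -DEF\n" +
            "....\n" +
            "2016-11-28 13:18:59.5286 | 14 | Info | Process Completed -DEF Status: Passed***\n";

        foreach (KeyValuePair<string, string> entry in ExtractAll(log))
        {
            Console.WriteLine("[" + entry.Key + "]");
            Console.WriteLine(entry.Value);
        }
    }
}
```

Note that \1 only requires the completed name to begin with the started name, so "ABC" would also pair with a "Process Completed -ABCD" line. Adding \b after \1 is one way to tighten that if your process names can share prefixes.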

Answer 2

You need to use the Multiline option, and then you can do something like this:

var reg = new Regex(@"^.*Process Started -ABC(.*)$(\n^.*$)*?\n(^.*Process Completed -ABC.*)$", 
                    RegexOptions.Multiline);

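But it's a bit ugly. As @blaze_125 suggested in the comments, your best option is to split the text into lines and iterate over them, looking for the Started and Completed strings and grabbing all the lines in between.

You could do something like this: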

var lines = str.Split('\n');

var q = new Queue<string>();

foreach (var l in lines)
{
    if (l.Contains("Process Started"))
    {
        q.Clear();   // drop anything queued before this block, e.g. blank lines between blocks
    }
    q.Enqueue(l);
    if (l.Contains("Process Completed"))   // you could use a regex here if you want more
                                           // complex matching
    {
        // the queue now holds exactly one block, from Started through Completed
        while (q.Count > 0)
        {
            Console.WriteLine(q.Dequeue());
        }
    }
}
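If you want the blocks back as data rather than printed, the same line-by-line idea can be sketched as a small state machine. LineScanner and ExtractBlocks are hypothetical names, and trimming '\r' is an assumption that the log may have Windows line endings.

```csharp
using System;
using System.Collections.Generic;

public static class LineScanner
{
    // Returns one list of lines per block, from a "Process Started" line
    // through the next "Process Completed" line, inclusive.
    public static List<List<string>> ExtractBlocks(string log)
    {
        var blocks = new List<List<string>>();
        List<string> current = null;   // null means "not inside a block"

        foreach (string raw in log.Split('\n'))
        {
            string line = raw.TrimEnd('\r');   // tolerate \r\n line endings

            if (line.Contains("Process Started"))
            {
                current = new List<string>();  // a new block begins here
            }
            if (current != null)
            {
                current.Add(line);
                if (line.Contains("Process Completed"))
                {
                    blocks.Add(current);       // block is complete
                    current = null;
                }
            }
        }
        return blocks;
    }

    public static void Main()
    {
        string log =
            "2016-11-28 12:18:59.5286 | 14 | Info | Process Started -ABC *****\n" +
            "....\n" +
            "..\n" +
            "2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***\n" +
            "\n" +
            "2016-11-28 13:18:59.5286 | 14 | Info | Process Started -DEF\n" +
            "....\n" +
            "..\n" +
            "2016-11-28 13:18:59.5286 | 14 | Info | Process Completed -DEF Status: Passed***\n";

        foreach (List<string> block in ExtractBlocks(log))
        {
            Console.WriteLine(string.Join("\n", block));
            Console.WriteLine("-----");
        }
    }
}
```

Lines outside any Started/Completed pair are skipped, and a block that never reaches "Process Completed" is dropped rather than returned half-finished.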

RegEx to extract the lines between 2 strings in C#
