Regular expressions are slowing down my program

Asked: 2014-11-11 22:10:25

Tags: c# regex parsing

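I am trying to write a program that parses data out of a game's chat log. So far I have managed to get the program working and parsing the data I want, but the problem is that it has become slow.

Parsing a 10 MB text file currently takes 5 seconds, and I noticed that it drops to 3 seconds if I add RegexOptions.Compiled to my regexes.

I believe I have traced the problem to my regex matching. Because there are 5 regexes, each line is currently scanned 5 times, so the program will get even slower when I add more later.

What can I do to keep the program from slowing down as I add more regexes? Any suggestions for improving the code are appreciated!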

if (sender.Equals(ButtonParse))
        {
            var totalShots = 0f;
            var totalHits = 0f;
            var misses = 0;
            var crits = 0;

            // Note: the decimal point is escaped (\.) so that it only matches a literal '.'.
            var regDmg = new Regex(@"(?<=\bSystem\b.* You inflicted )\d+\.\d", RegexOptions.Compiled);
            var regMiss = new Regex(@"(?<=\bSystem\b.* Target evaded attack)", RegexOptions.Compiled);
            var regCrit = new Regex(@"(?<=\bSystem\b.* Critical hit - additional damage)", RegexOptions.Compiled);
            var regHeal = new Regex(@"(?<=\bSystem\b.* You healed yourself )\d+\.\d", RegexOptions.Compiled);
            var regDmgrec = new Regex(@"(?<=\bSystem\b.* You take )\d+\.\d", RegexOptions.Compiled);

            var dmgList = new List<float>(); //New list for damage values
            var healList = new List<float>(); //New list for heal values
            var dmgRecList = new List<float>(); //New list for damage received values

            using (var sr = new StreamReader(TextBox1.Text))
            {
                while (!sr.EndOfStream)
                {
                    var line = sr.ReadLine();

                    var match = regDmg.Match(line);
                    var match2 = regMiss.Match(line);
                    var match3 = regCrit.Match(line);
                    var match4 = regHeal.Match(line);
                    var match5 = regDmgrec.Match(line);

                    if (match.Success)
                    {
                        dmgList.Add(float.Parse(match.Value, CultureInfo.InvariantCulture));
                        totalShots++;
                        totalHits++;
                    }
                    if (match2.Success)
                    {
                        misses++;
                        totalShots++;
                    }
                    if (match3.Success)
                    {
                        crits++;
                    }
                    if (match4.Success)
                    {
                        healList.Add(float.Parse(match4.Value, CultureInfo.InvariantCulture));
                    }
                    if (match5.Success)
                    {
                        dmgRecList.Add(float.Parse(match5.Value, CultureInfo.InvariantCulture));
                    }
                }
                TextBlockTotalShots.Text = totalShots.ToString(); //Show total shots
                TextBlockTotalDmg.Text = dmgList.Sum().ToString("0.##"); //Show total damage inflicted

                TextBlockTotalHits.Text = totalHits.ToString(); //Show total hits
                var hitChance = totalHits / totalShots; //Calculate hit chance
                TextBlockHitChance.Text = hitChance.ToString("P"); //Show hit chance

                TextBlockTotalMiss.Text = misses.ToString(); //Show total misses
                var missChance = misses / totalShots; //Calculate miss chance
                TextBlockMissChance.Text = missChance.ToString("P"); //Show miss chance

                TextBlockTotalCrits.Text = crits.ToString(); //Show total crits
                var critChance = crits / totalShots; //Calculate crit chance
                TextBlockCritChance.Text = critChance.ToString("P"); //Show crit chance

                TextBlockDmgHealed.Text = healList.Sum().ToString("F1"); //Show damage healed

                TextBlockDmgReceived.Text = dmgRecList.Sum().ToString("F1"); //Show damage received

                var pedSpent = dmgList.Sum() / (float.Parse(TextBoxEco.Text, CultureInfo.InvariantCulture) * 100); //Calculate ped spent
                TextBlockPedSpent.Text = pedSpent.ToString("0.##") + " PED"; //Estimated ped spent
            }
        }

Here is a sample of the text:

2014-09-02 23:07:22 [System] [] You inflicted 45.2 points of damage.
2014-09-02 23:07:23 [System] [] You inflicted 45.4 points of damage.
2014-09-02 23:07:24 [System] [] Target evaded attack.
2014-09-02 23:07:25 [System] [] You inflicted 48.4 points of damage.
2014-09-02 23:07:26 [System] [] You inflicted 48.6 points of damage.
2014-10-15 12:39:55 [System] [] Target evaded attack.
2014-10-15 12:39:58 [System] [] You inflicted 56.0 points of damage.
2014-10-15 12:39:59 [System] [] You inflicted 74.6 points of damage.
2014-10-15 12:40:02 [System] [] You inflicted 78.6 points of damage.
2014-10-15 12:40:04 [System] [] Target evaded attack.
2014-10-15 12:40:06 [System] [] You inflicted 66.9 points of damage.
2014-10-15 12:40:08 [System] [] You inflicted 76.2 points of damage.
2014-10-15 12:40:12 [System] [] You take 18.4 points of damage.
2014-10-15 12:40:14 [System] [] You inflicted 76.1 points of damage.
2014-10-15 12:40:17 [System] [] You inflicted 88.5 points of damage.
2014-10-15 12:40:19 [System] [] You inflicted 69.0 points of damage.
2014-10-19 05:56:30 [System] [] Critical hit - additional damage! You inflict 275.4 points of damage.
2014-10-19 05:59:29 [System] [] You inflicted 92.8 points of damage.
2014-10-19 05:59:31 [System] [] Critical hit - additional damage! You inflict 251.5 points of damage.
2014-10-19 05:59:35 [System] [] You take 59.4 points of damage.
2014-10-19 05:59:39 [System] [] You healed yourself 84.0 points.

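1 answer:

Answer 0 (score: 5)

Here are the problems as I see them:

  1. As suggested in the comments, the patterns make the regex parser do far too much work for what are basic matching cases.
  2. Why parse the same text multiple times? Create one regex pattern that does all the work in a single scan of each line.
  3. In WPF, don't do the work on the GUI thread. Do it in a background task and update the viewmodel (are you using MVVM?), which will propagate the information to the screen through the INotifyPropertyChanged event.
  4. Below is a solution that uses a single regex pattern applied once per line. Its first job is to verify that the line contains [System]. If it does not, the line produces no match. If it does, the pattern looks for specific keywords and an optional value and places them into regex named match captures, giving a key/value pair for each line.

    Once the matching is done, LINQ sums up the values that were found. Note that I have commented the pattern; the regex parser ignores those comments.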

    string pattern = @"^       # Beginning of line to anchor it.
    (?=.+\[System\])           # Within the line a literal '[System]' has to occur
    (?=.+                      # Somewhere within that line search for these keywords:
      (?<Action>               # Named Match Capture Group 'Action' will hold a keyword.
              inflicte?d?      # if the line has inflict or inflicted put it into 'Action'
              |                # or
              evaded           # evaded
              | take           # or take
              | yourself       # or yourself (heal)
       )
      (\s(?<Value>[\d.]+))?)   # if a value of points exist place into 'Value'
    .+                         # match one or more to complete it.
    $                          #end of line to stop on";
    
     // IgnorePatternWhitespace is only there so that the pattern can be laid out and commented.
     // Requires: using System.Globalization; using System.Linq; using System.Text.RegularExpressions;
    var tokens =
       Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline)
            .OfType<Match>()
            .Select( mt => new {
                                Action = mt.Groups["Action"].Value,
                                // Invariant culture so that "45.2" parses the same way on every locale.
                                Value  = mt.Groups["Value"].Success ? double.Parse(mt.Groups["Value"].Value, CultureInfo.InvariantCulture) : 0,
                                Count  = 1,
                               })
             .GroupBy ( itm => itm.Action,  // Each action is grouped under its keyword for summing.
                        itm => itm,         // Each item carries the Value and Count to be summed within its group.
                        (action, values) => new
                                {
                                    Action = action,
                                    Count  = values.Sum(itm => itm.Count),
                                    Total  = values.Sum(itm => itm.Value)
                                 }
                             );
    

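    Result

    The LINQ query returns one entity per token. Each entity holds the sum of all values for that action as well as a count of how many times the action occurred.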

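    For a 10 MB file it may be preferable to stream the log line by line instead of loading it into one string. The following is a minimal sketch of that variation, not the code from the answer above: the names `ChatLogStats`, `ActionTotal`, `LinePattern` and `Summarize` are made up for illustration, and the pattern is a simplified single-line version that avoids lookbehind. Like the LINQ version, it keeps `inflict` (critical-hit lines) and `inflicted` as separate actions.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

public static class ChatLogStats
{
    // Running count and sum for one action keyword (hypothetical helper type).
    public sealed class ActionTotal
    {
        public int Count;
        public double Total;
    }

    // One pattern for every line: require the literal "[System]", then capture
    // the first keyword that follows and, if present, the number right after it.
    private static readonly Regex LinePattern = new Regex(
        @"\[System\].*?\b(?<Action>inflict(?:ed)?|evaded|take|yourself)\b(?:\s(?<Value>\d+(?:\.\d+)?))?",
        RegexOptions.Compiled | RegexOptions.CultureInvariant);

    // Reads the log one line at a time and runs exactly one regex match per line.
    public static Dictionary<string, ActionTotal> Summarize(TextReader reader)
    {
        var totals = new Dictionary<string, ActionTotal>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            var match = LinePattern.Match(line);
            if (!match.Success)
            {
                continue; // Not a [System] line, or no known keyword on it.
            }

            var action = match.Groups["Action"].Value;
            ActionTotal entry;
            if (!totals.TryGetValue(action, out entry))
            {
                entry = new ActionTotal();
                totals[action] = entry;
            }

            entry.Count++;
            if (match.Groups["Value"].Success)
            {
                // Invariant culture: the log always uses '.' as the decimal separator.
                entry.Total += double.Parse(match.Groups["Value"].Value, CultureInfo.InvariantCulture);
            }
        }
        return totals;
    }
}
```

    It could be called with `using (var reader = new StreamReader(path)) { var totals = ChatLogStats.Summarize(reader); }`, after which `totals["inflicted"].Total` would play the role of the summed damage list in the question.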

    DATA

    string data=@"2014-09-02 23:07:22 [System] [] You inflicted 45.2 points of damage.
    2014-09-02 23:07:23 [System] [] You inflicted 45.4 points of damage.
    2014-09-02 23:07:24 [System] [] Target evaded attack.
    2014-09-02 23:07:25 [System] [] You inflicted 48.4 points of damage.
    2014-09-02 23:07:26 [System] [] You inflicted 48.6 points of damage.
    2014-10-15 12:39:55 [System] [] Target evaded attack.
    2014-10-15 12:39:58 [System] [] You inflicted 56.0 points of damage.
    2014-10-15 12:39:59 [System] [] You inflicted 74.6 points of damage.
    2014-10-15 12:40:02 [System] [] You inflicted 78.6 points of damage.
    2014-10-15 12:40:04 [System] [] Target evaded attack.
    2014-10-15 12:40:06 [System] [] You inflicted 66.9 points of damage.
    2014-10-15 12:40:08 [System] [] You inflicted 76.2 points of damage.
    2014-10-15 12:40:12 [System] [] You take 18.4 points of damage.
    2014-10-15 12:40:14 [System] [] You inflicted 76.1 points of damage.
    2014-10-15 12:40:17 [System] [] You inflicted 88.5 points of damage.
    2014-10-15 12:40:19 [System] [] You inflicted 69.0 points of damage.
    2014-10-19 05:56:30 [System] [] Critical hit - additional damage! You inflict 275.4 points of damage.
    2014-10-19 05:59:29 [System] [] You inflicted 92.8 points of damage.
    2014-10-19 05:59:31 [System] [] Critical hit - additional damage! You inflict 251.5 points of damage.
    2014-10-19 05:59:35 [System] [] You take 59.4 points of damage.
    2014-10-19 05:59:39 [System] [] You healed yourself 84.0 points.";
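
    Point 3 above (keeping the work off the GUI thread) can be sketched roughly as follows. This is only an illustration under assumptions: `BackgroundParsing`, `CountSystemLines` and `CountSystemLinesAsync` are hypothetical names, and the line counting is a stand-in for the real parsing. The click handler in the trailing comment assumes a WPF window with the `TextBox1` and `TextBlockTotalShots` controls from the question.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class BackgroundParsing
{
    // Stand-in for the real parsing work: counts the lines that contain "[System]".
    public static int CountSystemLines(string path)
    {
        return File.ReadLines(path).Count(line => line.Contains("[System]"));
    }

    // Runs the work on a thread-pool thread so the calling (GUI) thread stays responsive.
    public static Task<int> CountSystemLinesAsync(string path)
    {
        return Task.Run(() => CountSystemLines(path));
    }
}

// Hypothetical WPF usage: read the control before awaiting, update it afterwards.
// The code after 'await' resumes on the GUI thread when awaited from a GUI event handler.
//
// private async void ButtonParse_Click(object sender, RoutedEventArgs e)
// {
//     var path = TextBox1.Text;
//     var count = await BackgroundParsing.CountSystemLinesAsync(path);
//     TextBlockTotalShots.Text = count.ToString();
// }
```

    With MVVM, the assignment after the `await` would instead set a viewmodel property that raises `PropertyChanged`.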