How can I make Regex matching more efficient?

Asked: 2019-06-20 15:50:43

Tags: c# .net regex

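I'm using Regex to parse large log files and extract the relevant data. My parser's run time grows exponentially with the size of the file. The Visual Studio profiler shows that most of the time is spent in the Regex.Match function. How can I make my Regex patterns more efficient? Is there a more efficient alternative to Regex?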

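I have already tried stripping the whitespace from each line before matching it, so that no work is spent matching whitespace, but that made no difference.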

These are the patterns I currently match lines against:

cmdStartPattern = new Regex(@"(\d+.\d+):\s+ufshcd_command:\s+(\w+).ufshc:\s+(\w+)_send:\s+tag:\s+(\d+)\s+cmd:\s+(\w+)\s+lba:\s+(\d+)\s+size:\s+(\d+)\s+DB:\s+(\w+)", RegexOptions.Compiled);
cmdDonePattern = new Regex(@"(\d+.\d+):\s+ufshcd_command:\s+(\w+).ufshc:\s+(\w+)_cmpl_*\d*:\s+tag:\s+(\d+)\s+cmd:\s+(\w+)\s+lba:\s+(\d+)\s+size:\s+(\d+)", RegexOptions.Compiled);
cmdBlockPattern = new Regex(@"(\d+.\d+):\s+block_rq_issue:\s+(\d+),(\d+)\s+(\w+)\s+\d+\s+\((.*)\)\s+(\d+)\s+\+\s+(\d+)", RegexOptions.Compiled);
getCurrTimePattern = new Regex(@"(\d+.\d+):", RegexOptions.Compiled);

Here are a few sample lines from the log file I'm trying to extract data from:

          <idle>-0     [001] d.h2 228795.291923: ufshcd_command: 1d84000.ufshc:      scsi_cmpl: tag: 0  cmd: 0x2a lba: 19733048  size: 4096    DB: 0x0        IS: 0x0
          <idle>-0     [001] d.h2 228795.291928: ufshcd_clk_gating: 1d84000.ufshc: state changed to REQ_CLKS_OFF
          <idle>-0     [001] ..s1 228795.291950: block_rq_complete: 8,0 WAS () 19733048 + 8 [0]
              sh-7199  [002] d..1 228795.318053: block_rq_issue: 8,0 RA 0 () 19692680 + 8 [sh]
              sh-7199  [002] d..1 228795.318088: ufshcd_clk_gating: 1d84000.ufshc: state changed to CLKS_ON
              sh-7199  [002] d..1 228795.318149: ufshcd_command: 1d84000.ufshc:      scsi_send: tag: 0  cmd: 0x28 lba: 19692680  size: 4096    DB: 0x1        IS: 0x0
          <idle>-0     [001] d.h2 228795.318822: ufshcd_command: 1d84000.ufshc:      scsi_cmpl: tag: 0  cmd: 0x28 lba: 19692680  size: 4096    DB: 0x0        IS: 0x0
          <idle>-0     [001] d.h2 228795.318836: ufshcd_clk_gating: 1d84000.ufshc: state changed to REQ_CLKS_OFF 
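
To make concrete what is being extracted, here is a minimal sketch that runs the start pattern, copied unchanged from above, against the scsi_send sample line and collects its capture groups. The class name CaptureDemo and the helper Captures are hypothetical names used only for this illustration; they are not part of the actual parser.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Pattern copied unchanged from the question (cmdStartPattern).
    public static readonly Regex CmdStartPattern = new Regex(
        @"(\d+.\d+):\s+ufshcd_command:\s+(\w+).ufshc:\s+(\w+)_send:\s+tag:\s+(\d+)\s+cmd:\s+(\w+)\s+lba:\s+(\d+)\s+size:\s+(\d+)\s+DB:\s+(\w+)",
        RegexOptions.Compiled);

    // Hypothetical helper: returns capture groups 1..8 as strings,
    // or null when the line does not match the pattern.
    public static string[] Captures(string line)
    {
        Match m = CmdStartPattern.Match(line);
        if (!m.Success) return null;

        var result = new string[m.Groups.Count - 1];
        for (int i = 1; i < m.Groups.Count; i++)
            result[i - 1] = m.Groups[i].Value;
        return result;
    }

    public static void Main()
    {
        // The scsi_send sample line from the log excerpt.
        string line = "              sh-7199  [002] d..1 228795.318149: ufshcd_command: 1d84000.ufshc:      scsi_send: tag: 0  cmd: 0x28 lba: 19692680  size: 4096    DB: 0x1        IS: 0x0";

        string[] fields = Captures(line);
        Console.WriteLine(fields == null ? "no match" : string.Join(" | ", fields));
    }
}
```

The eight fields are, in order: timestamp, device, command prefix, tag, cmd, lba, size and DB. Note that the regex engine has to try every digit run in the variable line prefix as a possible match start before it reaches the timestamp.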

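As you can see, I don't need some of the lines, and the part of each line before the decimal timestamp varies, so I can't include it in the Regex pattern.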
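
Because many lines are irrelevant, one option that may be worth measuring is a cheap literal check before the regex runs at all. The sketch below is an assumption for illustration, not a tested fix for this parser: PrefilterDemo and FindSendCommands are hypothetical names, the marker string "ufshcd_command:" is taken from the sample lines, and the pattern is a variant of the start pattern with the two literal dots escaped, which is a deliberate change from the original.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PrefilterDemo
{
    // Variant of the question's start pattern with the literal dots escaped
    // (\. instead of .). This is a change from the original, made as an assumption.
    static readonly Regex SendPattern = new Regex(
        @"(\d+\.\d+):\s+ufshcd_command:\s+(\w+)\.ufshc:\s+(\w+)_send:\s+tag:\s+(\d+)\s+cmd:\s+(\w+)\s+lba:\s+(\d+)\s+size:\s+(\d+)\s+DB:\s+(\w+)",
        RegexOptions.Compiled);

    // Hypothetical helper: skips lines that lack the literal marker,
    // and only runs the regex on the remaining candidates.
    public static List<Match> FindSendCommands(IEnumerable<string> lines)
    {
        var hits = new List<Match>();
        foreach (string line in lines)
        {
            // Ordinal substring search: no regex work for unrelated lines.
            if (line.IndexOf("ufshcd_command:", StringComparison.Ordinal) < 0)
                continue;

            Match m = SendPattern.Match(line);
            if (m.Success)
                hits.Add(m);
        }
        return hits;
    }

    public static void Main()
    {
        // Three of the sample lines from the log excerpt.
        var sample = new[]
        {
            "              sh-7199  [002] d..1 228795.318088: ufshcd_clk_gating: 1d84000.ufshc: state changed to CLKS_ON",
            "              sh-7199  [002] d..1 228795.318149: ufshcd_command: 1d84000.ufshc:      scsi_send: tag: 0  cmd: 0x28 lba: 19692680  size: 4096    DB: 0x1        IS: 0x0",
            "          <idle>-0     [001] d.h2 228795.318822: ufshcd_command: 1d84000.ufshc:      scsi_cmpl: tag: 0  cmd: 0x28 lba: 19692680  size: 4096    DB: 0x0        IS: 0x0",
        };

        foreach (Match m in FindSendCommands(sample))
            Console.WriteLine($"{m.Groups[1].Value} lba={m.Groups[6].Value} size={m.Groups[7].Value}");
    }
}
```

Whether this helps depends on how many lines are filtered out and on where the profiler time actually goes, so it would need to be measured against the real log files.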

0 answers:

No answers yet