Regex for a field at a fixed position and length

Time: 2012-09-24 05:11:03

Tags: regex

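I'm working on a regular expression to extract exact characters from host log lines that use fixed-width spacing (regex is not my strong suit, BTW).

I want to extract the value of the "status" field. It is a fixed-length field that holds values such as SUCCESSFUL AUDIT, LOGON COMPLETE, and FINAL FAILED AUDIT, as shown in the sample events below.

This fixed-length field can take many different values, so I can't simply match the literal strings as I would have liked.

Instead, I want to extract whatever characters start at position 54 of the event and run for exactly 18 characters.

Any help or ideas on a regex or another approach would be much appreciated.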

528 LOGON   39690  SECURITY LAPTOP    8481 USER AB11 SUCCESSFUL AUDIT  BBB908AFB 06/20/12 09:11:43PM    
528 LOGON   39692  SECURITY LAPTOP    8495 USER AB11 LOGON COMPLETE    BBB908AFB 06/20/12 09:12:12PM    
528 LOGOFF  39699  SECURITY DESKTOP   4476 USER ABEQ FINAL FAILED AUDITAADAFCC01 06/20/12 09:55:49PM   

2 Answers:

Answer 0: (score: 1)

Consider the following PowerShell example, which uses a generic regular expression.

.{53}(.{18})

Example

    # Sample events held in one multi-line string ($Matches is an automatic variable, so it is not assigned here).
    $String = '528 LOGON   39690  SECURITY LAPTOP    8481 USER AB11 SUCCESSFUL AUDIT  BBB908AFB 06/20/12 09:11:43PM    
528 LOGON   39692  SECURITY LAPTOP    8495 USER AB11 LOGON COMPLETE    BBB908AFB 06/20/12 09:12:12PM    
528 LOGOFF  39699  SECURITY DESKTOP   4476 USER ABEQ FINAL FAILED AUDITAADAFCC01 06/20/12 09:55:49PM
528 LOGON   39690  SECURITY LAPTOP    8481 USER AB11 REMEBER TO VOTE   BBB908AFB 06/20/12 09:11:43PM'
    Write-Host 'start with'
    Write-Host $String
    Write-Host
    Write-Host 'found'
    # Skip 53 characters, then capture the next 18 as group 1.
    ([regex]'.{53}(.{18})').Matches($String) | ForEach-Object {
        Write-Host "key at $($_.Groups[1].Index) = '$($_.Groups[1].Value)'"
    } # next match

Output

start with
528 LOGON   39690  SECURITY LAPTOP    8481 USER AB11 SUCCESSFUL AUDIT  BBB908AFB 06/20/12 09:11:43PM    
528 LOGON   39692  SECURITY LAPTOP    8495 USER AB11 LOGON COMPLETE    BBB908AFB 06/20/12 09:12:12PM    
528 LOGOFF  39699  SECURITY DESKTOP   4476 USER ABEQ FINAL FAILED AUDITAADAFCC01 06/20/12 09:55:49PM
528 LOGON   39690  SECURITY LAPTOP    8481 USER AB11 REMEBER TO VOTE   BBB908AFB 06/20/12 09:11:43PM

found
key at 53 = 'SUCCESSFUL AUDIT  '
key at 159 = 'LOGON COMPLETE    '
key at 265 = 'FINAL FAILED AUDIT'
key at 367 = 'REMEBER TO VOTE   '

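Summary

  • .{53} skips the first 53 characters (note that the first position in a string is index zero, so the field at position 54 starts at index 53)
  • (.{18}) captures and returns the 18-character-wide field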
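
One caveat with the unanchored pattern is that it finds one field per line only while no line is long enough to hold a second 71-character match. A minimal sketch of an anchored variant follows, written in Python purely for illustration (the question does not name a language). The names `SAMPLE`, `STATUS_RE`, and `extract_statuses` are made up for this example, and the sample lines are copied from the question with their trailing padding dropped.

```python
import re

# Three sample events copied from the question (trailing padding dropped).
SAMPLE = "\n".join([
    "528 LOGON   39690  SECURITY LAPTOP    8481 USER AB11 SUCCESSFUL AUDIT  BBB908AFB 06/20/12 09:11:43PM",
    "528 LOGON   39692  SECURITY LAPTOP    8495 USER AB11 LOGON COMPLETE    BBB908AFB 06/20/12 09:12:12PM",
    "528 LOGOFF  39699  SECURITY DESKTOP   4476 USER ABEQ FINAL FAILED AUDITAADAFCC01 06/20/12 09:55:49PM",
])

# With re.MULTILINE, ^ matches at the start of every line, so each match
# begins at column 0: skip 53 characters, then capture the next 18.
STATUS_RE = re.compile(r"^.{53}(.{18})", re.MULTILINE)


def extract_statuses(text):
    """Return the 18-character status field of every line that is long enough."""
    return [m.group(1) for m in STATUS_RE.finditer(text)]


statuses = extract_statuses(SAMPLE)
for status in statuses:
    print(repr(status))
```

Lines shorter than 71 characters simply produce no match, and the captured values keep their padding, so call `.rstrip()` on them if the trailing spaces are not wanted.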

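Answer 1: (score: 0)

  1. A regular expression is a complicated solution to this simple problem.
  2. Since the input format and the offset of the desired output are fixed, just read the input line by line and do a little string processing.
  3. If you still want a regex, here is a starting point (although your own numbers, 54 and 18, don't seem to match the answer you want, so I tried 47 and 16 instead; adjust them as needed):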

    (?<=[a-zA-Z0-9]{47})([a-zA-Z0-9]{16})

    http://regexr.com?328dm
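
The line-by-line string processing suggested in point 2 could look roughly like the sketch below. It is written in Python only as an illustration, and it assumes the offsets given in the question (position 54, i.e. zero-based index 53, and a width of 18). `START`, `WIDTH`, and `extract_status` are hypothetical names, not part of any library.

```python
START = 53   # zero-based index of character position 54 (assumed from the question)
WIDTH = 18   # fixed width of the status field (assumed from the question)


def extract_status(line):
    """Slice the fixed-width status field out of one log line.

    Returns None when the line is too short to contain the whole field.
    """
    if len(line) < START + WIDTH:
        return None
    return line[START:START + WIDTH]


# Sample events copied from the question (trailing padding dropped).
lines = [
    "528 LOGON   39690  SECURITY LAPTOP    8481 USER AB11 SUCCESSFUL AUDIT  BBB908AFB 06/20/12 09:11:43PM",
    "528 LOGON   39692  SECURITY LAPTOP    8495 USER AB11 LOGON COMPLETE    BBB908AFB 06/20/12 09:12:12PM",
    "528 LOGOFF  39699  SECURITY DESKTOP   4476 USER ABEQ FINAL FAILED AUDITAADAFCC01 06/20/12 09:55:49PM",
]

# Strip the padding so the values can be compared or counted directly.
fields = [extract_status(line).rstrip() for line in lines]
for field in fields:
    print(field)
```

When reading from a file, the same function can be applied to each line as it is read; a short or empty line returns None instead of raising an error.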