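I am trying to use a regex in PowerShell to extract numbers from a live log file. My regex is supposed to return only the number to the left of the letter A, but for some reason it returns the whole line instead of the isolated number.
I am trying to convert the log file:
1/11/2016 3:26:12 PM 1/11/2016 3:27:00 PM 86.4 A
1/11/2016 3:26:12 PM 1/11/2016 3:28:00 PM 86.3 A
1/11/2016 3:26:12 PM 1/11/2016 3:29:00 PM 86.8 A
1/11/2016 3:26:12 PM 1/11/2016 3:29:16 PM 86.7 A
to:
86.4
86.3
86.8
86.7
This is my code so far: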
Answer 0: (score: 1)
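The regex itself is a bit quirky. .*\d\s+A
means: "anything, then a digit, then at least one whitespace character, and finally the letter A". Because the leading .* is part of the match, the matched text is the whole line up to and including the A, not just the number. The pattern also covers more cases than you are interested in. For example, it matches a line that contains nothing but a reading, such as "94.9 A".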
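To make the difference concrete, here is a minimal sketch that runs the original pattern and a capture-group variant against one line from the question. The variable names ($line, $whole, $number) are only illustrative, and the second pattern is a simplified stand-in, not necessarily the one you should use on the real log.

```powershell
# Sample line taken from the log in the question
$line = '1/11/2016 3:26:12 PM 1/11/2016 3:27:00 PM 86.4 A'

# Original pattern: the greedy .* is part of the match,
# so Value contains everything from the start of the line to the A
$whole = [regex]::Match($line, '.*\d\s+A').Value

# With a capture group, Groups[1] holds only the text inside the parentheses
$number = [regex]::Match($line, '(\d+\.\d+)\s+A').Groups[1].Value

$whole    # the full line
$number   # 86.4
```

Groups[0] is always the entire match, which is why reading the match value directly gives back the line rather than the number.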
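Depending on the log file structure and how many false positives you see, a stricter pattern and/or a capture group helps. Something like this: (?:PM\s+)(\d+\.\d+)(?:\s+A)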
(?:PM\s+) := match the letters PM followed by at least one whitespace character
(\d+\.\d+) := capture at least one digit, followed by a dot and at least one more digit
(?:\s+A) := match at least one whitespace character followed by the letter A
For example,
[regex]$regex = '(?:PM\s+)(\d+\.\d+)(?:\s+A)'
$s = @("1/11/2016 3:26:12 PM 1/11/2016 3:27:00 PM 86.4 A",
"1/11/2016 3:26:12 PM 1/11/2016 3:28:00 PM 86.3 A",
"1/11/2016 3:26:12 PM 1/11/2016 3:29:00 PM 86.8 A",
"1/11/2016 3:26:12 PM 1/11/2016 3:29:16 PM 86.7 A",
"foobarline shouldn't match",
"94.9 A",
"PM 84.8 A")
# Note that the two invalid rows are skipped
$s | % { $regex.Matches($_) | % {$_.groups[1].value} }
86.4
86.3
86.8
86.7
84.8
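Since the question mentions a live log file, the same pattern could be applied to lines as they arrive. The following is only a sketch under assumptions: Get-Reading is a hypothetical helper name, the file path is a placeholder, and the non-capturing groups are dropped because they are not needed for the match.

```powershell
# Hypothetical helper: emits the reading from each input line that matches
function Get-Reading {
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$Line
    )
    process {
        # -cmatch is the case-sensitive match operator; on success it fills
        # the automatic $Matches table, where index 1 is the capture group
        if ($Line -cmatch 'PM\s+(\d+\.\d+)\s+A') {
            $Matches[1]
        }
    }
}

# Quick check with two in-memory lines; only the first one has the PM prefix
$readings = @('1/11/2016 3:26:12 PM 1/11/2016 3:27:00 PM 86.4 A', '94.9 A') | Get-Reading

# Against a live file (the path is a placeholder);
# -Wait keeps the pipeline open and passes on lines as they are appended
# Get-Content -Path 'C:\logs\sensor.log' -Wait | Get-Reading
```

The -cmatch operator is used here so the behavior stays case-sensitive like the [regex] object above; plain -match would also accept lowercase "pm" and "a".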