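I have a fairly common problem: a PowerShell regular expression to read multi-line records. I have read threads that ask similar questions, but I could not get any of them to work in my case.
My file contains multi-line records of variable length. The records I am interested in start with 01 or 02, followed by V or M. A record ends when another record starts, or when a batch record starting with '50' is found. The first three characters of each line identify the record.
i.e. 01V (start of record, content follows) 01
I am trying to read individual records with a regular expression by identifying the start and the end.
What I have so far is based on this answer: Match everything between two words in Powershell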
#Read the file as a single string
$FilePath = "blaablaablaa"
$TestFile = get-content $FilePath | Out-String
# (?<= Assert that this matches before the current position
# 0(1|2)(V|M) if the record is 01V or 01M or 02V or 02M
# ) End assertion
# .+? Match any number of characters, few as possible
# (?= Until a record starting with 70 is found
# ) End of look ahead
$regex = [regex] '(?is)(?<=0(1|2)(V|M)).+?(?=70)'
echo $TestFile | select-string -Pattern $regex
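The above works on a single-line string if I remove the pipe to Out-String; with the Out-String pipe it returns the whole file. My guess is that I am not handling the \n characters correctly.
Any suggestions? The input file looks roughly like this: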
00 date
01Mxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01 01 01 01=0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01=5xxxxxxxxxxxxxxxxxxxxxxxxxxx
01Mxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01 01 01 01=0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01=9xxxxxxxxxxxxxxxxxxxxxxxxxxx
50 xxxxxxxxxxxxx xxxxxxxxxxxxxxxxx
01Vxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$A xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$B 0xxxxxxxxxxxxxxxxxxxx
01$0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$5xxxxxxxxxxxxxxxxxxxxxxxxxxx
50 xxxxxxxxxxxx BatchTotal
90 xxxxxxxxxxxx FILETotal
The desired output is the file split into individual records, which are delimited by '50', by '90', or by the start of another record. For example, this is the final record:
01Vxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$A xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$B 0xxxxxxxxxxxxxxxxxxxx
01$0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$5xxxxxxxxxxxxxxxxxxxxxxxxxxx
Answer 0 (score: 1)
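Assuming (going by your description) that you also want to match the part from 01M up to the next 01M, and then the part up to 50, as separate matches, this should do it:
(?gmis)^0[12][VM](?:[^\n]|\n(?!0[12][VM]|50|90))+
Explanation: after matching 0, then 1 or 2, then V or M, the part inside (?:...) is:
[^\n]|\n(?!0[12][VM]|50|90)
which means:
match any character that is not a newline
or
a newline that is not followed (?!...) by a new record, by 50, or by 90
Online Regex101 demo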
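As a rough PowerShell sketch of the pattern above (not the answerer's own code): the `g` in `(?gmis)` is a regex101 flag rather than a .NET inline option, so it is dropped here and `[regex]::Matches` is used to collect every match. The sample text is shortened, hypothetical data joined with LF line endings; with CRLF files each matched line would keep a trailing `\r`, which this sketch does not handle.

```powershell
# Shortened, hypothetical sample data with explicit LF line endings.
$text = @(
    '00 date'
    '01Mxxxx'
    '01=5xxx'
    '01Mxxxx'
    '01=9xxx'
    '50 xxxx'
    '01Vxxxx'
    '01$1 xx'
    '50 xxxx BatchTotal'
    '90 xxxx FILETotal'
) -join "`n"

# (?m) lets ^ match at the start of every line.
# The group consumes any non-newline character, or a newline that is NOT
# followed by a new record header (0[12][VM]), a 50 line, or a 90 line.
$pattern = '(?mis)^0[12][VM](?:[^\n]|\n(?!0[12][VM]|50|90))+'

# @(...) keeps the result an array even when there is only one match.
$records = @([regex]::Matches($text, $pattern) | ForEach-Object { $_.Value })

$records.Count   # number of records found
$records[-1]     # the last record, including its continuation lines
```

With this data each 01M and 01V header starts its own match, and the 50 and 90 lines are left out of every record.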
Answer 1 (score: 0)
Using your test data:
@'
00 date
01Mxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01 01 01 01=0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01=5xxxxxxxxxxxxxxxxxxxxxxxxxxx
01Mxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01 01 01 01=0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01=9xxxxxxxxxxxxxxxxxxxxxxxxxxx
50 xxxxxxxxxxxxx xxxxxxxxxxxxxxxxx
01Vxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$A xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$B 0xxxxxxxxxxxxxxxxxxxx
01$0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$5xxxxxxxxxxxxxxxxxxxxxxxxxxx
50 xxxxxxxxxxxx BatchTotal
90 xxxxxxxxxxxx FILETotal
'@ | set-content testfile.txt
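# Read the whole file as one string (-Raw) instead of an array of lines
$Text = Get-Content ./testfile.txt -Raw
# Group 1 lazily captures from 01M/01V up to the line break before a 50 or 90 line
$regex = @'
(?ms)(01(?:M|V).+?)
(?:5|9)0.+?
'@
$Records =
[regex]::Matches($Text,$regex) |
foreach {$_.groups[1].value}
# Show the last record
$Records[-1]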
01Vxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$A xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$B 0xxxxxxxxxxxxxxxxxxxx
01$0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
01$5xxxxxxxxxxxxxxxxxxxxxxxxxxx
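
For comparison, a minimal sketch of the same pattern with the line break written as `\r?\n` instead of a literal newline inside a here-string, so the result should not depend on the script file's line endings. The data is shortened and hypothetical. Note that group 1 runs up to the next 50 or 90 line, so consecutive 01M records inside one batch come back as a single match here.

```powershell
# Shortened, hypothetical sample data with explicit LF line endings.
$text = @(
    '00 date'
    '01Mxxxx'
    '01=5xxx'
    '01Mxxxx'
    '01=9xxx'
    '50 xxxx'
    '01Vxxxx'
    '01$1 xx'
    '50 xxxx BatchTotal'
    '90 xxxx FILETotal'
) -join "`n"

# (?s) lets . cross line breaks; group 1 is lazy, so it stops at the first
# line break that is followed by a line starting with 50 or 90.
$pattern = '(?ms)(01(?:M|V).+?)\r?\n(?:5|9)0.+?'

# @(...) keeps the result an array even when there is only one match.
$records = @([regex]::Matches($text, $pattern) |
    ForEach-Object { $_.Groups[1].Value })

$records.Count   # number of batches found
$records[-1]     # the last batch (here: the single 01V record)
```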