Question

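I'm trying to extract certain lines of data from multiple text files by matching on specific strings. That part works: it matches the strings I have and pulls back the entire line, which is what I want. However, I also need a particular line that occurs before the match (and only when there is a match). I have that working too, but it isn't 100% correct.

I tried using the -Context parameter to pull the line above my match. It seems to work, but in some cases it merges data from multiple matches together instead of pulling the line above my match. Here is a sample of one of the files I'm searching: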

TRN*2*0000012016120500397~
STC*A3:0x9210019*20170103*U*18535********String of data here
STC*A3:0x810049*20170103*U*0********String of Data here
STC*A3:0x39393b5*20170103*U*0********String of data here
STC*A3:0x810048*20170103*U*0********String of data here
STC*A3:0x3938edc*20170103*U*0********String of data here
STC*A3:0x3938edd*20170103*U*0********String of data here
STC*A3:0x9210019*20170103*U*0********String of data here
TRN*2*0000012016120500874~
STC*A3:0x9210019*20170103*U*18535********String of data here
STC*A3:0x39393b5*20170103*U*0********String of data here
STC*A3:0x3938edc*20170103*U*0********String of data here
STC*A3:0x3938edd*20170103*U*0********String of data here
STC*A3:0x9210019*20170103*U*0********String of data here
TRN*2*0000012016120500128~
STC*A3:0x810049*20170103*U*0********String of Data here
STC*A3:0x39393b5*20170103*U*0********String of data here
STC*A3:0x810024*20170103*U*0********String of data here
STC*A3:0x9210019*20170103*U*0********String of data here
TRN*2*0000012016120500345~
STC*A3:0x9210019*20170103*U*18535********String of data here
STC*A3:0x810049*20170103*U*0********String of Data here
STC*A3:0x39393b5*20170103*U*0********String of data here
STC*A3:0x3938edc*20170103*U*0********String of data here
TRN*2*0000012016120500500~
STC*A3:0x810048*20170103*U*18535********String of data here
TRN*2*0000012016120500345~
STC*A3:0x810049*20170103*U*18535********String of data here

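I'm trying to pull just the TRN*2 line, and only when the lines beneath that TRN*2 include both STC*A3:0x810024 and STC*A3:0x810048, but the results are inconsistent.

Is there a way to search for the TRN*2 line and extract that TRN*2 line along with the lines beneath it that contain STC*A3:0x810024 and STC*A3:0x810048? If the lines beneath a TRN*2 line don't contain both STC*A3:0x810024 and STC*A3:0x810048, nothing should be extracted.

Here is my code so far: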

$FilePath = "C:\Data\2017"
$files = Get-ChildItem -Path $FilePath -Recurse -Include *.277CA_UNWRAPPED
foreach ($file in $files) {
  (Get-Content $file) | 
    Select-String -Pattern "STC*A3:0x810024","STC*A3:0x810048" -SimpleMatch -Context 1,0 |
    Out-File -Append -Width 512 $FilePath\Output\test_results.txt
}

Answer 1

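Your approach doesn't work because you're selecting lines that contain STC*A3:0x810024 or STC*A3:0x810048, plus the line preceding each match. The preceding line doesn't necessarily start with TRN, though. Even if it did, the statement would still produce TRN lines that are followed by either of the STC strings, not just the ones that are followed by both STC strings.

What you actually want is to split the file before each line starting with TRN, and then check whether each fragment contains both STC strings.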

(Get-Content $file | Out-String) -split '(?m)^(?=TRN\*2)' | Where-Object {
    $_.Contains('STC*A3:0x810024') -and
    $_.Contains('STC*A3:0x810048')
} | ForEach-Object {
    ($_ -split '\r?\n')[0]   # get just the 1st line from each fragment
} | Out-File -Append "$FilePath\Output\test_results.txt"

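(?m)^(?=TRN\*2) is a regular expression that matches the beginning of a line followed by the string "TRN*2". (?=...) is a so-called positive lookahead assertion; because it is zero-width, it ensures that "TRN*2" isn't removed when the string is split. (?m) is a modifier that makes ^ match the beginning of each line in a multiline string rather than only the beginning of the string.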
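To see the effect of the lookahead in isolation, here is a minimal sketch that runs against an in-memory string instead of a file. The sample records and the variable names ($text, $fragments, $result, $consumed) are made up for illustration; only the second record contains both STC codes.

```powershell
# Hypothetical sample data: three TRN*2 records, only the second one
# contains both STC*A3:0x810024 and STC*A3:0x810048.
$text = @'
TRN*2*0001~
STC*A3:0x810049*20170103*U*0
STC*A3:0x810048*20170103*U*0
TRN*2*0002~
STC*A3:0x810024*20170103*U*0
STC*A3:0x810048*20170103*U*0
TRN*2*0003~
STC*A3:0x810024*20170103*U*0
'@

# Split with the lookahead: the match is zero-width, so every fragment
# still begins with its TRN*2 line. Where-Object { $_ } drops the empty
# string that the split produces in front of the first record.
$fragments = $text -split '(?m)^(?=TRN\*2)' | Where-Object { $_ }

# Keep fragments that contain both codes and emit only their first line.
$result = $fragments | Where-Object {
    $_.Contains('STC*A3:0x810024') -and
    $_.Contains('STC*A3:0x810048')
} | ForEach-Object {
    ($_ -split '\r?\n')[0]
}
$result

# For comparison, splitting without the lookahead consumes the marker,
# so the fragments no longer start with "TRN*2".
$consumed = $text -split '(?m)^TRN\*2' | Where-Object { $_ }
$consumed[0]
```

With this sample, $result holds only the TRN line of the second record, while the first fragment in $consumed has lost its "TRN*2" prefix. If you also want the matching STC lines in the output, you could emit the fragment's lines filtered by those two strings instead of taking just element [0].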

Extract matches from a text file if subsequent lines contain specific strings
