PowerShell text search - multiple matches

Time: 2018-03-28 17:52:08

Tags: powershell

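I have a set of .txt files, each containing one or two of the following strings.

"red", "blue", "green", "orange", "purple", ... with many more (50+) possibilities in the list.

If it helps, I can tell whether a .txt file contains one item or two, but not which ones. The string patterns are always on their own lines.

I would like the script to tell me specifically which one or two strings (from the master list) it found, and the order in which it found them (which one came first).

Because I have a large number of text files to search, I would like to write the results to a CSV file as I search.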

FILENAME1,first_match,second_match

file1.txt,blue,red
file2.txt,red, blue
file3.txt,orange,
file4.txt,purple,red
file5.txt,purple,
...

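I tried using many individual Select-String calls that return Boolean results to set variables for any matches found, but with this many possible strings it gets unwieldy very quickly. My searches on this problem gave me no new ideas. (I'm sure I'm not asking the question the right way.)

Do I need to iterate over every line of text in every file?

Am I stuck with a process of elimination, checking for the presence of each search string?

I'm looking for a more elegant way to solve this problem (if one exists).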

2 answers:

Answer 0: (score: 3)

Not very intuitive, but elegant...

The following switch statement

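# One alternation with a capture group; add the remaining words from the master list
$regex = "(purple|blue|red)"

Get-ChildItem $env:TEMP\test\*.txt | ForEach-Object {
    # Start the output row with the file's full path
    $result = $_.FullName
    # switch -File reads the file line by line, so matches are appended in file order
    switch -Regex -File $_.FullName
    {
        $regex { $result = "$($result),$($Matches[1])" }
    }
    $result
}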

returns

C:\Users\Lieven Keersmaekers\AppData\Local\Temp\test\file1.txt,blue,red
C:\Users\Lieven Keersmaekers\AppData\Local\Temp\test\file2.txt,red,blue

where

  • file1 contains blue first, then red
  • file2 contains red first, then blue
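The question asks for a CSV with first_match and second_match columns, and notes that each string sits on its own line. A minimal sketch of how the switch approach above might be extended to do that follows. The word list, the demo directory name, the sample files, and the results.csv name are all assumptions made for illustration. The regex is anchored with ^ and $ so that only whole-line matches count.

```powershell
# Hypothetical master list; the real one would hold all 50+ strings.
$words = 'red', 'blue', 'green', 'orange', 'purple'

# Build one anchored alternation, ^(red|blue|...)$, so only whole lines match.
$regex = '^(' + (($words | ForEach-Object { [regex]::Escape($_) }) -join '|') + ')$'

# Hypothetical sample data so the sketch runs on its own.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'color-search-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
Set-Content -Path (Join-Path $dir 'file1.txt') -Value 'header', 'blue', 'noise', 'red'
Set-Content -Path (Join-Path $dir 'file3.txt') -Value 'orange', 'footer'

$rows = Get-ChildItem -Path $dir -Filter *.txt | Sort-Object Name | ForEach-Object {
    $file  = $_              # keep the file object; switch reuses $_ for each line
    $path  = $file.FullName
    $found = @()
    # switch -File reads line by line, so matches are collected in file order.
    switch -Regex -File $path {
        $regex { $found += $Matches[1] }
    }
    [pscustomobject]@{
        FileName     = $file.Name
        first_match  = $found[0]   # $null (empty CSV cell) when nothing matched
        second_match = $found[1]   # $null when there was only one match
    }
}

$rows | Export-Csv -Path (Join-Path $dir 'results.csv') -NoTypeInformation -Encoding UTF8
```

Collecting the matches in an array before building the row keeps the column count fixed, so files with a single match still produce a well-formed CSV line.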

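Answer 1: (score: 1)

You can use a regex search to get the index (the start position within the line) and combine it with Select-String, which returns the line number, and you're good to go.

Select-String accepts an array as the value for -Pattern, but unfortunately it stops on a line after the first match, even if you use -AllMatches (bug?). So we have to search once per word/pattern. Try: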

# List of words. They are escaped because Select-String doesn't return Match objects (with Index/location) for -SimpleMatch
$words = "purple","blue","red" | ForEach-Object { [regex]::Escape($_) }
# Can also read one word/sentence per line from a file: $words = Get-Content patterns.txt | ForEach-Object { [regex]::Escape($_.Trim()) }

#Get all files to search
Get-ChildItem -Filter "test.txt" -Recurse | Foreach-Object { 
    # Loop over the words because Select-String -Pattern "blue","red" won't return a match for both patterns; it stops on a line after the first match
    foreach ($word in $words) {
        $_ | Select-String -Pattern $word |
        #Select the properties we care about
        Select-Object Path, Line, Pattern, LineNumber, @{n="Index";e={$_.Matches[0].Index}}
    }
} |
#Sort by File (to keep file-matches together), then LineNumber and Index to get the order of matches
Sort-Object Path, LineNumber, Index |
Export-Csv -NoTypeInformation -Path Results.csv -Encoding UTF8

Results.csv

"Path","Line","Pattern","LineNumber","Index"
"C:\Users\frode\Downloads\test.txt","file1.txt,blue,red","blue","3","10"
"C:\Users\frode\Downloads\test.txt","file1.txt,blue,red","red","3","15"
"C:\Users\frode\Downloads\test.txt","file2.txt,red, blue","red","4","10"
"C:\Users\frode\Downloads\test.txt","file2.txt,red, blue","blue","4","15"
"C:\Users\frode\Downloads\test.txt","file4.txt,purple,red","purple","6","10"
"C:\Users\frode\Downloads\test.txt","file4.txt,purple,red","red","6","17"
"C:\Users\frode\Downloads\test.txt","file5.txt,purple,","purple","7","10"
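
As a possible alternative to looping over each word, the escaped words can be joined into a single alternation pattern. With only one pattern, -AllMatches reports every match on a line through the Matches property. The sketch below assumes a hypothetical sample file (allmatches-demo.txt) whose lines mirror the ones in the output above. It is an illustration, not part of the original answer.

```powershell
# Hypothetical word list, escaped so each entry is treated literally.
$words = 'purple', 'blue', 'red' | ForEach-Object { [regex]::Escape($_) }
$pattern = $words -join '|'      # purple|blue|red

# Hypothetical sample file so the sketch runs on its own.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'allmatches-demo.txt'
Set-Content -Path $path -Value 'file1.txt,blue,red', 'file3.txt,orange,', 'file2.txt,red, blue'

$hits = Select-String -Path $path -Pattern $pattern -AllMatches | ForEach-Object {
    $info = $_
    # Matches holds every hit on the line, ordered by position within the line.
    foreach ($m in $info.Matches) {
        [pscustomobject]@{
            Path       = $info.Path
            Line       = $info.Line
            Match      = $m.Value
            LineNumber = $info.LineNumber
            Index      = $m.Index
        }
    }
}

$hits
```

Because lines are read in order and the matches on each line are already ordered by position, no extra sort is needed for a single file. When several files are searched, sorting by Path, LineNumber and Index as in the answer keeps each file's rows together.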