Using PowerShell to quickly search files for a regex and output to CSV

Time: 2016-10-11 17:47:31

Tags: regex powershell

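My goal is to recursively search a directory for all files that contain matches for a regular expression, then write the results to a CSV with one column holding the exact match and another column showing the file it was found in. Thanks to user woxxom, I have started using IO.File, since it is apparently much faster than Select-String.

This is a project I have been working on for a long time. I was able to get it done with Select-String and Export-Csv, but that is a rather slow process.

Any ideas on what my new attempt is missing?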

    $ResultsCSV = "C:\TEMP\Results.csv"
    $Directory = "C:\TEMP\examples"
    $RX = "(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?:\.|dot|\[dot\]|\[\.\])){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
    $TextFiles = Get-ChildItem $Directory -Include *.txt*,*.csv*,*.rtf*,*.eml*,*.msg*,*.dat*,*.ini*,*.mht* -Recurse
    $out = [Text.StringBuilder]

    foreach ($FileSearched in $TextFiles) {
        $text = [IO.File]::ReadAllText($FileSearched)
        foreach ($match in ([regex]$RX).Matches($text)) {
            if (!(Test-Path $ResultsCSV)) {
                'Matches,File Path' | Out-File $ResultsCSV -Encoding ASCII
                $out.AppendLine('' + $match.value + ',' + $FileSearched.fullname)
                $match.value | Out-File $ResultsCSV -Encoding ascii -Append
                $FileSearched.Fullname | Out-File $ResultsCSV -Encoding ascii -Append
                $out.ToString() | Out-File $ResultsCSV -Encoding ascii -Append -NoNewline
            }
        }
    }

1 answer:

Answer 0 (score: 4)

You can improve performance by using streams for both reading and writing:

    $ResultsCSV = "C:\TEMP\Results.csv"
    $Directory = "C:\TEMP\examples"
    $RX = "(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?:\.|dot|\[dot\]|\[\.\])){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

    # -File (PowerShell 3.0+) skips directories whose names happen to match a pattern
    $TextFiles = Get-ChildItem $Directory -Include *.txt*,*.csv*,*.rtf*,*.eml*,*.msg*,*.dat*,*.ini*,*.mht* -Recurse -File

    $regex = [regex]$RX    # build the regex once instead of once per line
    $file2 = New-Object System.IO.StreamWriter($ResultsCSV)    # output stream
    $file2.WriteLine('Matches,File Path')    # write header

    try {
        foreach ($FileSearched in $TextFiles) {    # loop over files in folder
            # replaces: $text = [IO.File]::ReadAllText($FileSearched)
            $file = New-Object System.IO.StreamReader($FileSearched.FullName)    # input stream
            try {
                # compare against $null so that an empty line does not end the loop early
                while ($null -ne ($text = $file.ReadLine())) {    # read line by line
                    foreach ($match in $regex.Matches($text)) {
                        # write line to output stream
                        $file2.WriteLine("{0},{1}", $match.Value, $FileSearched.FullName)
                    } # foreach $match
                } # while $file
            }
            finally { $file.Close() }    # close the input stream even if reading fails
        } # foreach $FileSearched
    }
    finally { $file2.Close() }    # flush and close the output stream
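
For comparison, the `ReadAllText` approach from the question can probably be made to work as well. The following is only a minimal sketch under stated assumptions, not the asker's or the answerer's code: the function name `Export-RegexMatches` and its parameters are hypothetical, it searches only `*.txt` files to stay short, and it quotes both CSV fields on the assumption that neither contains a double quote. The main change is that `New-Object` creates a `StringBuilder` instance (`[Text.StringBuilder]` on its own is just the type), and the CSV is written once at the end instead of once per match.

```powershell
# Hypothetical helper (not from the original post): collects all matches in
# memory with a StringBuilder and writes the CSV a single time at the end.
function Export-RegexMatches {
    param(
        [string]$Directory,    # folder to search recursively
        [string]$Pattern,      # regular expression to look for
        [string]$ResultsCsv    # full path of the CSV file to create
    )

    $regex = [regex]$Pattern

    # New-Object returns an instance; [Text.StringBuilder] alone is only the type.
    $out = New-Object System.Text.StringBuilder
    [void]$out.AppendLine('Matches,File Path')

    # Assumption: only *.txt files are searched, to keep the sketch short.
    $files = Get-ChildItem -Path $Directory -Filter *.txt -Recurse -File
    foreach ($file in $files) {
        # Read the whole file at once, as in the question.
        $text = [IO.File]::ReadAllText($file.FullName)
        foreach ($match in $regex.Matches($text)) {
            # Quote both fields so a comma in a path does not split the column.
            [void]$out.AppendLine(('"{0}","{1}"' -f $match.Value, $file.FullName))
        }
    }

    # One write for the whole result set.
    [IO.File]::WriteAllText($ResultsCsv, $out.ToString())
}
```

With the variables from the question, it could be called as `Export-RegexMatches -Directory $Directory -Pattern $RX -ResultsCsv $ResultsCSV`. Reading whole files keeps the code short but holds each file in memory, so the line-by-line stream version above is likely the safer choice for very large files.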