Using PowerShell and regex to find and validate path strings in text files

Date: 2014-01-28 20:23:49

Tags: regex powershell

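This is my first post here. I will try to be clear and detailed, but please be gentle if I missed an existing answer while searching these boards.

First, the questions: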

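  1. How can I exclude regex matches that contain a specific keyword ("fastcopy")?
  2. How can I include path results that do not end in a filename or wildcard?

I am working with a set of text files that closely resemble batch files. They are plain text and contain header lines, lines with paths to files on a server, and comment lines. Comment lines begin with a semicolon (;), so they are easy to exclude. The paths should all start with the variable %INSTDIR%, but they may or may not be wrapped in quotes, and they may or may not be followed by execution options. One last note: the company uses FastCopy.exe to pull files and folders down from the network, and on such lines I want to return the folder or file being copied, not the path that contains fastcopy.exe.

Here is a sample (a larger one, to show the potential problems):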

    [Installing .NET 3.5 Hotfix KB943326 for App1]
    ; *** Added NET 3.5 SP1 hotfix KB943326: resolves App1 hidden menus force laptop re-booting
    1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\.NET_3.5_Hotfix_KB943326\WindowsXP-KB943326-x86-ENU.exe /quiet /norestart
    
    [Installing Agent 5.3.1]
    1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\AGenT_531_2.0\w7wxp_ze_20\install.exe
    
    [Installing APR Manager 2.1]
    1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\APRManager_21_Updated_2.0\wviwxp_ze_20\install.exe
    
    [Installing Scope Simulator]
    1 = MD "C:\Temp\scope_simulator_10"
    2 =  start /wait /high %INSTDIR%\ToolShare$\Site_Toolbox\Custom_Scripts\Source\fastcopy.exe /auto_close /no_confirm_del /no_confirm_stop /log=FALSE /open_window /force_start /force_close /stream=FALSE /cmd=diff "%INSTDIR%\ToolShare$\Sample_Toolbox\applications\scope_simulator_10" /to="C:\Temp\scope_simulator_10"
    3 = "C:\Temp\scope_simulator_10\w7wxp_ze_10\Install.exe"
    4 = RD "C:\temp\scope_simulator_10" /q /s
    
    [Installing Log Analyzer Offline 2.6.1]
    1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\Log_Analyzer_Offline_261\wxp_ze_10\install.exe
    
    [Installing Data Migration Script]
    1 = MD "C:\Temp\Data Migration"
    2 = xcopy "%INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\*.*" "C:\Temp\Data Migration" /y /e
    3 = xcopy "%INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\Data Migration.lnk" C:\DOCUME~1\ALLUSE~1\Desktop\ /Y
    

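I have it set up to pull 'dir \\UNCPath\*.ini' and then loop over the results with a ForEach($INI in $Results) block. This is the line I have been using inside the loop to try to pull the path out of each line: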

    gc $ini|?{!($_ -match "^;") -and ($_ -match "%INST[^`"]*?\\.*(\.\w{3}|\.\*)(?=`"|\s|\Z)")}|%{$TestPath = $Matches[0].replace("%INSTDIR%","\\ServerName1");if(test-path $testpath){write-host "  [OK]    " -foregroundcolor Green -NoNewline}else{write-host "[Missing] " -ForegroundColor red -NoNewline};write-host "$testpath"}
    

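This gets me almost everything I want. What it does not do is pick up anything that does not end in .* or a standard three-character extension (.exe, .cmd, .jar, etc.). It also picks up the fastcopy path rather than the path that is being copied.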
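
Both symptoms can be reproduced by running the pattern against single lines in isolation. The sketch below assumes the same regex as the one-liner above, rewritten as a single-quoted string so the embedded double quotes need no backtick escaping. The two test strings are taken from the sample, with the FastCopy line shortened.

```powershell
# Same pattern as the one-liner, in single quotes so " needs no escaping.
$pattern = '%INST[^"]*?\\.*(\.\w{3}|\.\*)(?="|\s|\Z)'

# A bare folder path: there is no ".xxx" or ".*" suffix anywhere,
# so the mandatory extension group can never be satisfied.
$folderLine = '"%INSTDIR%\ToolShare$\Sample_Toolbox\applications\scope_simulator_10"'
$folderMatched = $folderLine -match $pattern

# The FastCopy line (shortened): the match starts at the first %INST and
# the only extension on the line belongs to fastcopy.exe.
$fastcopyLine = 'start /wait /high %INSTDIR%\ToolShare$\Site_Toolbox\Custom_Scripts\Source\fastcopy.exe /cmd=diff "%INSTDIR%\ToolShare$\Sample_Toolbox\applications\scope_simulator_10" /to="C:\Temp\scope_simulator_10"'
$fastcopyMatched = $fastcopyLine -match $pattern
$fastcopyResult = $Matches[0]

"Folder line matched:   $folderMatched"
"FastCopy line matched: $fastcopyMatched"
"FastCopy line result:  $fastcopyResult"
```

The folder line fails because the pattern requires an extension or a .* suffix, and the FastCopy line returns the executable because a regex match always starts at the leftmost position that can succeed.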

The results I want:

    %INSTDIR%\ToolShare$\Sample_Toolbox\applications\.NET_3.5_Hotfix_KB943326\WindowsXP-KB943326-x86-ENU.exe
    %INSTDIR%\ToolShare$\Sample_Toolbox\applications\AGenT_531_2.0\w7wxp_ze_20\install.exe
    %INSTDIR%\ToolShare$\Sample_Toolbox\applications\APRManager_21_Updated_2.0\wviwxp_ze_20\install.exe
    %INSTDIR%\ToolShare$\Sample_Toolbox\applications\scope_simulator_10
    %INSTDIR%\ToolShare$\Sample_Toolbox\applications\Log_Analyzer_Offline_261\wxp_ze_10\install.exe
    %INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\*.*
    %INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\Data Migration.lnk
    

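I am not getting the folder result for scope_simulator_10. Instead I get the FastCopy path, and even if I strip FastCopy from the line so that only the desired path remains, it still is not returned. Any suggestions are welcome.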

1 Answer:

Answer 0 (score: 1)

The following script should work.

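# Collect the %INSTDIR% path from every non-comment line of the file.
$paths = Get-Content $ini | ForEach-Object {
    # "" inside a double-quoted string is a literal double quote.
    if ($_ -match "^(?=[^;]).*?(?<delimiter>[""' ])(?<path>%INSTDIR%(?!.*?fastcopy.exe).*?)(?:\1|$)")
    {
        # Emit only the named "path" group, without the delimiter.
        Write-Output $Matches["path"]
    }
}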

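The $paths variable will now contain all of the requested paths. Note that if any path contains the literal string "fastcopy.exe" anywhere in it, this regex will not find that path.

An attempt at explaining the regex: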
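
The extracted paths still need the validation step from the question. A minimal sketch of that step follows, assuming a hypothetical helper named Test-InstallPath (not part of the original answer) that swaps %INSTDIR% for a real root and checks the result with Test-Path. In the demo a temporary directory stands in for \\ServerName1 so the block runs anywhere.

```powershell
# Hypothetical helper: replace %INSTDIR% with a real root and test the result.
function Test-InstallPath {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Root
    )
    # String.Replace is a literal, case-sensitive replacement (no regex),
    # so the % and \ characters need no escaping.
    $resolved = $Path.Replace('%INSTDIR%', $Root)
    [pscustomobject]@{
        Path   = $resolved
        Exists = (Test-Path -Path $resolved)   # -Path accepts wildcards
    }
}

# Demo: the temp directory is only a stand-in for the server share.
$root = [System.IO.Path]::GetTempPath()
$results = '%INSTDIR%', '%INSTDIR%\no_such_folder_for_demo' |
    ForEach-Object { Test-InstallPath -Path $_ -Root $root }
$results
```

With real data the call would look like `$paths | ForEach-Object { Test-InstallPath -Path $_ -Root '\\ServerName1' }`, and the Exists property can drive the [OK] / [Missing] colouring from the question.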

^ - match the start of the line
(?=[^;]) - positive lookahead verifying that the line does not start with a semicolon
.*? - any character, as few as possible (to remove all characters before the path we want to match)
(?<delimiter>["' ]) - named group capturing the character immediately before the path: a space, a quotation character or an apostrophe.
(?<path> - start a named capturing group for capturing the "path"
    %INSTDIR% - matches the literal string '%INSTDIR%'
    (?!.*?fastcopy.exe) - negative lookahead verifying that the part of the line we're trying to match (which has started with %INSTDIR%) doesn't contain the word fastcopy.exe anywhere later in the string. On the fastcopy line the first %INSTDIR% is therefore rejected, while the second one is accepted because the rest of the line no longer contains the fastcopy.exe literal string. Strictly, the unescaped . matches any character; fastcopy\.exe would match only a literal dot.
    .*? - matches any character, as few as possible, to make sure that we stop as soon as we find a matching delimiter character below
) - ends the named capturing group "path"
(?:\1|$) - matches (in a non-capturing group) the character found by the delimiter group above (to match a quotation character, apostrophe or space, depending on what character was immediately before the %INSTDIR% literal string), or the end of the line. \1 refers to the delimiter group because it is the first capturing group in the pattern.
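
To check this breakdown against the sample, here is a small sketch that runs the pattern over a few lines copied from the question. It is written as a single-quoted string, escapes the dot in fastcopy\.exe, and uses the named backreference \k<delimiter> in place of \1. Those three changes are readability tweaks made for this illustration, not part of the original answer.

```powershell
# '' inside a single-quoted string is a literal apostrophe.
$pattern = '^(?=[^;]).*?(?<delimiter>["'' ])(?<path>%INSTDIR%(?!.*?fastcopy\.exe).*?)(?:\k<delimiter>|$)'

$lines = @(
    '; *** comment mentioning %INSTDIR%\ignored.exe'
    '1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\AGenT_531_2.0\w7wxp_ze_20\install.exe'
    '2 =  start /wait /high %INSTDIR%\ToolShare$\Site_Toolbox\Custom_Scripts\Source\fastcopy.exe /cmd=diff "%INSTDIR%\ToolShare$\Sample_Toolbox\applications\scope_simulator_10" /to="C:\Temp\scope_simulator_10"'
    '2 = xcopy "%INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\*.*" "C:\Temp\Data Migration" /y /e'
    '1 = MD "C:\Temp\Data Migration"'
)

# Keep the "path" group of every line that matches.
$found = foreach ($line in $lines) {
    if ($line -match $pattern) { $Matches['path'] }
}
$found
```

The comment line and the MD line produce nothing, the unquoted path runs to the end of the line, and the two quoted paths stop at their closing quote, so the folder path and the path containing spaces both come back whole.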

If anything is unclear, please add a comment below asking for clarification.