Finding text in a PDF file

Time: 2019-06-11 19:04:05

Tags: powershell itext

Add-Type -Path "C:\Users\barm\.nuget\packages\itextsharp\5.5.10\lib\itextsharp.dll"
$source = 'C:\test'
$destination = 'C:\test2'
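$keyword1 = "K0211"
# keywords to search for; the keyword loop further down iterates over this array
$keywords = @($keyword1)
# collects one object per keyword hit
$results = @()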

$pdfs = Get-ChildItem -Path $source | Where-Object {$_.Name -like '*.pdf'} 

foreach($pdf in $pdfs) {

    Write-Host "processing -" $pdf.FullName
    $path = $pdf.FullName
    
    # prepare the pdf
    $reader = New-Object iTextSharp.text.pdf.PdfReader -ArgumentList $pdf.FullName
    # for each page
    for($page = 1; $page -le $reader.NumberOfPages; $page++) {

        # set the page text
        $pageText = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader,$page).Split([char]0x000A)

        # if the page text contains any of the keywords we're evaluating
        foreach($keyword in $keywords) {
            if($pageText -match $keyword) {
                $response = @{
                    keyword = $keyword
                    file = $pdf.FullName
                    page = $page
                }
                $results += New-Object PSObject -Property $response
            }
        }
    }
    $reader.Close()
}

# show the collected matches
$results

I want to find the value of the variable $keyword1 (K0211) in PDF files using PowerShell and iTextSharp. I have tried several scripts, but none of them work. Can anyone help me?
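
For reference, the keyword check itself can be tried without iTextSharp. The following is a minimal sketch, assuming the page text has already been split into lines; `Find-KeywordInLines`, its parameters, and the sample lines are hypothetical names and data made up for illustration, not part of iTextSharp. It escapes each keyword because `-match` treats its right-hand side as a regular expression, and it relies on `-match` returning the matching elements when the left-hand side is an array.

```powershell
# Hypothetical helper (not part of iTextSharp): emits one object per
# keyword that occurs in any of the given text lines.
function Find-KeywordInLines {
    param(
        [string[]] $Lines,      # text of one page, already split into lines
        [string[]] $Keywords,   # literal strings to look for
        [string]   $File,       # label copied into each result
        [int]      $Page        # page number copied into each result
    )
    foreach ($keyword in $Keywords) {
        # Escape the keyword so characters such as '.' or '(' match literally
        $pattern = [regex]::Escape($keyword)
        # With an array on the left, -match returns the matching lines;
        # an empty result evaluates to $false
        if ($Lines -match $pattern) {
            [pscustomobject]@{
                Keyword = $keyword
                File    = $File
                Page    = $Page
            }
        }
    }
}

# Made-up sample lines standing in for extracted page text
$sampleLines = @('Invoice 42', 'Part number K0211 shipped', 'Total: 10.00')
$hits = @(Find-KeywordInLines -Lines $sampleLines -Keywords @('K0211', 'X9999') -File 'sample.pdf' -Page 1)
$hits | Format-Table -AutoSize
```

Note that `-match` is case-insensitive by default; `-cmatch` could be used instead if the keyword has to match in exact case.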

0 answers:

No answers