
I have been using iTextSharp to search PDFs, but I have run into a problem. The PDF in question contains the following text:

This is a test line 1
This is a test line 2

I am trying to search each PDF for a phrase; in this example the phrase is 'test line'. The code I am using is:

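# $reader is assumed to be an iTextSharp.text.pdf.PdfReader opened earlier
for ($page = 1; $page -le $reader.NumberOfPages; $page++)
{
    # Extract the text of the current page with a fresh strategy object
    $strategy = New-Object iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy
    $currentText = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader, $page, $strategy)
    # $currentText is then searched for the phrase
}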

However, this does not find the phrase 'test line'. When I look at the text iTextSharp extracts from each page, it looks like this:

- This    is      a   test    line    1

- This    is      a   test    line    2

Is there a way to extract the text from a PDF without the formatting?

Many thanks,
Paul
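One possible workaround for the problem described above, offered as a minimal sketch rather than a confirmed fix: collapse the runs of whitespace in the extracted text before searching it. The sample string below is copied from the output shown in the question and stands in for the value returned by GetTextFromPage. The variable names `$phrase`, `$normalized` and `$found` are hypothetical, and the sketch assumes the gaps between words are ordinary whitespace characters.

```powershell
# Stand-in for the text returned by GetTextFromPage (copied from the output in the question)
$currentText = "This    is      a   test    line    1`nThis    is      a   test    line    2"

# Phrase to look for (hypothetical variable name)
$phrase = 'test line'

# Collapse every run of whitespace (spaces, tabs, newlines) into a single space
$normalized = ($currentText -replace '\s+', ' ').Trim()

# Escape the phrase so it is matched literally; -match is case-insensitive by default
$found = $normalized -match [regex]::Escape($phrase)

if ($found) {
    Write-Output "Found '$phrase'"
}
```

Inside the page loop, the same two lines would be applied to `$currentText` for each page. If the gaps turn out to be something other than whitespace, the pattern `\s+` would need adjusting.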

Searching PDFs with iTextSharp

0 answers: