Question

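I am developing a C# WinForms application that converts PDF content to text. All of the required content is extracted except for the text inside highlighted regions of the PDF. Please help me with a working sample that extracts the highlighted text from a PDF. I am using iTextSharp.dll in the project.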

Answer 1

Assuming you are talking about comments (annotations), try this:

for (int i = pageFrom; i <= pageTo; i++) {
    PdfDictionary page = reader.GetPageN(i);
    PdfArray annots = page.GetAsArray(iTextSharp.text.pdf.PdfName.ANNOTS);
    if (annots != null) {
        foreach (PdfObject annot in annots.ArrayList) {
            // Resolve a possible indirect reference to the annotation dictionary
            PdfDictionary annotation = (PdfDictionary)PdfReader.GetPdfObject(annot);
            // /Contents is optional, so this may be null
            PdfString contents = annotation.GetAsString(PdfName.CONTENTS);
            // now use the String value of contents
        }
    }
}

This was written from memory (I'm a Java developer, not a C# developer).
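
Note that /Contents holds the comment text attached to an annotation, not the page text underneath a highlight. If the goal is the highlighted text itself, one possible approach is to keep only annotations whose /Subtype is /Highlight and then extract the page text that falls inside each annotation's /Rect. The following is a minimal, untested sketch assuming iTextSharp 5.x; the class name HighlightSketch and the method ReadHighlightedText are hypothetical names used for illustration only.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper, not part of iTextSharp.
static class HighlightSketch
{
    public static List<string> ReadHighlightedText(string path)
    {
        var results = new List<string>();
        var reader = new PdfReader(path);
        try
        {
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                PdfArray annots = reader.GetPageN(i).GetAsArray(PdfName.ANNOTS);
                if (annots == null) continue;

                for (int j = 0; j < annots.Size; j++)
                {
                    // GetAsDict resolves indirect references; null if not a dictionary
                    PdfDictionary annotation = annots.GetAsDict(j);
                    if (annotation == null) continue;

                    // Skip everything that is not a highlight annotation
                    if (!PdfName.HIGHLIGHT.Equals(annotation.GetAsName(PdfName.SUBTYPE)))
                        continue;

                    PdfArray rect = annotation.GetAsArray(PdfName.RECT);
                    if (rect == null || rect.Size < 4) continue;

                    float x1 = rect.GetAsNumber(0).FloatValue;
                    float y1 = rect.GetAsNumber(1).FloatValue;
                    float x2 = rect.GetAsNumber(2).FloatValue;
                    float y2 = rect.GetAsNumber(3).FloatValue;

                    // Normalize the corners in case the rectangle is stored reversed
                    var area = new iTextSharp.text.Rectangle(
                        Math.Min(x1, x2), Math.Min(y1, y2),
                        Math.Max(x1, x2), Math.Max(y1, y2));

                    // Only text rendered inside the annotation rectangle is kept
                    var filter = new RegionTextRenderFilter(area);
                    var strategy = new FilteredTextRenderListener(
                        new LocationTextExtractionStrategy(), filter);

                    results.Add(PdfTextExtractor.GetTextFromPage(reader, i, strategy));
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return results;
    }
}
```

/Rect is only the bounding box of the annotation, so for a highlight that spans several lines this may also pick up neighboring text that is not actually highlighted. The /QuadPoints entry describes the highlighted regions more precisely and could be used in place of /Rect if that matters.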
