Find and replace a string in a PDF

Asked: 2019-01-30 18:26:54

Tags: c# .net pdf pdfsharp

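I am looking for a way to replace text in a PDF from C#. The use case is that we have a client who needs to sign a PDF, and we want to pre-fill some fields before they download it: date, name, title, and so on. I have found a few possible options, such as PDFsharp, but I can't find a way to search based on text.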

The resources I have found so far are:

Find a word in PDF using PDFSharp

https://forum.pdfsharp.net/viewtopic.php?p=4010

However, I have not been able to get them to work for my use case. Any help would be appreciated.

Update: here is the boilerplate code I have been trying to use for the search and replace:

// C# string literals need double quotes; single quotes denote a single char.
String toFind = "client-title";
String toReplace = "John Doe";
PdfSharp.Pdf.PdfDocument PDFDoc = PdfReader.Open("path/to/original/file.pdf", PdfDocumentOpenMode.Import);
PdfSharp.Pdf.PdfDocument PDFNewDoc = new PdfSharp.Pdf.PdfDocument();

for(int i = 0; i < PDFDoc.Pages.Count; i++)
{
    // Find toFind string and replace with toReplace string

    PDFNewDoc.AddPage(PDFDoc.Pages[i]);
}
PDFNewDoc.Save("path/to/new/file.pdf");

1 answer:

Answer 0 (score: -1)

My example below simply replaces "Hello" with "Hola".

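using System.Diagnostics;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Content;
using PdfSharp.Pdf.Content.Objects;
using PdfSharp.Pdf.IO;

class Program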
    {
        static void Main(string[] args)
        {
            string originalPdf = @"C:\origPdf.pdf";

            CreatePdf(originalPdf);

            using (var doc = PdfReader.Open(originalPdf, PdfDocumentOpenMode.Modify))
            {
                var page = doc.Pages[0];
                var contents = ContentReader.ReadContent(page);

                ReplaceText(contents, "Hello", "Hola");
                page.Contents.ReplaceContent(contents);

                doc.Pages.Remove(page);
                doc.AddPage().Contents.ReplaceContent(contents);
               
                doc.Save(originalPdf);
            }

            Process.Start(originalPdf);

        }

        // Code from http://www.pdfsharp.net/wiki/HelloWorld-sample.ashx
        public static void CreatePdf(string filename)
        {
            // Create a new PDF document
            PdfDocument document = new PdfDocument();
            document.Info.Title = "Created with PDFsharp";

            // Create an empty page
            PdfPage page = document.AddPage();

            // Get an XGraphics object for drawing
            XGraphics gfx = XGraphics.FromPdfPage(page);

            // Create a font
            XFont font = new XFont("Verdana", 20, XFontStyle.BoldItalic, new XPdfFontOptions(PdfFontEncoding.WinAnsi));

            // Draw the text
            gfx.DrawString("Hello, World!", font, XBrushes.Black,
              new XRect(0, 0, page.Width, page.Height),
              XStringFormats.Center);

            // Save the document (the caller opens it in a viewer later)
            document.Save(filename);
        }

        // Refer to the PDF specification for everything a content stream can contain:
        // https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
        public static void ReplaceText(CSequence contents, string searchText, string replaceText)
        {
            // Iterate through each content item. An item may not contain the entire
            // word if different styling (e.g. bold on part of the word) is applied to it,
            // so you may have to replace one character at a time.
            for (int i = 0; i < contents.Count; i++)
            {
                if (contents[i] is COperator)
                {
                    var cOp = contents[i] as COperator;
                    for (int j = 0; j < cOp.Operands.Count; j++)
                    {
                        if (cOp.OpCode.Name == OpCodeName.Tj.ToString() ||
                            cOp.OpCode.Name == OpCodeName.TJ.ToString())
                        {
                            if (cOp.Operands[j] is CString)
                            {
                                var cString = cOp.Operands[j] as CString;
                                if (cString.Value.Contains(searchText))
                                {
                                    cString.Value = cString.Value.Replace(searchText, replaceText);
                                }

                            }
                        }
                    }


                }
            }


        }
    }
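
The comment in ReplaceText warns that a word may be split across several content items. Here is a minimal sketch of why that matters. It uses plain strings as stand-ins for the CString operands, so it does not touch PDFsharp at all. The class FragmentSearch and both of its methods are hypothetical names made up for this illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of PDFsharp): plain strings stand in for the
// CString operands of consecutive text-showing operators.
public static class FragmentSearch
{
    // Per-fragment replacement: the same strategy ReplaceText applies to each CString.
    public static List<string> ReplacePerFragment(
        IEnumerable<string> fragments, string search, string replacement)
    {
        return fragments.Select(f => f.Replace(search, replacement)).ToList();
    }

    // True when no single fragment contains the search text,
    // but the fragments joined together do.
    public static bool SpansFragments(IReadOnlyList<string> fragments, string search)
    {
        bool inSingleFragment = fragments.Any(f => f.Contains(search));
        bool inJoinedText = string.Concat(fragments).Contains(search);
        return !inSingleFragment && inJoinedText;
    }
}
```

If SpansFragments returns true for the strings you read from a page, a per-operand replace like the one above will silently leave the text unchanged, and you would need to rewrite the affected operands together instead. Placeholders such as "client-title" are more likely to survive as one string when they are typed in a single font and style, though that depends on the tool that produced the PDF.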