Question

I took the code from "How to redact a large rectangle of a PDF by iTextSharp?" and came up with:

    iTextSharp.text.pdf.PdfReader reader;
    reader = new iTextSharp.text.pdf.PdfReader(new System.IO.FileStream(txtPDFFile.Text, System.IO.FileMode.Open));
    string path = System.IO.Path.GetDirectoryName(txtPDFFile.Text);
    System.IO.Stream fsOut = new System.IO.FileStream(System.IO.Path.Combine(path,"redacted.pdf"), System.IO.FileMode.OpenOrCreate);
    iTextSharp.text.pdf.PdfStamper stamper = new iTextSharp.text.pdf.PdfStamper(reader, fsOut);
    List<iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpLocation> cleanUpLocations = new List<iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpLocation>();
    cleanUpLocations.Add(new iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpLocation(1, new iTextSharp.text.Rectangle(77f, 77f, 200f, 200f), iTextSharp.text.BaseColor.GRAY));
    iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor cleaner = new iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor(cleanUpLocations, stamper);
    cleaner.CleanUp();
    stamper.Close();
    reader.Close();

So, apart from the different input file I chose, this should be what the linked article uses.

But at cleaner.CleanUp() I get an object reference not found error:

   at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpContentOperator.Invoke(PdfContentStreamProcessor pdfContentStreamProcessor, PdfLiteral oper, List`1 operands)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.InvokeOperator(PdfLiteral oper, List`1 operands)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.ProcessContent(Byte[] contentBytes, PdfDictionary resources)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.FormXObjectDoHandler.HandleXObject(PdfContentStreamProcessor processor, PdfStream stream, PdfIndirectReference refi)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.DisplayXObject(PdfName xobjectName)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.Do.Invoke(PdfContentStreamProcessor processor, PdfLiteral oper, List`1 operands)
   at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpContentOperator.Invoke(PdfContentStreamProcessor pdfContentStreamProcessor, PdfLiteral oper, List`1 operands)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.InvokeOperator(PdfLiteral oper, List`1 operands)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.ProcessContent(Byte[] contentBytes, PdfDictionary resources)
   at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.CleanUpPage(Int32 pageNum, IList`1 cleanUpLocations)
   at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.CleanUp()
   at Com.EDS.DocSol.PDFExtract.PDFExtractForm.btnRedaction_Click(Object sender, EventArgs e) in D:\Users\me\Code\PDFExtract\PDFExtract\PDFExtractForm.cs:line 106
   at System.Windows.Forms.Control.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
   at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
   at System.Windows.Forms.Control.WndProc(Message& m)
   at System.Windows.Forms.ButtonBase.WndProc(Message& m)
   at System.Windows.Forms.Button.WndProc(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
   at System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG& msg)
   at System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32 dwComponentID, Int32 reason, Int32 pvLoopData)
   at System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
   at System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
   at System.Windows.Forms.Application.Run(Form mainForm)
   at Com.EDS.DocSol.PDFExtract.Program.Main(String[] args) in D:\Users\me\Code\PDFExtract\PDFExtract\Program.cs:line 140
   at System.AppDomain._nExecuteAssembly(Assembly assembly, String[] args)
   at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
   at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()

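I don't understand why. I did not change the rectangle, and I am not sure whether there is actually anything at that location. I also have some code that first adds a redaction annotation, which I then try to apply. But that runs into the same object reference error.

In the code above: do I need to create a redaction annotation first before applying it, or does this code select the box I want redacted and apply it in one pass?

The rectangle I actually want (it is an address block) is: iTextSharp.text.Rectangle(45, 650, 200, 750);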

Answer 1

Regarding the OP's original observation, the object reference not found error in

at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpContentOperator.Invoke(PdfContentStreamProcessor pdfContentStreamProcessor, PdfLiteral oper, List`1 operands)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.InvokeOperator(PdfLiteral oper, List`1 operands)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.ProcessContent(Byte[] contentBytes, PdfDictionary resources)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.FormXObjectDoHandler.HandleXObject(PdfContentStreamProcessor processor, PdfStream stream, PdfIndirectReference refi)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.DisplayXObject(PdfName xobjectName)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.Do.Invoke(PdfContentStreamProcessor processor, PdfLiteral oper, List`1 operands)
at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpContentOperator.Invoke(PdfContentStreamProcessor pdfContentStreamProcessor, PdfLiteral oper, List`1 operands)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.InvokeOperator(PdfLiteral oper, List`1 operands)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.ProcessContent(Byte[] contentBytes, PdfDictionary resources)

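occurs while the cleanup processor parses the content stream of a form xobject: some instruction in that content stream appears to be invalid (most likely its arguments are invalid), i.e. the PDF is most likely simply broken.

This behavior could not be reproduced with the desensitized version of the OP's document. In particular, the desensitized version contains only one form xobject per page, and it does not intersect the redaction area. When the redaction area is extended so that it partially intersects the form xobject, an exception does occur. But it is a different one, which clearly indicates that System.Drawing.Graphics.FromImage cannot handle the format of a bitmap image shown in the form xobject.

So the invalid form xobject content seems to have been removed during desensitization. To fix the cleanup code for the issue at hand, the original document is therefore needed.

In comments, the OP indicated that he also tried to invoke the cleanup process in a different way, namely by adding a redaction annotation to the PDF and then calling the cleanup process without any PdfCleanUpLocation. He added the redaction annotation like this: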

PdfReader reader = new PdfReader(new FileStream(txtPDFFile.Text, FileMode.Open));
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(txtPDFFile.Text + ".pdf", FileMode.OpenOrCreate)))
{
    // Add the annotations
    int page = 1;
    Rectangle rect = new Rectangle(45, 650, 200, 750);
    PdfAnnotation annotation = new PdfAnnotation(stamper.Writer, rect);
    annotation.Put(PdfName.SUBTYPE, new PdfName("Redact"));
    stamper.AddAnnotation(annotation, page);
} //Using

The cleanup now also runs into an object reference not found error, but this time in

at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.ExtractLocationsFromRedactAnnot(Int32 page, Int32 annotIndex, PdfDictionary annotDict)
at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.ExtractLocationsFromRedactAnnots(Int32 page, PdfDictionary pageDict)
at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.ExtractLocationsFromRedactAnnots()

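The cause in this case is a bug in the cleanup code. Redaction annotations generated by e.g. Adobe Reader usually contain an additional entry, QuadPoints, which holds a number of quadrilaterals specifying the areas inside the annotation rectangle that are actually to be redacted; if this entry is absent, the whole rectangle is to be redacted.

In this context, iTextSharp has the following code: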

PdfArray quadPoints = annotDict.GetAsArray(PdfName.QUADPOINTS);

if (quadPoints.Size != 0) {
    markedRectangles.AddRange(TranslateQuadPointsToRectangles(quadPoints));
} else { 
    ... add a range for the annotation rectangle ...
}

Unfortunately, if the annotation has no QuadPoints, annotDict.GetAsArray returns null and the evaluation of quadPoints.Size fails with an exception. It should be

if (quadPoints != null && quadPoints.Size != 0) {

instead.

The OP can work around this by adding a QuadPoints entry with an empty array to his redaction annotation:

...
annotation.Put(PdfName.SUBTYPE, new PdfName("Redact"));
annotation.Put(PdfName.QUADPOINTS, new PdfArray()); // <<<<<<<<
stamper.AddAnnotation(annotation, page);
...

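Note: this is merely a workaround for this iTextSharp issue and should not be done if the annotated PDF is also meant for other uses. Strictly speaking, an empty QuadPoints entry means that there is nothing to redact.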
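Putting the workaround together, a two-pass run might look like the sketch below. It is only a sketch under assumptions: the file names are hypothetical, the annotated file is written to disk first and cleaned in a second pass, and it assumes the PdfCleanUpProcessor constructor that takes only a PdfStamper is the one that reads its locations from the Redact annotations.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup;

static class RedactViaAnnotation
{
    // Pass 1: add a Redact annotation that carries the empty QuadPoints workaround.
    static void AddRedactAnnotation(string src, string dest, int page, Rectangle rect)
    {
        PdfReader reader = new PdfReader(src);
        using (PdfStamper stamper = new PdfStamper(reader, new FileStream(dest, FileMode.Create)))
        {
            PdfAnnotation annotation = new PdfAnnotation(stamper.Writer, rect);
            annotation.Put(PdfName.SUBTYPE, new PdfName("Redact"));
            // Workaround only: avoids the null dereference in the cleanup code.
            annotation.Put(PdfName.QUADPOINTS, new PdfArray());
            stamper.AddAnnotation(annotation, page);
        }
        reader.Close();
    }

    // Pass 2: assumed to pick up the cleanup locations from the Redact annotations.
    static void ApplyRedactions(string src, string dest)
    {
        PdfReader reader = new PdfReader(src);
        using (PdfStamper stamper = new PdfStamper(reader, new FileStream(dest, FileMode.Create)))
        {
            new PdfCleanUpProcessor(stamper).CleanUp();
        }
        reader.Close();
    }

    static void Main()
    {
        // Hypothetical file names; the rectangle is the address block from the question.
        AddRedactAnnotation("input.pdf", "annotated.pdf", 1, new Rectangle(45, 650, 200, 750));
        ApplyRedactions("annotated.pdf", "redacted.pdf");
    }
}
```

The intermediate annotated.pdf should be treated as a throwaway file, since its empty QuadPoints entry is only there to get past the iTextSharp bug.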

By the way, there is an issue in the OP's code: when creating the file stream for the PdfStamper to write to, he uses FileMode.OpenOrCreate:

System.IO.Stream fsOut = new System.IO.FileStream(System.IO.Path.Combine(path,"redacted.pdf"), System.IO.FileMode.OpenOrCreate);

or

using (PdfStamper stamper = new PdfStamper(reader, new FileStream(txtPDFFile.Text + ".pdf", FileMode.OpenOrCreate)))

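If a file with that name already exists and is longer than the new PDF, the result has the size of the old file, with the tail of the old file still occupying the extra space. In other words, the new file effectively has trailing garbage, which makes it an invalid PDF that e.g. Adobe Reader offers to repair.

In general, FileMode.Create should be used instead. From its documentation:

FileMode.Create is equivalent to requesting that if the file does not exist, use System.IO.FileMode.CreateNew; otherwise, use System.IO.FileMode.Truncate.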
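The difference between the two modes can be seen without iTextSharp at all. The following is a minimal sketch using only System.IO; the class and method names are made up for illustration, and plain zero bytes stand in for the PDF content.

```csharp
using System;
using System.IO;

static class FileModeDemo
{
    // Creates a temp file of oldSize bytes, rewrites it with newSize bytes
    // using the given mode, and returns the resulting file length.
    public static long LengthAfterRewrite(FileMode mode, int oldSize, int newSize)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[oldSize]);
            using (var fs = new FileStream(path, mode))
            {
                fs.Write(new byte[newSize], 0, newSize);
            }
            return new FileInfo(path).Length;
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        // OpenOrCreate keeps the old file, so its tail survives: prints 100
        Console.WriteLine(LengthAfterRewrite(FileMode.OpenOrCreate, 100, 10));
        // Create truncates the existing file before writing: prints 10
        Console.WriteLine(LengthAfterRewrite(FileMode.Create, 100, 10));
    }
}
```

With OpenOrCreate, the 90 leftover bytes are exactly the kind of trailing garbage described above.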

iTextSharp 5.5.9 Redaction - Object Reference not found
