Question

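I need to retrieve the layer 2 text from a signature. How can I get the description (the text beneath the signature image) using iTextSharp? Below is the code I use to get the signing date and user name: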

        PdfReader reader = new PdfReader(pdfPath, System.Text.Encoding.UTF8.GetBytes(MASTER_PDF_PASSWORD));
        using (MemoryStream memoryStream = new MemoryStream())
        {
            PdfStamper stamper = new PdfStamper(reader, memoryStream);
            AcroFields acroFields = stamper.AcroFields;
            List<String> names = acroFields.GetSignatureNames();
            foreach (String name in names)
            {
                PdfPKCS7 pk = acroFields.VerifySignature(name);
                String userName = PdfPKCS7.GetSubjectFields(pk.SigningCertificate).GetField("CN");
                Console.WriteLine("Sign Date: " + pk.SignDate.ToString() + " Name: " + userName);
                // Here I need to retrieve the description underneath the signature image
            }
            reader.RemoveUnusedObjects();
            reader.Close();
            stamper.Writer.CloseStream = false;
            if (stamper != null)
            {
                stamper.Close();
            }
        }

And below is the code I use to set the description:

PdfStamper st = PdfStamper.CreateSignature(reader, memoryStream, '\0', null, true);
PdfSignatureAppearance sap = st.SignatureAppearance;
sap.Render = PdfSignatureAppearance.SignatureRender.GraphicAndDescription;
sap.Layer2Font = font;
sap.Layer2Text = "Some text that i want to retrieve";

Thanks.

Answer 1

Please take a look at the following PDF: signature_n2.pdf. It contains a signature whose n2 layer holds the following text:

This document was signed by Bruno
Specimen.

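Before we write code to extract this text, we should use iText RUPS to look at the internal structure of the PDF, so that we can find where this /n2 layer is stored:

[Screenshot: the PDF's internal structure in iText RUPS]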

Based on this information, we can start writing code. See the GetN2fromSig example:

public static void main(String[] args) throws IOException {
    PdfReader reader = new PdfReader(SRC);
    AcroFields fields = reader.getAcroFields();
    Item item = fields.getFieldItem("Signature1");
    PdfDictionary widget = item.getWidget(0);
    PdfDictionary ap = widget.getAsDict(PdfName.AP);
    PdfStream normal = ap.getAsStream(PdfName.N);
    PdfDictionary resources = normal.getAsDict(PdfName.RESOURCES);
    PdfDictionary xobject = resources.getAsDict(PdfName.XOBJECT);
    PdfStream frm = xobject.getAsStream(PdfName.FRM);
    PdfDictionary res = frm.getAsDict(PdfName.RESOURCES);
    PdfDictionary xobj = res.getAsDict(PdfName.XOBJECT);
    PRStream n2 = (PRStream) xobj.getAsStream(PdfName.N2);
    byte[] stream = PdfReader.getStreamBytes(n2);
    System.out.println(new String(stream));
}

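We get the widget annotation of the signature field named "Signature1". Based on the information from RUPS, we know that we have to get the resources (/Resources) of the normal (/N) appearance (/AP). In the /XObject dictionary of those resources, we find a form XObject named /FRM. This XObject in turn has its own /Resources, more specifically two XObjects, one named /n0 and the other named /n2.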

We get the stream of the /n2 object and convert it to an uncompressed byte[]. When we print this array as a String, we get the following result:

BT
1 0 0 1 0 49.55 Tm
/F1 12 Tf
(This document was signed by Bruno)Tj
1 0 0 1 0 31.55 Tm
(Specimen.)Tj
ET

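This is PDF syntax. BT and ET stand for "Begin Text" and "End Text". The Tm operator sets the text matrix. The Tf operator sets the font. Tj shows a string, which is delimited by ( and ). If you want the plain text, it is sufficient to extract only the text between the parentheses.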
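As a rough illustration of "extracting the text between the parentheses", here is a minimal sketch in plain Java (no iText calls). The class name ContentStreamStrings and the method extractLiteralStrings are hypothetical names made up for this example. It assumes the stream has already been decoded to a String, as in the snippet above, and it only understands literal strings: balanced nested parentheses and backslash escapes are handled, while hex strings (<...>), comments, and font-specific encodings are ignored. For a simple stream like the one above that should be enough.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iText: collects PDF literal strings "(...)".
public class ContentStreamStrings {

    /** Returns every literal string found in the decoded content stream, in order. */
    public static List<String> extractLiteralStrings(String content) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // parenthesis nesting level; 0 means "outside a string"
        int i = 0;
        while (i < content.length()) {
            char c = content.charAt(i++);
            if (depth == 0) {
                if (c == '(') {          // a new literal string starts here
                    depth = 1;
                    current.setLength(0);
                }
                continue;
            }
            if (c == '\\' && i < content.length()) {
                char e = content.charAt(i++);
                switch (e) {
                    case 'n': current.append('\n'); break;
                    case 'r': current.append('\r'); break;
                    case 't': current.append('\t'); break;
                    case 'b': current.append('\b'); break;
                    case 'f': current.append('\f'); break;
                    case '\r':
                        // backslash + line break is a line continuation
                        if (i < content.length() && content.charAt(i) == '\n') i++;
                        break;
                    case '\n':
                        break;
                    default:
                        if (e >= '0' && e <= '7') {
                            // octal escape with one to three digits
                            int code = e - '0';
                            int digits = 1;
                            while (digits < 3 && i < content.length()
                                    && content.charAt(i) >= '0' && content.charAt(i) <= '7') {
                                code = code * 8 + (content.charAt(i++) - '0');
                                digits++;
                            }
                            current.append((char) (code & 0xFF));
                        } else {
                            current.append(e); // covers \( \) \\ and unknown escapes
                        }
                }
            } else if (c == '(') {
                depth++;                 // balanced nested parenthesis stays in the text
                current.append(c);
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    result.add(current.toString());
                } else {
                    current.append(c);
                }
            } else {
                current.append(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String n2 = "BT\n1 0 0 1 0 49.55 Tm\n/F1 12 Tf\n"
                + "(This document was signed by Bruno)Tj\n"
                + "1 0 0 1 0 31.55 Tm\n(Specimen.)Tj\nET";
        for (String s : extractLiteralStrings(n2)) {
            System.out.println(s);
        }
    }
}
```

With the iText code above, you could pass new String(stream) to this method instead of printing it. Treat the result as raw string operands only: whether they map to readable characters depends on the font used in the appearance.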

Answer 2

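While Bruno addressed the issue starting from a PDF that contains a "layer 2", allow me to first state that using these "signature layers" in PDF signature appearances is not required by the PDF specification; in fact, the specification does not know these layers at all! Thus, if you try to parse a specific layer, you may not find such a "layer" or, even worse, find something that looks like that layer (an XObject named n2) but contains the wrong data.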

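Nonetheless, whether you look for text in layer 2 or in the signature appearance as a whole, you can use the iTextSharp text extraction capabilities. I used Bruno's code as the basis for retrieving the n2 layer.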

public static void ExtractSignatureTextFromFile(FileInfo file)
{
    try
    {
        Console.Out.Write("File: {0}\n", file);
        using (var pdfReader = new PdfReader(file.FullName))
        {
            AcroFields fields = pdfReader.AcroFields;
            foreach (string name in fields.GetSignatureNames())
            {
                Console.Out.Write("  Signature: {0}\n", name);
                iTextSharp.text.pdf.AcroFields.Item item = fields.GetFieldItem(name);
                PdfDictionary widget = item.GetWidget(0);
                PdfDictionary ap = widget.GetAsDict(PdfName.AP);
                if (ap == null)
                    continue;
                PdfStream normal = ap.GetAsStream(PdfName.N);
                if (normal == null)
                    continue;
                Console.Out.Write("    Content of normal appearance: {0}\n", extractText(normal));

                PdfDictionary resources = normal.GetAsDict(PdfName.RESOURCES);
                if (resources == null)
                    continue;
                PdfDictionary xobject = resources.GetAsDict(PdfName.XOBJECT);
                if (xobject == null)
                    continue;
                PdfStream frm = xobject.GetAsStream(PdfName.FRM);
                if (frm == null)
                    continue;
                PdfDictionary res = frm.GetAsDict(PdfName.RESOURCES);
                if (res == null)
                    continue;
                PdfDictionary xobj = res.GetAsDict(PdfName.XOBJECT);
                if (xobj == null)
                    continue;
                PRStream n2 = (PRStream) xobj.GetAsStream(PdfName.N2);
                if (n2 == null)
                    continue;
                Console.Out.Write("    Content of normal appearance, layer 2: {0}\n", extractText(n2));
            }
        }
    }
    catch (Exception ex)
    {
        Console.Error.Write("Error... " + ex.StackTrace);
    }
}

public static String extractText(PdfStream xObject)
{
    PdfDictionary resources = xObject.GetAsDict(PdfName.RESOURCES);
    ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();

    PdfContentStreamProcessor processor = new PdfContentStreamProcessor(strategy);
    processor.ProcessContent(ContentByteUtils.GetContentBytesFromContentObject(xObject), resources);
    return strategy.GetResultantText();
}

For the sample file signature_n2.pdf that Bruno used, you get:

File: ...\signature_n2.pdf
  Signature: Signature1
    Content of normal appearance: This document was signed by Bruno
Specimen.
    Content of normal appearance, layer 2: This document was signed by Bruno
Specimen.

Since this sample uses layer 2 as the OP expects, it already contains the text in question.

Get the Layer2 text (signature description) from a signature image using iTextSharp
