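The simplest way I can describe the problem: we use PDFBox to remove just one field (for example, a credit card number) from a PDF that HelloSign sends to us.
I have put a lengthy explanation of the problem, and of what I have tried so far, in the first comment below.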
Answer 0 (score: 0)
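The code in this answer may look fairly generic, because it first determines a map of the fields in the document and then allows any combination of the text fields to be cleared. Please be aware, though, that it was developed using only the single example PDF from this question. I therefore cannot be sure that I correctly understood how HelloSign marks fields and, in particular, how HelloSign fills them in.
This answer presents two classes: one analyzes a HelloSign form, the other manipulates it by clearing selected fields; the latter relies on the information gathered by the former. Both classes are built on top of the PDFBox PDFTextStripper utility class.
The code was developed against the current PDFBox development version, 2.1.0-SNAPSHOT. Most likely it also works with all 2.0.x versions.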
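The HelloSignAnalyzer class analyzes the given PDDocument, looking for two kinds of sequences:
[$varname ], which appears to define a placeholder where the content of a form field is to be put, and
[def:$varname|type|req|signer|display|label], which appears to define the properties of that placeholder.
It creates a collection of HelloSignField instances, each of which describes one such placeholder. If text is found located over a placeholder, the instance also contains the value of the respective field.
Furthermore, it stores the name of the last xobject drawn on the page, which in the sample document is where HelloSign draws its field contents.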
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.pdfbox.contentstream.operator.Operator;
import org.apache.pdfbox.cos.COSBase;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

public class HelloSignAnalyzer extends PDFTextStripper
{
    /** Describes one placeholder: its name, position, attributes and, if found, its value. */
    public class HelloSignField
    {
        public String getName()
        { return name; }
        public String getValue()
        { return value; }
        public float getX()
        { return x; }
        public float getY()
        { return y; }
        public float getWidth()
        { return width; }
        public String getType()
        { return type; }
        public boolean isOptional()
        { return optional; }
        public String getSigner()
        { return signer; }
        public String getDisplay()
        { return display; }
        public String getLabel()
        { return label; }
        public float getLastX()
        { return lastX; }

        String name = null;
        String value = "";
        float x = 0, y = 0, width = 0;
        String type = null;
        boolean optional = false;
        String signer = null;
        String display = null;
        String label = null;
        float lastX = 0;

        @Override
        public String toString()
        {
            return String.format("[Name: '%s'; Value: `%s` Position: %s, %s; Width: %s; Type: '%s'; Optional: %s; Signer: '%s'; Display: '%s', Label: '%s']",
                    name, value, x, y, width, type, optional, signer, display, label);
        }

        // Appends every glyph that starts inside this placeholder to the field value.
        void checkForValue(List<TextPosition> textPositions)
        {
            for (TextPosition textPosition : textPositions)
            {
                if (inField(textPosition))
                {
                    float textX = textPosition.getTextMatrix().getTranslateX();
                    // A gap of more than half a space width is treated as a word break.
                    if (textX > lastX + textPosition.getWidthOfSpace() / 2 && value.length() > 0)
                        value += " ";
                    value += textPosition.getUnicode();
                    lastX = textX + textPosition.getWidth();
                }
            }
        }

        boolean inField(TextPosition textPosition)
        {
            float yPos = textPosition.getTextMatrix().getTranslateY();
            float xPos = textPosition.getTextMatrix().getTranslateX();
            return inField(xPos, yPos);
        }

        // Tolerances (3 units vertically, 1 unit horizontally) were chosen for the sample document.
        boolean inField(float xPos, float yPos)
        {
            if (yPos < y - 3 || yPos > y + 3)
                return false;
            if (xPos < x - 1 || xPos > x + width + 1)
                return false;
            return true;
        }
    }

    public HelloSignAnalyzer(PDDocument pdDocument) throws IOException
    {
        super();
        this.pdDocument = pdDocument;
    }

    public Map<String, HelloSignField> analyze() throws IOException
    {
        if (!analyzed)
        {
            fields = new HashMap<>();
            // Only the last page is processed.
            setStartPage(pdDocument.getNumberOfPages());
            getText(pdDocument);
            analyzed = true;
        }
        return Collections.unmodifiableMap(fields);
    }

    public String getLastFormName()
    {
        return lastFormName;
    }

    //
    // PDFTextStripper overrides
    //
    @Override
    protected void writeString(String text, List<TextPosition> textPositions) throws IOException
    {
        // Text drawn over an already known placeholder is collected as that field's value.
        for (HelloSignField field : fields.values())
        {
            field.checkForValue(textPositions);
        }

        // Scan the text for "[$name ]" placeholders and "[def:$name|...]" definitions.
        int position = -1;
        while ((position = text.indexOf('[', position + 1)) >= 0)
        {
            int endPosition = text.indexOf(']', position);
            if (endPosition < 0)
                continue;
            if (endPosition > position + 1 && text.charAt(position + 1) == '$')
            {
                String fieldName = text.substring(position + 2, endPosition);
                int spacePosition = fieldName.indexOf(' ');
                if (spacePosition >= 0)
                    fieldName = fieldName.substring(0, spacePosition);
                HelloSignField field = getOrCreateField(fieldName);

                // The placeholder spans from the '[' glyph to the end of the ']' glyph.
                TextPosition start = textPositions.get(position);
                field.x = start.getTextMatrix().getTranslateX();
                field.y = start.getTextMatrix().getTranslateY();
                TextPosition end = textPositions.get(endPosition);
                field.width = end.getTextMatrix().getTranslateX() + end.getWidth() - field.x;
            }
            else if (endPosition > position + 5 && "def:$".equals(text.substring(position + 1, position + 6)))
            {
                String definition = text.substring(position + 6, endPosition);
                String[] pieces = definition.split("\\|");
                if (pieces.length == 0)
                    continue;
                HelloSignField field = getOrCreateField(pieces[0]);
                if (pieces.length > 1)
                    field.type = pieces[1];
                if (pieces.length > 2)
                    field.optional = !"req".equals(pieces[2]);
                if (pieces.length > 3)
                    field.signer = pieces[3];
                if (pieces.length > 4)
                    field.display = pieces[4];
                if (pieces.length > 5)
                    field.label = pieces[5];
            }
        }

        super.writeString(text, textPositions);
    }

    @Override
    protected void processOperator(Operator operator, List<COSBase> operands) throws IOException
    {
        String currentFormName = formName;
        if (operator != null && "Do".equals(operator.getName()) && operands != null && operands.size() > 0)
        {
            COSBase base0 = operands.get(0);
            if (base0 instanceof COSName)
            {
                formName = ((COSName)base0).getName();
                // Only xobjects drawn directly from the page (not nested ones) are remembered.
                if (currentFormName == null)
                    lastFormName = formName;
            }
        }
        try
        {
            super.processOperator(operator, operands);
        }
        finally
        {
            formName = currentFormName;
        }
    }

    //
    // helper methods
    //
    HelloSignField getOrCreateField(String name)
    {
        HelloSignField field = fields.get(name);
        if (field == null)
        {
            field = new HelloSignField();
            field.name = name;
            fields.put(name, field);
        }
        return field;
    }

    //
    // inner member variables
    //
    final PDDocument pdDocument;
    boolean analyzed = false;
    Map<String, HelloSignField> fields = null;
    String formName = null;
    String lastFormName = null;
}
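The placeholder syntax that writeString looks for can be tried out without PDFBox. The following is a minimal sketch using only the JDK; the class PlaceholderSketch, its Definition holder and the two regular expressions are made up for illustration, and the sample strings are modelled on the sample document rather than copied from it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Stand-alone sketch of the placeholder syntax; not part of the answer's classes. */
class PlaceholderSketch
{
    /** Hypothetical holder mirroring the attributes of HelloSignField. */
    static class Definition
    {
        String name, type, signer, display, label;
        boolean optional;
    }

    // Assumed shape: "[def:$name|type|req|signer|display|label]", trailing pieces may be missing.
    static final Pattern DEFINITION = Pattern.compile("\\[def:\\$([^\\]]*)\\]");
    // Assumed shape: "[$name   ]", the name ends at the first whitespace or at the bracket.
    static final Pattern PLACEHOLDER = Pattern.compile("\\[\\$([^\\s\\]]+)[^\\]]*\\]");

    static List<Definition> parseDefinitions(String text)
    {
        List<Definition> result = new ArrayList<>();
        Matcher matcher = DEFINITION.matcher(text);
        while (matcher.find())
        {
            String[] pieces = matcher.group(1).split("\\|");
            Definition definition = new Definition();
            definition.name = pieces[0];
            if (pieces.length > 1) definition.type = pieces[1];
            // Anything other than "req" is treated as optional, as in the analyzer.
            if (pieces.length > 2) definition.optional = !"req".equals(pieces[2]);
            if (pieces.length > 3) definition.signer = pieces[3];
            if (pieces.length > 4) definition.display = pieces[4];
            if (pieces.length > 5) definition.label = pieces[5];
            result.add(definition);
        }
        return result;
    }

    static List<String> parsePlaceholderNames(String text)
    {
        List<String> names = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(text);
        while (matcher.find())
            names.add(matcher.group(1));
        return names;
    }

    public static void main(String[] args)
    {
        for (Definition d : parseDefinitions("[def:$var1001|text|req|signer1|Address|address1]"))
            System.out.println(d.name + " " + d.type + " optional=" + d.optional);
        System.out.println(parsePlaceholderNames("Street: [$var1001      ]"));
    }
}
```

Unlike the analyzer, this sketch ignores glyph positions entirely; it only shows how the two bracket sequences are told apart and split.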
HelloSignAnalyzer can be applied to a document like this:
PDDocument pdDocument = PDDocument.load(...);
HelloSignAnalyzer helloSignAnalyzer = new HelloSignAnalyzer(pdDocument);
Map<String, HelloSignField> fields = helloSignAnalyzer.analyze();
System.out.printf("Found %s fields:\n\n", fields.size());
for (Map.Entry<String, HelloSignField> entry : fields.entrySet())
{
    System.out.printf("%s -> %s\n", entry.getKey(), entry.getValue());
}
System.out.printf("\nLast form name: %s\n", helloSignAnalyzer.getLastFormName());
(PlayWithHelloSign.java, test method testAnalyzeInput)
For the OP's sample document the output is:
Found 8 fields:

var1001 -> [Name: 'var1001'; Value: `123 Main St.` Position: 90.0, 580.0; Width: 165.53601; Type: 'text'; Optional: false; Signer: 'signer1'; Display: 'Address', Label: 'address1']
var1004 -> [Name: 'var1004'; Value: `12345` Position: 210.0, 564.0; Width: 45.53601; Type: 'text'; Optional: false; Signer: 'signer1'; Display: 'Postal Code', Label: 'zip']
var1002 -> [Name: 'var1002'; Value: `TestCity` Position: 90.0, 564.0; Width: 65.53601; Type: 'text'; Optional: false; Signer: 'signer1'; Display: 'City', Label: 'city']
var1003 -> [Name: 'var1003'; Value: `AA` Position: 161.0, 564.0; Width: 45.53601; Type: 'text'; Optional: false; Signer: 'signer1'; Display: 'State', Label: 'state']
date2 -> [Name: 'date2'; Value: `2016/12/09` Position: 397.0, 407.0; Width: 124.63202; Type: 'date'; Optional: false; Signer: 'signer2'; Display: 'null', Label: 'null']
signature1 -> [Name: 'signature1'; Value: `` Position: 88.0, 489.0; Width: 236.624; Type: 'sig'; Optional: false; Signer: 'signer1'; Display: 'null', Label: 'null']
date1 -> [Name: 'date1'; Value: `2016/12/09` Position: 397.0, 489.0; Width: 124.63202; Type: 'date'; Optional: false; Signer: 'signer1'; Display: 'null', Label: 'null']
signature2 -> [Name: 'signature2'; Value: `` Position: 88.0, 407.0; Width: 236.624; Type: 'sig'; Optional: false; Signer: 'signer2'; Display: 'null', Label: 'null']

Last form name: Xi0
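The values in this output are assembled glyph by glyph in checkForValue, which inserts a space whenever the next glyph starts more than half a space width after the end of the previous one. The sketch below replays that heuristic without PDFBox; the Glyph class is a hypothetical stand-in for the few TextPosition properties involved, and the coordinates are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Stand-alone sketch of the word-gap heuristic; not part of the answer's classes. */
class ValueAssemblySketch
{
    /** Hypothetical stand-in for TextPosition: text, start x, width and space width. */
    static class Glyph
    {
        final String text;
        final float x, width, spaceWidth;

        Glyph(String text, float x, float width, float spaceWidth)
        {
            this.text = text;
            this.x = x;
            this.width = width;
            this.spaceWidth = spaceWidth;
        }
    }

    static String assemble(List<Glyph> glyphs)
    {
        StringBuilder value = new StringBuilder();
        float lastX = 0;
        for (Glyph glyph : glyphs)
        {
            // Same condition as in checkForValue: a gap wider than half a space starts a new word.
            if (glyph.x > lastX + glyph.spaceWidth / 2 && value.length() > 0)
                value.append(' ');
            value.append(glyph.text);
            lastX = glyph.x + glyph.width;
        }
        return value.toString();
    }

    public static void main(String[] args)
    {
        List<Glyph> glyphs = Arrays.asList(
                new Glyph("1", 90f, 5f, 3f),
                new Glyph("2", 95f, 5f, 3f),
                new Glyph("3", 100f, 5f, 3f),
                new Glyph("M", 108f, 8f, 3f)); // starts 3 units after "3" ends
        System.out.println(assemble(glyphs));
    }
}
```

Because the check relies on glyph positions only, tightly kerned or widely tracked text could gain or lose spaces; that is one more reason to treat the extracted values as approximate.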
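The HelloSignManipulator class uses the information gathered by a HelloSignAnalyzer to clear the contents of the text fields whose names are given.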
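import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import org.apache.pdfbox.contentstream.operator.MissingOperandException;
import org.apache.pdfbox.contentstream.operator.Operator;
import org.apache.pdfbox.contentstream.operator.OperatorProcessor;
import org.apache.pdfbox.cos.COSBase;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdfwriter.ContentStreamWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.apache.pdfbox.pdmodel.graphics.form.PDTransparencyGroup;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.util.Matrix;
// HelloSignField is the inner class HelloSignAnalyzer.HelloSignField; import it from your own package.

public class HelloSignManipulator extends PDFTextStripper
{
    public HelloSignManipulator(HelloSignAnalyzer helloSignAnalyzer) throws IOException
    {
        super();
        this.helloSignAnalyzer = helloSignAnalyzer;
        addOperator(new SelectiveDrawObject());
    }

    public void clearFields(Iterable<String> fieldNames) throws IOException
    {
        try
        {
            Map<String, HelloSignField> fieldMap = helloSignAnalyzer.analyze();
            List<HelloSignField> selectedFields = new ArrayList<>();
            for (String fieldName : fieldNames)
            {
                // Fail early on unknown names instead of hitting a NullPointerException later.
                HelloSignField field = fieldMap.get(fieldName);
                if (field == null)
                    throw new IllegalArgumentException("Unknown field: " + fieldName);
                selectedFields.add(field);
            }
            fields = selectedFields;
            PDDocument pdDocument = helloSignAnalyzer.pdDocument;
            setStartPage(pdDocument.getNumberOfPages());
            getText(pdDocument);
        }
        finally
        {
            fields = null;
        }
    }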

    // Replaces the "Do" handling: the form xobject holding the field contents is rewritten.
    class SelectiveDrawObject extends OperatorProcessor
    {
        @Override
        public void process(Operator operator, List<COSBase> arguments) throws IOException
        {
            if (arguments.size() < 1)
            {
                throw new MissingOperandException(operator, arguments);
            }
            COSBase base0 = arguments.get(0);
            if (!(base0 instanceof COSName))
            {
                return;
            }
            COSName name = (COSName) base0;
            // Only the last form xobject of the page is edited; the comparison is null-safe.
            if (replacement != null || !name.getName().equals(helloSignAnalyzer.getLastFormName()))
            {
                return;
            }
            if (context.getResources().isImageXObject(name))
            {
                throw new IllegalArgumentException("The form xobject to edit turned out to be an image.");
            }
            PDXObject xobject = context.getResources().getXObject(name);
            if (xobject instanceof PDTransparencyGroup)
            {
                throw new IllegalArgumentException("The form xobject to edit turned out to be a transparency group.");
            }
            else if (xobject instanceof PDFormXObject)
            {
                PDFormXObject form = (PDFormXObject) xobject;
                PDFormXObject formReplacement = new PDFormXObject(helloSignAnalyzer.pdDocument);
                formReplacement.setBBox(form.getBBox());
                formReplacement.setFormType(form.getFormType());
                formReplacement.setMatrix(form.getMatrix().createAffineTransform());
                formReplacement.setResources(form.getResources());
                OutputStream outputStream = formReplacement.getContentStream().createOutputStream(COSName.FLATE_DECODE);
                replacement = new ContentStreamWriter(outputStream);
                try
                {
                    // While 'replacement' is set, processOperator copies the form's instructions into it.
                    context.showForm(form);
                }
                finally
                {
                    replacement = null;
                    outputStream.close();
                }
                getResources().put(name, formReplacement);
            }
        }

        @Override
        public String getName()
        {
            return "Do";
        }
    }

    //
    // PDFTextStripper overrides
    //
    @Override
    protected void processOperator(Operator operator, List<COSBase> operands) throws IOException
    {
        if (replacement != null)
        {
            // Copy every instruction except text-showing ones that start inside a selected field.
            boolean copy = true;
            if (TjTJ.contains(operator.getName()))
            {
                Matrix transformation = getTextMatrix().multiply(getGraphicsState().getCurrentTransformationMatrix());
                float xPos = transformation.getTranslateX();
                float yPos = transformation.getTranslateY();
                for (HelloSignField field : fields)
                {
                    if (field.inField(xPos, yPos))
                    {
                        copy = false;
                    }
                }
            }
            if (copy)
            {
                replacement.writeTokens(operands);
                replacement.writeToken(operator);
            }
        }
        super.processOperator(operator, operands);
    }

    //
    // helper methods
    //
    final HelloSignAnalyzer helloSignAnalyzer;
    final Collection<String> TjTJ = Arrays.asList("Tj", "TJ");
    Iterable<HelloSignField> fields;
    ContentStreamWriter replacement = null;
}
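The core idea of the manipulator, copying every instruction of the form xobject except the text-showing ones that start inside a selected field, can be illustrated without PDFBox. In the following sketch the Instruction and Box classes are hypothetical simplifications: a real content stream carries operands as COS objects and the text position has to be derived from the text matrix and the current transformation matrix, which is what the class above does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-alone model of the selective copy; not part of the answer's classes. */
class ContentFilterSketch
{
    /** Hypothetical stand-in for a HelloSignField: start position and width of a placeholder. */
    static class Box
    {
        final float x, y, width;

        Box(float x, float y, float width)
        {
            this.x = x;
            this.y = y;
            this.width = width;
        }

        // Same tolerances as HelloSignField.inField.
        boolean contains(float xPos, float yPos)
        {
            return yPos >= y - 3 && yPos <= y + 3 && xPos >= x - 1 && xPos <= x + width + 1;
        }
    }

    /** Hypothetical simplified instruction: operator name, one operand, current text position. */
    static class Instruction
    {
        final String operator, operand;
        final float x, y;

        Instruction(String operator, String operand, float x, float y)
        {
            this.operator = operator;
            this.operand = operand;
            this.x = x;
            this.y = y;
        }
    }

    static final List<String> TEXT_SHOWING = Arrays.asList("Tj", "TJ");

    static List<Instruction> filter(List<Instruction> instructions, List<Box> cleared)
    {
        List<Instruction> copy = new ArrayList<>();
        for (Instruction instruction : instructions)
        {
            boolean keep = true;
            if (TEXT_SHOWING.contains(instruction.operator))
            {
                for (Box box : cleared)
                    if (box.contains(instruction.x, instruction.y))
                        keep = false;
            }
            if (keep)
                copy.add(instruction);
        }
        return copy;
    }

    public static void main(String[] args)
    {
        List<Instruction> stream = Arrays.asList(
                new Instruction("BT", "", 0f, 0f),
                new Instruction("Tj", "123 Main St.", 90f, 580f),
                new Instruction("Tj", "TestCity", 90f, 564f),
                new Instruction("ET", "", 0f, 0f));
        // Clear the field at 90/580 with width 165.536 (var1001 in the sample output).
        for (Instruction kept : filter(stream, Arrays.asList(new Box(90f, 580f, 165.536f))))
            System.out.println(kept.operator + " " + kept.operand);
    }
}
```

Note that only the text-showing operators are dropped; positioning and state operators are copied unchanged, so the remaining text should keep its place.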
HelloSignManipulator can be applied to a document as follows to clear a single field:
PDDocument pdDocument = PDDocument.load(...);
HelloSignAnalyzer helloSignAnalyzer = new HelloSignAnalyzer(pdDocument);
HelloSignManipulator helloSignManipulator = new HelloSignManipulator(helloSignAnalyzer);
helloSignManipulator.clearFields(Collections.singleton("var1001"));
pdDocument.save(...);
(PlayWithHelloSign.java, test method testClearAddress1Input)
HelloSignManipulator can be applied to a document as follows to clear several fields at once:
PDDocument pdDocument = PDDocument.load(...);
HelloSignAnalyzer helloSignAnalyzer = new HelloSignAnalyzer(pdDocument);
HelloSignManipulator helloSignManipulator = new HelloSignManipulator(helloSignAnalyzer);
helloSignManipulator.clearFields(Arrays.asList("var1004", "var1003", "date2"));
pdDocument.save(...);
(PlayWithHelloSign.java, test method testClearZipStateDate2Input)
HelloSignManipulator can be applied to a document as follows to clear several fields one after another:
PDDocument pdDocument = PDDocument.load(...);
HelloSignAnalyzer helloSignAnalyzer = new HelloSignAnalyzer(pdDocument);
HelloSignManipulator helloSignManipulator = new HelloSignManipulator(helloSignAnalyzer);
helloSignManipulator.clearFields(Collections.singleton("var1004"));
helloSignManipulator.clearFields(Collections.singleton("var1003"));
helloSignManipulator.clearFields(Collections.singleton("date2"));
pdDocument.save(...);
(PlayWithHelloSign.java, test method testClearZipStateDate2SuccessivelyInput)
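These classes are a mere proof of concept. For one thing, they were built on a single example HelloSign file, so there is a good chance that important details have been missed. For another, they contain some built-in assumptions, for example in the HelloSignField method inField.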
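To make the assumptions in inField concrete, here is a small JDK-only sketch that repeats its tolerance check as a static method (the class name InFieldSketch is made up) and probes it with the position of var1002 from the sample output. It shows that only the starting point of a text chunk is tested, within 3 units vertically and 1 unit horizontally.

```java
/** Stand-alone copy of the inField tolerance check; not part of the answer's classes. */
class InFieldSketch
{
    static boolean inField(float x, float y, float width, float xPos, float yPos)
    {
        if (yPos < y - 3 || yPos > y + 3)
            return false;
        if (xPos < x - 1 || xPos > x + width + 1)
            return false;
        return true;
    }

    public static void main(String[] args)
    {
        // var1002 in the sample output: position 90/564, width 65.536
        float x = 90f, y = 564f, width = 65.536f;

        // Text starting slightly left of and 2 units above the placeholder still counts.
        System.out.println(inField(x, y, width, 89.5f, 566f));
        // Text drawn 4 units above the placeholder baseline is missed.
        System.out.println(inField(x, y, width, 90f, 568f));
        // Text starting 10 units left of the placeholder is missed, even if it runs into the field.
        System.out.println(inField(x, y, width, 80f, y));
    }
}
```

A document with a different font size, line spacing or alignment of the filled-in values may therefore need other tolerances, or a test against the whole extent of the text rather than its starting point.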
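Furthermore, manipulating signed HelloSign files may not be a good idea in general. If I understand their concept correctly, they store a hash of each signed document to allow the content to be verified; if a document is manipulated as shown above, the hash value will no longer match.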
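Why a stored hash stops matching can be shown with plain JDK classes. This is only an illustration: which hash algorithm HelloSign uses, and over which bytes, is not known here; SHA-256 and the tiny byte arrays below are assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Stand-alone illustration of a content hash mismatch; the algorithm choice is an assumption. */
class HashMismatchSketch
{
    static byte[] sha256(byte[] content)
    {
        try
        {
            return MessageDigest.getInstance("SHA-256").digest(content);
        }
        catch (NoSuchAlgorithmException e)
        {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        byte[] signed = "Address: 123 Main St.".getBytes(StandardCharsets.UTF_8);
        byte[] cleared = "Address: ".getBytes(StandardCharsets.UTF_8);

        byte[] storedHash = sha256(signed);
        System.out.println("unchanged matches: " + Arrays.equals(storedHash, sha256(signed)));
        System.out.println("cleared matches:   " + Arrays.equals(storedHash, sha256(cleared)));
    }
}
```

Any change to the saved file, including the rewritten form xobject produced by clearFields, changes the bytes that such a hash would be computed over.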