Question

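Hi, I'm having trouble using pdfrw with Python. I'm trying to fill in a PDF with pdfrw, and I can fill in one page. obj.pages will only accept an integer, not a slice. Currently it fills only the one page specified. When I put the second page in obj.pages, it fills only the second page, and so on. I need to fill four pages.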

import pdfrw

TEMPLATE_PATH = 'temppath.pdf'
OUTPUT_PATH = 'outpath.pdf'

ANNOT_KEY = '/Annots'
ANNOT_FIELD_KEY = '/T'
ANNOT_VAL_KEY = '/V'
ANNOT_RECT_KEY = '/Rect'
SUBTYPE_KEY = '/Subtype'
WIDGET_SUBTYPE_KEY = '/Widget'

def write_fillable_pdf(input_pdf_path, output_pdf_path, data_dict):
    template_pdf = pdfrw.PdfReader(input_pdf_path)
    annotations = template_pdf.pages[:3][ANNOT_KEY]
    for annotation in annotations:
        if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
            if annotation[ANNOT_FIELD_KEY]:
                key = annotation[ANNOT_FIELD_KEY][1:-1]
                if key in data_dict.keys():
                    annotation.update(
                        pdfrw.PdfDict(V='{}'.format(data_dict[key]))
                    )
    pdfrw.PdfWriter().write(output_pdf_path, template_pdf)

data_dict = {}

if __name__ == '__main__':
    write_fillable_pdf(TEMPLATE_PATH, OUTPUT_PATH, data_dict)

When I use a slice:

annotations = template_pdf.pages[:3][ANNOT_KEY]

it returns this error:

TypeError: list indices must be integers or slices, not str

Otherwise it runs on only one page:

annotations = template_pdf.pages[0][ANNOT_KEY]

or

annotations = template_pdf.pages[1][ANNOT_KEY]

fills only the indicated page.

I found a similar question: How to add text to the second page in pdf with Python, Reportlab and pdfrw?

I started from this article: https://bostata.com/post/how_to_populate_fillable_pdfs_with_python/

Answer 1

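The exception you get with the slice pages[:3][ANNOT_KEY] does not come from the expression pages[:3]; that part works fine. But a slice of a list is itself a list, and the [ANNOT_KEY] syntax then tries to index into this new list using ANNOT_KEY, which is a string.

But don't take my word for it; split this line: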

    annotations = template_pdf.pages[:3][ANNOT_KEY]

into two lines:

    foobar = template_pdf.pages[:3]
    annotations = foobar[ANNOT_KEY]

and see where the error occurs.
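The same behaviour can be reproduced without pdfrw at all. The following is a minimal sketch in which plain dicts stand in for page objects (the `pages` data and variable names here are illustrative assumptions, not part of pdfrw); it shows that the slice succeeds and that the string index on the resulting list is what raises.

```python
# Plain dicts stand in for pdfrw pages; this data is purely illustrative.
pages = [
    {'/Annots': ['a1']},
    {'/Annots': ['a2']},
    {'/Annots': ['a3']},
    {'/Annots': ['a4']},
]

# Step 1: slicing a list works and returns a new list.
first_three = pages[:3]
print(type(first_three).__name__, len(first_three))

# Step 2: indexing that list with a string is what fails.
try:
    first_three['/Annots']
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Indexing a single element of the slice is fine, because each element
# is a mapping rather than a list.
print(first_three[0]['/Annots'])
```

Note also that `pages[:3]` selects three pages; covering four pages would need `pages[:4]`.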

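Anyway, as I mentioned in a comment above, you also shouldn't be using strings to index PdfDicts; use PdfStrings, or just access the entries through the right attribute.

I don't personally use annotations, so I'm not sure what you are trying to accomplish, but if annotations are always a list (when present), you could do something like this: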

    annotations = []
    for page in template_pdf.pages[:3]:
        annotations.extend(page.Annots or [])

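(The purpose of the `or []` expression above is to handle pages that have no /Annots. pdfrw returns None for dict keys that do not exist, to match the semantics of PDF dictionaries, so the `or []` ensures you never try to extend the list with None.)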

If multiple pages could share any annotations, you might also want to de-duplicate the list.
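One possible way to combine the loop and the de-duplication is sketched below. The helper name `collect_annotations` is hypothetical, and the sketch assumes that an annotation shared between pages shows up as the same Python object on each page, so it de-duplicates by identity. `SimpleNamespace` objects stand in for pdfrw pages so that the example runs without a PDF file.

```python
from types import SimpleNamespace


def collect_annotations(pages):
    """Gather annotations from several pages, skipping repeated objects.

    Hypothetical helper: assumes each page exposes `.Annots`, which is
    either a list or None, and that shared annotations are the same
    Python object on every page that references them.
    """
    seen_ids = set()
    annotations = []
    for page in pages:
        # A missing /Annots comes back as None, hence the `or []`.
        for annotation in page.Annots or []:
            if id(annotation) not in seen_ids:
                seen_ids.add(id(annotation))
                annotations.append(annotation)
    return annotations


# Stand-in pages: the second has no annotations, the third repeats one.
shared = {'T': '(name)'}
only_on_third = {'T': '(date)'}
pages = [
    SimpleNamespace(Annots=[shared]),
    SimpleNamespace(Annots=None),
    SimpleNamespace(Annots=[shared, only_on_third]),
]

result = collect_annotations(pages)
print(len(result))
```

With a real document, the call in the question's function would then look something like `annotations = collect_annotations(template_pdf.pages[:4])` for the first four pages, and the existing `for annotation in annotations:` loop could stay unchanged.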

Disclaimer: I am the primary author of pdfrw.

pdfrw - fill pdf with python, trouble using slice with multiple pages
