Question

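In my project I need to work with the buttons in a PDF from Python. PDFMiner, PyPDF and similar libraries only talk about extracting text from a PDF. How can we extract the buttons and other controls from a PDF, along with the actions behind them?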

Answer 1

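In a PDF, a button is a special kind of widget. With PyMuPDF you can extract all the widgets on a page using Page.widgets(). It returns a generator, so you can loop over every widget on the page with something like the following: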

import fitz # The package you need to install is actually called pymupdf


# Open the PDF
doc = fitz.open('your.pdf')
# Whichever page holds the buttons (zero-based, so 1 is the second page)
page_number = 1
page = doc[page_number]

# Loop over all the widgets on this page and print their type
for w in page.widgets():
    print(f"Widget has the type {w.field_type_string}")
    # Do whatever else you want to do with the widget

Once you have found a button, you can extract the action behind it by reading the button's script attribute.
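As a rough sketch of that step, the following narrows the page's widgets down to push buttons and collects any non-empty script attributes. The helper names `widget_scripts` and `button_scripts` are made up for illustration, and the `script_*` attributes other than `script` are assumed to exist only in recent PyMuPDF releases, so they are read defensively with `getattr`.

```python
# Widget attributes that may hold JavaScript. Only `script` is mentioned in
# the answer; the others are assumed from recent PyMuPDF releases.
SCRIPT_ATTRS = (
    "script", "script_stroke", "script_format",
    "script_change", "script_calc", "script_blur", "script_focus",
)


def widget_scripts(widget):
    """Return the non-empty script attributes of a widget as a dict."""
    found = {}
    for name in SCRIPT_ATTRS:
        value = getattr(widget, name, None)  # missing attribute -> None
        if value:
            found[name] = value
    return found


def button_scripts(path, page_number=0):
    """Return (field name, scripts) pairs for the push buttons on one page."""
    import fitz  # PyMuPDF; imported here so widget_scripts works without it

    results = []
    with fitz.open(path) as doc:
        page = doc[page_number]
        for w in page.widgets():
            # Keep push buttons only; checkboxes and radio buttons have
            # their own field types.
            if w.field_type == fitz.PDF_WIDGET_TYPE_BUTTON:
                results.append((w.field_name, widget_scripts(w)))
    return results
```

Calling `button_scripts('your.pdf', page_number=1)` would then give one entry per button; an empty dict means no JavaScript was found on that button.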

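However, I have been trying to do the same thing. According to the PyMuPDF documentation it should work, but in the PDF I am looking at, the buttons' script attribute is still None. So if this works for you, or if you find another way to extract the actions, please share!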
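One possible explanation, offered as an assumption rather than a diagnosis of that particular file: script only covers JavaScript, whereas a button's action can be of another type (a link or a form-submit action, for example) stored under the /A or /AA key of the widget's PDF object. A minimal sketch for looking at those raw entries with Document.xref_get_key and Document.xref_object follows; the helper name `raw_actions` is hypothetical.

```python
def raw_actions(doc, xref):
    """Return the raw PDF source of the /A and /AA entries of object `xref`.

    `doc` is expected to be an open PyMuPDF Document and `xref` the
    cross-reference number of a widget (Widget.xref).
    """
    actions = {}
    for key in ("A", "AA"):
        kind, value = doc.xref_get_key(xref, key)
        if kind == "null":
            continue  # this key is not present on the object
        if kind == "xref":
            # Indirect reference such as "12 0 R": fetch the referenced object.
            value = doc.xref_object(int(value.split()[0]), compressed=True)
        actions[key] = value
    return actions
```

Inside the loop over page.widgets(), something like `print(w.field_name, raw_actions(doc, w.xref))` would show whatever action dictionaries the file actually contains, which can then be interpreted by hand.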
