Question

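I have a website whose URLs end in jobs/jif/id/1 through jobs/jif/id/1298. I need to print each of these pages to PDF. There is a catch: the site's information is protected by a login. I have been ignoring that problem until I figure out how to print to PDF given a URL.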

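I tried using curl, but that was a dead end for me. I am now using pdfkit to print each page. I am not attached to pdfkit or Python. If I could put together a working bat file for this instead, that would be great.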

import pdfkit

url = 'https://registration.vtbigevent.org/committee/jobs/jif/id/'
config = pdfkit.configuration(wkhtmltopdf="C:/Program Files/wkhtmltopdf/bin")
for ids in range(1,1298):
    new_url = url + str(ids)
    pdf = str(ids) + '.pdf'
    pdfkit.from_url(new_url, pdf, configuration=config)

It should create 1298 PDFs in one folder.

The actual result is an error involving wkhtmltopdf:

PermissionError: [Errno 13] Permission denied: 'C:/Program Files/wkhtmltopdf/bin'

A separate issue: I know this will not save the PDFs to the intended folder, but that is a lower priority for now.

I edited the code to add configuration=config to the pdfkit line. Does this do what I think it does? I am still getting the permission error.
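
A likely cause of the permission error is that the wkhtmltopdf setting points at the bin folder rather than at the executable inside it, so something ends up trying to open or run a directory. The sketch below is one way the loop could look under that assumption. The file name wkhtmltopdf.exe, the output folder name pdfs, and the helper names build_jobs and print_all are illustrative assumptions, not taken from the original code. It also uses an inclusive upper bound, since range(1, 1298) stops at 1297. It does not deal with the login.

```python
from pathlib import Path

BASE_URL = "https://registration.vtbigevent.org/committee/jobs/jif/id/"

# Assumption: wkhtmltopdf.exe sits inside the bin folder from the question.
# The setting should name the executable itself, not the folder holding it.
WKHTMLTOPDF_EXE = "C:/Program Files/wkhtmltopdf/bin/wkhtmltopdf.exe"


def build_jobs(first_id, last_id, out_dir):
    """Return (url, pdf_path) pairs for first_id..last_id, both inclusive."""
    out = Path(out_dir)
    return [
        (BASE_URL + str(i), str(out / f"{i}.pdf"))
        for i in range(first_id, last_id + 1)  # +1 so last_id is included
    ]


def print_all(first_id=1, last_id=1298, out_dir="pdfs"):
    """Render every page to a PDF inside out_dir (hypothetical folder name)."""
    # Imported here so build_jobs can be checked without pdfkit installed.
    import pdfkit

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    config = pdfkit.configuration(wkhtmltopdf=WKHTMLTOPDF_EXE)
    for url, pdf_path in build_jobs(first_id, last_id, out_dir):
        pdfkit.from_url(url, pdf_path, configuration=config)
```

Calling print_all() would then attempt all 1298 pages and write them under pdfs, which also addresses the output folder. Trying print_all(1, 3) first is a cheap way to confirm the path fix before running the full range.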

Printing sequentially numbered URLs to PDF

0 answers: