Question

When you inspect a PDF viewer page in the browser, there is an HTML structure, but neither urllib2 nor requests returns it, and BS4 goes into an infinite loop.

I only want the title of the page (in the head).

Answer 1

If you are using Mozilla's pdf.js, you should be able to do this via the PDF.js API, as detailed in this Issue.

pdf.info.get('Title')

or

var metadata = new Metadata(pdf.catalog.metadata);
metadata.get('dc:title')
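
The two snippets above touch pdf.js internals. As a rough sketch of the same idea through the public API, assuming a pdf.js build where `getDocument(url).promise` and `pdf.getMetadata()` are available: the helper names `pickTitle` and `getPdfTitle` are made up for illustration, and `pdfjsLib` is passed in as a parameter because how you load the library depends on your setup.

```javascript
// Sketch only: read a PDF's title via the public PDF.js API.
// `pickTitle` and `getPdfTitle` are hypothetical helper names.

// Prefer the Info dictionary's Title, then fall back to the XMP dc:title.
function pickTitle(meta) {
  const infoTitle = meta && meta.info ? meta.info.Title : undefined;
  if (typeof infoTitle === 'string' && infoTitle.trim() !== '') {
    return infoTitle.trim();
  }
  const xmp = meta ? meta.metadata : null;
  const xmpTitle =
    xmp && typeof xmp.get === 'function' ? xmp.get('dc:title') : undefined;
  if (typeof xmpTitle === 'string' && xmpTitle.trim() !== '') {
    return xmpTitle.trim();
  }
  return null; // the document carries no usable title
}

// Load the document and resolve to its title (or null).
// Assumption: `pdfjsLib` is an already loaded pdf.js library object.
async function getPdfTitle(pdfjsLib, url) {
  const pdf = await pdfjsLib.getDocument(url).promise;
  const meta = await pdf.getMetadata(); // resolves to { info, metadata, ... }
  return pickTitle(meta);
}
```

Something like `getPdfTitle(pdfjsLib, 'file.pdf').then(console.log)` would then print the title, or `null` when neither field is set. Many PDFs leave both fields empty, so the `null` case is worth handling.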