PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?

Time: 2019-08-17 11:08:57

Tags: python-3.x

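I am trying to tokenize a PDF file into sentences. I first used PyPDF2, but ran into data loss and incorrect formatting. So I tried OCR instead, but hit a poppler problem when converting the PDF to images. Can anyone help me solve this?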

pages = convert_from_path(PDF_file, 600)

FileNotFoundError                         Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pdf2image\pdf2image.py in _page_count(pdf_path, userpw, poppler_path)
    239             env["LD_LIBRARY_PATH"] = poppler_path + ":" + env.get("LD_LIBRARY_PATH", "")
--> 240         proc = Popen(command, env=env, stdout=PIPE, stderr=PIPE)
    241 

~\Anaconda3\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    774                                 errread, errwrite,
--> 775                                 restore_signals, start_new_session)
    776         except:

~\Anaconda3\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
   1177                                          os.fspath(cwd) if cwd is not None else None,
-> 1178                                          startupinfo)
   1179             finally:

FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

PDFInfoNotInstalledError                  Traceback (most recent call last)
<ipython-input-15-3c78fc8271dd> in <module>
----> 1 pages = convert_from_path(PDF_file, 600)

~\Anaconda3\lib\site-packages\pdf2image\pdf2image.py in convert_from_path(pdf_path, dpi, output_folder, first_page, last_page, fmt, thread_count, userpw, use_cropbox, strict, transparent, single_file, output_file, poppler_path)
     52     """
     53 
---> 54     page_count = _page_count(pdf_path, userpw, poppler_path=poppler_path)
     55 
     56     # We start by getting the output format, the buffer processing function and if we need pdftocairo

~\Anaconda3\lib\site-packages\pdf2image\pdf2image.py in _page_count(pdf_path, userpw, poppler_path)
    242         out, err = proc.communicate()
    243     except:
--> 244     raise PDFInfoNotInstalledError('Unable to get page count. Is poppler installed and in PATH?')
    245 
    246     try:

PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?

2 answers:

Answer 0: (score: 1)

My code worked fine after I ran the following install command in the Anaconda Prompt. Check whether it works for you too:

conda install -c conda-forge poppler
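
To confirm that the install actually made poppler visible to Python, a quick check along these lines may help. This is only a sketch: the helper name find_poppler_tools is made up for illustration, and it assumes that pdfinfo and pdftoppm are the poppler command-line tools pdf2image needs to find.

```python
import shutil


def find_poppler_tools(tools=("pdfinfo", "pdftoppm")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    find_poppler_tools is a hypothetical helper, not part of pdf2image.
    """
    return {name: shutil.which(name) for name in tools}


if __name__ == "__main__":
    # Run this from the same environment (e.g. the same Anaconda env)
    # that runs the conversion code.
    for name, location in find_poppler_tools().items():
        print(f"{name}: {location or 'NOT FOUND on PATH'}")
```

If either tool is reported as not found, the PDFInfoNotInstalledError from the question is the expected outcome. Restarting the Jupyter kernel or terminal after installing is usually needed before a changed PATH is picked up.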

Answer 1: (score: 0)

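This is easy to fix once you download poppler: https://github.com/oschwartz10612/poppler-windows/releases/ Extract the archive and move it into Program Files, then add the poppler bin folder to the Path environment variable. [System > Advanced system settings > Environment Variables > add the poppler bin folder as a new entry under Path] Documentation: https://pypi.org/project/pdf2image/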
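
As an alternative to editing the system Path, the convert_from_path signature shown in the traceback accepts a poppler_path argument, so the bin folder can be passed directly. The sketch below assumes a hypothetical install location (C:\Program Files\poppler\bin); the helpers has_pdfinfo and convert_pdf are illustrative names, not part of pdf2image.

```python
from pathlib import Path

# Hypothetical location: replace with the folder where you extracted poppler.
POPPLER_BIN = r"C:\Program Files\poppler\bin"


def has_pdfinfo(bin_dir):
    """Return True if bin_dir contains a pdfinfo executable."""
    folder = Path(bin_dir)
    return any((folder / name).is_file() for name in ("pdfinfo", "pdfinfo.exe"))


def convert_pdf(pdf_file, bin_dir=POPPLER_BIN, dpi=600):
    """Convert a PDF to a list of page images using an explicit poppler folder."""
    if not has_pdfinfo(bin_dir):
        # Fail early with a clearer message than the chained traceback above.
        raise FileNotFoundError(f"pdfinfo not found in {bin_dir}")
    # Third-party import kept local so the folder check works without pdf2image.
    from pdf2image import convert_from_path
    return convert_from_path(pdf_file, dpi, poppler_path=bin_dir)
```

Calling convert_pdf(PDF_file) would then take the place of convert_from_path(PDF_file, 600) from the question, with no change to environment variables.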