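Python pdf2image: hide the console

Time: 2018-08-24 21:23:09

Tags: python pdf poppler

I am using pdf2image, which uses poppler to convert PDFs to images. However, when it is used from Python, it opens a new console window every time it converts a PDF. Is there a way to hide this console?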

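import os
from pdf2image import convert_from_path

# path is the directory containing the PDFs (defined elsewhere)
for file_name in os.listdir(path):
    if file_name.endswith('.pdf'):
        # os.path.join works whether or not path ends with a separator
        pages = convert_from_path(os.path.join(path, file_name), thread_count=4)
        for idx, page in enumerate(pages, start=1):
            page.save(file_name + '-page-' + str(idx) + '.jpg', 'JPEG')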

2 answers:

Answer 0 (score: 0)

Yes, but you may not like it: you have to modify the convert_from_path function slightly. At the moment the function looks like this:

import os
import re
import tempfile
import uuid

from io import BytesIO
from subprocess import Popen, PIPE
from PIL import Image

def convert_from_path(pdf_path, dpi=200, output_folder=None, first_page=None, last_page=None, fmt='ppm', thread_count=1, userpw=None):   
    page_count = __page_count(pdf_path, userpw)

    if thread_count < 1:
        thread_count = 1

    if first_page is None:
        first_page = 1

    if last_page is None or last_page > page_count:
        last_page = page_count

    # Recalculate page count based on first and last page
    page_count = last_page - first_page + 1

    if thread_count > page_count:
        thread_count = page_count

    reminder = page_count % thread_count
    current_page = first_page
    processes = []
    for _ in range(thread_count):
        # A unique identifier for our files if the directory is not empty
        uid = str(uuid.uuid4())
        # Get the number of pages the thread will be processing
        thread_page_count = page_count // thread_count + int(reminder > 0)
        # Build the command accordingly
        args, parse_buffer_func = __build_command(['pdftoppm', '-r', str(dpi), pdf_path], output_folder, current_page, current_page + thread_page_count - 1, fmt, uid, userpw)
        # Update page values
        current_page = current_page + thread_page_count
        reminder -= int(reminder > 0)
        # Spawn the process and save its uuid
        processes.append((uid, Popen(args, stdout=PIPE, stderr=PIPE)))

    images = []
    for uid, proc in processes:
        data, _ = proc.communicate()

        if output_folder is not None:
            images += __load_from_output_folder(output_folder, uid)
        else:
            images += parse_buffer_func(data)

    return images
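
To make the loop above easier to follow, here is a minimal sketch of how it divides the pages among the spawned processes. The function name split_pages is hypothetical and not part of pdf2image; it only mirrors the arithmetic shown above. Each returned range corresponds to one Popen call.

```python
def split_pages(first_page, last_page, thread_count):
    """Return one (first, last) page range per worker process.

    Hypothetical helper that mirrors the arithmetic in convert_from_path.
    """
    page_count = last_page - first_page + 1
    # Clamp the worker count to the range [1, page_count]
    thread_count = max(1, min(thread_count, page_count))
    remainder = page_count % thread_count
    current_page = first_page
    ranges = []
    for _ in range(thread_count):
        # The first `remainder` workers each take one extra page
        size = page_count // thread_count + int(remainder > 0)
        ranges.append((current_page, current_page + size - 1))
        current_page += size
        remainder -= int(remainder > 0)
    return ranges


print(split_pages(1, 10, 4))  # [(1, 3), (4, 6), (7, 8), (9, 10)]
```

With thread_count=4, as in the question, a 10-page PDF is handled by four separate pdftoppm processes.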

The line we are interested in is:

processes.append((uid, Popen(args, stdout=PIPE, stderr=PIPE)))

We need to tell Popen explicitly not to show a console window, like this:

import platform
import subprocess

startupinfo = None
if platform.system() == 'Windows':
    # This STARTUPINFO structure prevents a console window from popping up on Windows
    startupinfo = subprocess.STARTUPINFO()
    startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
processes.append((uid, Popen(args, stdout=PIPE, stderr=PIPE, startupinfo=startupinfo)))

Some versions of Python 2.7 may need this instead:

import platform
import subprocess

startupinfo = None
if platform.system() == 'Windows':
    # This STARTUPINFO structure prevents a console window from popping up on Windows
    startupinfo = subprocess.STARTUPINFO()
    startupinfo.dwFlags |= subprocess._subprocess.STARTF_USESHOWWINDOW
processes.append((uid, Popen(args, stdout=PIPE, stderr=PIPE, startupinfo=startupinfo)))
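
The Windows-only setup can also be wrapped in a small helper so the same Popen call works on every platform. This is only a sketch for Python 3: the helper name hidden_window_startupinfo is hypothetical, and setting wShowWindow to SW_HIDE explicitly is my addition to spell out the intent. The child process here is just the Python interpreter, standing in for pdftoppm.

```python
import platform
import subprocess
import sys


def hidden_window_startupinfo():
    """Return a STARTUPINFO that hides the child's window on Windows, else None.

    Hypothetical helper, not part of pdf2image.
    """
    if platform.system() != 'Windows':
        return None
    startupinfo = subprocess.STARTUPINFO()
    startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
    # Explicitly request a hidden window
    startupinfo.wShowWindow = subprocess.SW_HIDE
    return startupinfo


# Stand-in child process; in pdf2image this would be the pdftoppm command
proc = subprocess.Popen(
    [sys.executable, '-c', 'print("hello")'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    startupinfo=hidden_window_startupinfo(),
)
out, _ = proc.communicate()
```

Passing startupinfo=None is accepted on non-Windows platforms, so the call site needs no platform check of its own.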

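Answer 1 (score: 0)

Additional information:

To hide the console window completely when running from a compiled executable on Windows, three functions in pdf2image.py that call Popen need to be patched: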

  1. convert_from_path()
  2. pdfinfo_from_path()
  3. _get_poppler_version()

For example, from:

Popen(command, env=env, stdout=PIPE, stderr=PIPE)

To:

Popen(command, env=env, stdout=PIPE, stderr=PIPE, creationflags=0x08000000)

I am using Python 3.8 on Windows; 0x08000000 is the value of the subprocess.CREATE_NO_WINDOW flag, which is new in Python 3.7+.
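
As a possible way to avoid the magic number, the flag can be looked up by name with a fallback to the literal value. The sketch below also wraps Popen so the flag is applied by default. The names no_window_flags and popen_without_window are hypothetical. The commented-out lines at the end assume that pdf2image.pdf2image exposes a module-level name Popen (it imports Popen from subprocess in the source quoted in the first answer); that replacement is untested here and may not hold for every pdf2image version.

```python
import subprocess
import sys

# Use the named constant where it exists (Python 3.7+ on Windows),
# otherwise fall back to the literal value from the answer above.
CREATE_NO_WINDOW = getattr(subprocess, 'CREATE_NO_WINDOW', 0x08000000)


def no_window_flags():
    """Return creationflags that suppress the console on Windows, 0 elsewhere."""
    return CREATE_NO_WINDOW if sys.platform == 'win32' else 0


def popen_without_window(*args, **kwargs):
    """Call subprocess.Popen, adding the no-window flag unless the caller set one."""
    kwargs.setdefault('creationflags', no_window_flags())
    return subprocess.Popen(*args, **kwargs)


# Stand-in child process; in pdf2image this would be pdftoppm or pdfinfo
proc = popen_without_window(
    [sys.executable, '-c', 'print("done")'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out, _ = proc.communicate()

# Assumption, not verified against any particular pdf2image release:
# import pdf2image.pdf2image as p2i
# p2i.Popen = popen_without_window
```

Replacing the module-level name would cover all three functions at once without editing the installed package, provided each of them reaches Popen through that name.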