subprocess.run does not work in Docker, specifically for pdftotext (I/O Error, file not found)

Asked: 2019-09-22 01:29:47

Tags: python docker subprocess python-unittest pdftotext

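This works locally but not in the Docker container. I am trying to run pdftotext inside a Docker container and then unit test it against a PDF file. I am not sure whether I have misunderstood the arguments subprocess.run expects. Is the directory containing the PDF file not being found, is the subprocess call failing, or is this a Docker problem?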

I run docker-compose up myproject python3 -m unittest

My file structure:

myproject
├── extract.py
└── tests
    ├── testExtract.py
    └── testfiles
        └── sample.pdf

The pdftotext method:

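import subprocess
from subprocess import PIPE, STDOUT

def extract(filepath):
    # Run pdftotext on the given file, merging stderr into stdout
    result = subprocess.run(['pdftotext', filepath],
                            stdout=PIPE,
                            stderr=STDOUT)
    # str() of a bytes object yields its repr, e.g. "b'...'"
    text = str(result.stdout)
    return text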
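
A minimal sketch of a stricter variant, in case it helps narrow down which of the three suspects is at fault. It assumes poppler's pdftotext, where passing - as the output file sends the text to stdout. The helper names build_command and extract_text are hypothetical and not part of the original code. The sketch checks that Python itself can see the file before calling the tool, so a path problem shows up as a Python exception rather than as pdftotext output.

```python
import os
import subprocess


def build_command(filepath):
    # '-' as the output file asks pdftotext to write the text to stdout
    # (assumption: poppler-utils pdftotext, as installed in the Dockerfile).
    return ['pdftotext', filepath, '-']


def extract_text(filepath):
    # Fail early, with the absolute path, if Python itself cannot see the file.
    if not os.path.isfile(filepath):
        raise FileNotFoundError(
            'No such file: %s (cwd is %s)' % (os.path.abspath(filepath), os.getcwd()))
    completed = subprocess.run(build_command(filepath),
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE,
                               check=True)  # raise CalledProcessError on non-zero exit
    # Decode instead of str(), so the result is text rather than "b'...'".
    return completed.stdout.decode('utf-8').strip()
```

Including the working directory in the exception message makes it easier to see where the process was actually started inside the container.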

The test method in testExtract.py:

def testGetText(self):
    expected = "b'Grab all text from this sentence.'"
    result = extract('./testfiles/sample.pdf')
    self.assertEqual(result, expected)

sample.pdf contains only the sentence above.
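
For reference, a relative path such as './testfiles/sample.pdf' is resolved against the current working directory of the process, not against the directory containing testExtract.py. The sketch below shows one way to build the path from the test module's own location instead. The helper name fixture_path is hypothetical, and whether the working directory is really the cause here is an assumption.

```python
import os


def fixture_path(test_file, *parts):
    # Resolve a path relative to the directory of the given test module,
    # independent of the directory the test runner was started from.
    base_dir = os.path.dirname(os.path.abspath(test_file))
    return os.path.join(base_dir, *parts)


# In testExtract.py this would be called as:
#     fixture_path(__file__, 'testfiles', 'sample.pdf')
print(fixture_path('/code/tests/testExtract.py', 'testfiles', 'sample.pdf'))
```

With this approach the test would behave the same whether unittest is started from /code or from the tests directory.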

With stderr set to STDOUT, I get the I/O error below. If I set stderr=subprocess.PIPE instead, I get an empty byte string: ""

Traceback:

FAIL: testGetText 
----------------------------------------------------------------------
Traceback (most recent call last):

AssertionError: 'b"I/O Error: Couldn\'t open file \'./tes[51 chars]ry."' != "b'Grab all text from this sentence.'"
- b"I/O Error: Couldn't open file './testFiles/1SentenceFile.pdf': No such file or directory."
+ b'Grab all text from this sentence.'
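
The difference between the two stderr settings can be reproduced without pdftotext. In the sketch below a child Python process stands in for pdftotext (an assumption made purely for illustration) and writes one message to stderr. With stderr=STDOUT the message is merged into stdout, while with stderr=PIPE stdout stays empty and the message is found on stderr. That would be consistent with the I/O error only being visible in the first case.

```python
import subprocess
import sys

# Stand-in for a tool that prints an error to stderr and nothing to stdout.
child = [sys.executable, '-c', "import sys; sys.stderr.write('I/O Error')"]

merged = subprocess.run(child, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
separate = subprocess.run(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

print(merged.stdout)    # the error text arrives on stdout
print(separate.stdout)  # empty: nothing was written to stdout
print(separate.stderr)  # the error text is here instead
```

So the empty byte string probably does not mean the call succeeded; the same error is likely sitting in the result's stderr attribute.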

Edit: docker-compose.yml

version: "3"

services:
    myproject:
        build:
            context: .
            dockerfile: ./myproject/Dockerfile
        depends_on:
            - database
        volumes:
            #- ./otherproject/logs:/code/logs
              # Dev specific
            - ./myproject:/code
        stdin_open: true
        tty: true
    database:
        # Possibly could do configurations here and skip database DOCKERFILE
        build:
            context: .
            dockerfile: ./database/Dockerfile
        environment:
            POSTGRES_USER_FILE: /secrets/dbUser.txt
            POSTGRES_PASSWORD_FILE: /secrets/dbPassword.txt
        ports:
            - "8765:5432"
        volumes:
            - otherdata:/var/lib/postgresql/data


myproject/Dockerfile:

FROM python:3.7.3-stretch

RUN apt-get update && apt-get install -y \
    poppler-utils

ARG moduleDir=myproject

WORKDIR /code

COPY secrets/ /secrets

# COPY $moduleDir/myproject/ ./

# Is executed before bind mount is implemented so need to COPY.
COPY $moduleDir/requirements/common.txt requirements/common.txt
COPY $moduleDir/requirements/development.txt requirements/development.txt
COPY .pylintrc ./

RUN pip install -r requirements/development.txt

0 answers:

No answers.