Docker: importing local modules in an executable Python script

Time: 2019-06-19 15:51:18

Tags: python docker docker-compose dockerfile

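First of all, I'm new to Docker, so I apologize if my question is silly or a bother.

I'm building a multi-container application and am having trouble bringing up a service that runs fine locally. Here is part of my Docker architecture: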

Docker Compose

version: '2'

services:
  dashboard:
    build: demo-dashboard/
    ports:
     - "8080:8080"
    environment:
      - ES_ENDPOINT_EXTERNAL=http://localhost:9200
      - http.cors.enabled=true
      - http.cors.allow-origin=ES_ENDPOINT_EXTERNAL
      - http.cors.allow-headers=Content-Type, Access-Control-Allow-Headers, Authorization, X-Requested-With
      - http.cors.allow-credentials=true
    volumes:
     - ./demo-dashboard:/usr/src/app
    networks:
      - dashboard-network

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.7.0
    environment:
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - http.cors.enabled=true
      - http.cors.allow-origin=http://localhost:8080
      - http.cors.allow-headers=Content-Type, Access-Control-Allow-Headers, Authorization, X-Requested-With
      - http.cors.allow-credentials=true
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    mem_limit: 1g
    cap_add:
      - IPC_LOCK
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    networks:
      - dashboard-network
    ports:
      - 9200:9200

  orchestrator:
    image: orchestrator-mabsed
    build: orchestrator/
    volumes:
      - './orchestrator/tasks.py:/soneti/tasks.py'
    ports:
     - "8082:8082"
    environment:
      ES_HOST: 'elasticsearch'
    tty: true

volumes:
  esdata1:
    driver: local

networks:
  dashboard-network:
    driver: bridge

The failing service is the orchestrator, whose Dockerfile looks like this:

FROM python:3.6

COPY requirements.txt .
RUN pip install --user -r requirements.txt

COPY tasks.py .
COPY detector/ .
COPY filter/ .
COPY lemmatizer/ .

COPY my_sched.py .
RUN python my_sched.py

my_sched.py is a scheduler script. Every 5 minutes it launches the Luigi pipeline defined in tasks.py, which checks whether certain requirements are met. This is my_sched.py:

import sched, time
import sys
import subprocess

s = sched.scheduler(time.time, time.sleep)

def cron():
    s.enter(5*60, 1, cron, [])
    command = '{} -m luigi --local-scheduler --module tasks Main'.format(sys.executable)
    output = subprocess.check_output(command.split(), shell= False)

s.enter(5, 1, cron, [])
s.run()

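The problem is that tasks.py needs to import components from the local folder filter, and the Docker container does not seem to find them.

The import section of tasks.py looks like this: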

import sys
sys.dont_write_bytecode = True

# orchestration
import luigi
from luigi.contrib.esindex import CopyToIndex
import datetime

# filter
import os
import csv
import json
from filter.filter import filter_spam

# lemmatizer
from lemmatizer.lemmatizer import lemmatize
from cube.api import Cube

# detection
sys.path.insert(0, './detector/')
from detect_events import main as detect_events

lemmatizer = Cube(verbose=True)
lemmatizer.load("es", tokenization=False, parsing=False)
...

This is the output of docker-compose up:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/.local/lib/python3.6/site-packages/luigi/__main__.py", line 20, in <module>
    luigi_run()
  File "/root/.local/lib/python3.6/site-packages/luigi/cmdline.py", line 9, in luigi_run
    run_with_retcodes(argv)
  File "/root/.local/lib/python3.6/site-packages/luigi/retcodes.py", line 70, in run_with_retcodes
    with luigi.cmdline_parser.CmdlineParser.global_instance(argv):
  File "/usr/local/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/root/.local/lib/python3.6/site-packages/luigi/cmdline_parser.py", line 52, in global_instance
    new_value = CmdlineParser(cmdline_args)
  File "/root/.local/lib/python3.6/site-packages/luigi/cmdline_parser.py", line 64, in __init__
    self._attempt_load_module(known_args)
  File "/root/.local/lib/python3.6/site-packages/luigi/cmdline_parser.py", line 142, in _attempt_load_module
    __import__(module)
  File "/tasks.py", line 13, in <module>
    from filter.filter import filter_spam
ModuleNotFoundError: No module named 'filter.filter'; 'filter' is not a package
Traceback (most recent call last):
  File "my_sched.py", line 13, in <module>
    s.run()
  File "/usr/local/lib/python3.6/sched.py", line 154, in run
    action(*argument, **kwargs)
  File "my_sched.py", line 10, in cron
    output = subprocess.check_output(command.split(), shell= False)
  File "/usr/local/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/local/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/local/bin/python', '-m', 'luigi', '--local-scheduler', '--module', 'tasks', 'Main']' returned non-zero exit status 1.
ERROR: Service 'orchestrator' failed to build: The command '/bin/sh -c python my_sched.py' returned a non-zero code: 1

Finally, an overview of the working directory may help:

MABSED/
|_ docker-compose.yml
|_ ...
|_ orchestrator/
   |_ requirements.txt
   |_ tasks.py
   |_ my_sched.py
   |_ data/
   |_ detector/
   |_ filter/
      |_ filter.py
      |_ __init__.py
   |_ lemmatizer/

To stress the point: everything runs perfectly locally, and the problem only appears when trying to move it to Docker.

1 answer:

Answer 0 (score: 0)

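Browsing the running container, I found that the folders had not been created. Each COPY with a folder as its source and . as its destination copies only the folder's contents, not the folder itself, so filter.py ended up at the top level and the path filter.filter did not exist.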
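The same failure can be reproduced outside Docker. The following is a minimal sketch, not the asker's code: `FILTER_SRC` is a hypothetical stand-in for filter.py (its real contents are not shown in the question), and `make_flat` and `make_nested` are illustrative helpers that mimic what `COPY filter/ .` and `COPY filter/ filter/` leave on disk. It assumes the interpreter puts the working directory first on sys.path, which is the default for `python -c`.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the real filter.py, which the question does not show.
FILTER_SRC = "def filter_spam(text):\n    return text\n"
IMPORT_STMT = "from filter.filter import filter_spam; print(filter_spam('ok'))"


def make_flat(root):
    """Mimic `COPY filter/ .`: only the folder's contents land in root."""
    (root / "filter.py").write_text(FILTER_SRC)
    (root / "__init__.py").write_text("")


def make_nested(root):
    """Mimic `COPY filter/ filter/`: the folder itself is preserved."""
    pkg = root / "filter"
    pkg.mkdir()
    (pkg / "filter.py").write_text(FILTER_SRC)
    (pkg / "__init__.py").write_text("")


def try_import(layout):
    """Build the layout in a temp dir and run the import in a fresh interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        layout(Path(tmp))
        return subprocess.run(
            [sys.executable, "-c", IMPORT_STMT],
            cwd=tmp,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )


if __name__ == "__main__":
    for layout in (make_flat, make_nested):
        result = try_import(layout)
        print(layout.__name__, "-> exit status", result.returncode)
```

With the flat layout, `filter` resolves to the plain module filter.py, so `filter.filter` cannot exist; this matches the "'filter' is not a package" message in the traceback. With the nested layout the import succeeds.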

One possible solution is to edit the service's Dockerfile and copy everything in a single command instead of file by file, like this:

COPY . /

Or, if you don't want to copy every file and folder, you can fix this by naming the folder in the destination path:

COPY filter/ filter/
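After rebuilding, one way to check the result from inside the container is to ask Python how it resolves each name. This is a small sketch that assumes it is run from the directory containing tasks.py; `is_package` is an illustrative helper, not part of Luigi or Docker.

```python
import importlib.util


def is_package(name):
    """Return True if `name` resolves to a package, False if it is a plain
    module, and None if it cannot be found at all."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Only packages carry a list of locations to search for submodules.
    return spec.submodule_search_locations is not None


if __name__ == "__main__":
    # Folder names taken from the directory listing in the question.
    for name in ("filter", "lemmatizer"):
        print(name, "->", is_package(name))
```

If `filter` is reported as a plain module rather than a package, the folder's contents were flattened into the working directory again.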