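I am following the tutorial here to try to run a script through AWS Batch. Specifically, the entrypoint script is the same: it is a script that downloads, from an S3 bucket, the code to be executed in AWS Batch. However, no matter how I try to execute it on AWS, I always get: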
CannotStartContainerError: API error (400): OCI runtime create failed:
container_linux.go:348: starting container process caused "exec:
\"/usr/local/bin/fetch_and_run.sh\":
stat /usr/local/bin/fetch_and_run.sh: no such file or directory": unknown
I am able to start the same container locally.
I launch the job from the awscli with the following command:
aws batch submit-job --job-name mss_dev --job-definition mapper \
--job-queue bio-job-queue \
--container-overrides '{"environment":
[{"name": "BATCH_FILE_S3_URL", "value": "s3://test/myjob.sh"},
{"name": "BATCH_FILE_TYPE", "value": "script"}],
"command":["/usr/local/bin/fetch_and_run.sh"]}'
My Dockerfile is as follows:
FROM amazonlinux:latest
# General dependencies and user
## aws-cli installed twice (here for root, later for user)
RUN yum -y install which unzip tar wget aws-cli curl sudo
RUN yum -y groupinstall 'Development Tools'
RUN yum -y install gcc git curl make zlib-devel bzip2 bzip2-devel readline-devel sqlite sqlite-devel openssl openssl-devel
RUN yum -y install java-1.8.0-openjdk.x86_64
## User and work directory
RUN groupadd -r user && useradd -mr -g user -d /home/user -s /sbin/nologin -c "Docker image user" user
RUN echo "user ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
ENV HOME /home/user
## Change user to user
USER user
ENV USER user
RUN sh -c "$(curl -fsSL https://raw.githubusercontent.com/Linuxbrew/install/master/install.sh)" && echo 'export PATH="/home/linuxbrew/.linuxbrew/bin:$PATH"' >>~/.profile
## GNU parallel 10 seconds installation
#WORKDIR $HOME/tools/parallel
#RUN (wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
# RUN brew install gcc
ENV PATH "/home/linuxbrew/.linuxbrew/bin:$PATH"
RUN brew install parallel
# Pyenv
WORKDIR $HOME
RUN git clone git://github.com/yyuu/pyenv.git .pyenv
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
# Python3
RUN pyenv install 3.6.5
RUN pyenv global 3.6.5
RUN pyenv rehash
# Python3 modules
RUN pip install --upgrade pip
RUN pip install --upgrade awscli pandas scipy numpy kneed
# STAR
RUN mkdir -p $HOME/tools/STAR
WORKDIR $HOME/tools/STAR
RUN wget https://github.com/alexdobin/STAR/archive/2.6.1b.tar.gz && tar xvf 2.6.1b.tar.gz
# DropSeq
RUN mkdir -p $HOME/tools/DropSeq
WORKDIR $HOME/tools/DropSeq
RUN wget https://github.com/broadinstitute/Drop-seq/releases/download/v1.13/Drop-seq_tools-1.13.zip && unzip Drop-seq_tools-1.13.zip
# Reference and other files should be downloaded during execution
RUN mkdir -p $HOME/data
RUN mkdir -p $HOME/results
COPY --chown=user:user code /home/user/code
# Copy main files and set entrypoint
WORKDIR /tmp
ADD fetch_and_run.sh /usr/local/bin/fetch_and_run.sh
USER nobody
ENTRYPOINT ["/usr/local/bin/fetch_and_run.sh"]
# To debug
# ENTRYPOINT ["/bin/bash"]
Answer 0 (score: 0)
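The culprit was the job definition (in the AWS console; see "Creating a job definition" in here).
For the ECR repository URI, I forgot to use the URI of the updated image (e.g. 012345678901.dkr.ecr.us-east-1.amazonaws.com/awsbatch/fetch_and_run) and used the default amazonlinux image instead. That stock image does not contain /usr/local/bin/fetch_and_run.sh, which is exactly what the error message says.
The main hint was that I was able to run it locally.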
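To check this from the command line rather than the console, something like the following sketch may help. It assumes the job definition is named mapper, as in the question; the two function names are hypothetical helpers, not part of the AWS CLI, and the ECR pattern is only a rough heuristic.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print "<revision><TAB><image>" for every ACTIVE
# revision of the given job definition.
job_definition_images() {
  aws batch describe-job-definitions \
    --job-definition-name "$1" \
    --status ACTIVE \
    --query 'jobDefinitions[].[revision, containerProperties.image]' \
    --output text
}

# Hypothetical helper: succeed only if the image reference looks like an
# ECR repository URI (rough pattern match, not a full validation).
looks_like_ecr_image() {
  case "$1" in
    *.dkr.ecr.*.amazonaws.com/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With valid credentials, running `job_definition_images mapper` should list the image each revision points at; a revision showing plain amazonlinux instead of the ECR URI would explain the error. The same failure can likely be reproduced locally with `docker run --rm amazonlinux:latest /usr/local/bin/fetch_and_run.sh`, since the stock image has no such file.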