Docker with Slurm srun: error: Unable to create step for job xxxxxx: Error generating job credential

Time: 2020-06-08 03:33:04

Tags: linux docker centos slurm

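I am using a Docker image to submit srun/sbatch jobs to a Slurm grid. Because of some restrictions, I have to package /etc/munge/munge.key and all the executables into a CentOS 7 Docker image and install munge inside it.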

After doing this, I tried to run srun/sbatch and hit the error in the title. I could not find any other logs with more detail.
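Since no further logs are available, one way to narrow things down is to check whether munge itself works inside the container. The following is a minimal sketch, not a confirmed fix: it only uses the standard MUNGE tools `munge -n` and `unmunge`, and the function name `munge_roundtrip` is hypothetical.

```shell
#!/bin/sh
# Sketch: encode a credential with munge and decode it again on the same host.
# munge_roundtrip is a hypothetical helper name, not part of MUNGE or Slurm.
munge_roundtrip() {
    # Both tools ship with the munge package; bail out early if they are absent.
    if ! command -v munge >/dev/null 2>&1 || ! command -v unmunge >/dev/null 2>&1; then
        echo "munge tools not found in PATH"
        return 2
    fi
    # `munge -n` encodes an empty payload; `unmunge` decodes and validates it.
    if munge -n | unmunge >/dev/null 2>&1; then
        echo "local munge round-trip OK"
    else
        echo "local munge round-trip FAILED"
        return 1
    fi
}
```

Running `munge_roundtrip` inside the container shows whether munged is up and can read its key. A local success does not prove that the controller accepts the credential; decoding a credential on the controller host (for example by piping `munge -n` over ssh into `unmunge` there) would be the next check, assuming ssh access to that host exists.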

The Dockerfile looks like this:

FROM centos:7

RUN groupadd -g 802 slurm && useradd -g slurm -u 802 slurm -d /opt/slurm -s /bin/bash
RUN groupadd -g 990 munge && useradd -g munge -u 993 munge -d /etc/munge -s /sbin/nologin


#RUN ls -l /etc/pki/rpm-gpg /usr/share/rhel/secrets/rpm-gpg
#RUN yum -y install epel-release && yum -y clean all

RUN rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

RUN \
  yum -y install openssh-clients openssh-server && \
  yum -y clean all && \
  touch /run/utmp && \
  chmod u+s /usr/bin/ping && \
  sed -i 's|session    required     pam_loginuid.so|session    optional     pam_loginuid.so|g' /etc/pam.d/sshd && \
  mkdir -p /var/run/sshd  

RUN \
  yum install -y \
  java-1.8.0-openjdk \
  java-1.8.0-openjdk-devel

RUN \
  yum -y install gtk2  gtk-devel  munge  munge-devel && \
  yum -y clean all


RUN \
  yum groupinstall -y "Development Tools"

RUN \
  adduser -m jenkins && \
  echo "jenkins:jenkins" | chpasswd && \
  mkdir /home/jenkins/.m2


# remove all munge storage and auth dir
RUN rm -rf /etc/munge /var/run/munge /var/lib/munge /var/log/munge

COPY entrypoint.sh /
COPY .ssh/authorized_keys /home/jenkins/.ssh/authorized_keys

RUN ssh-keygen -A


ENV JAVA_HOME /etc/alternatives/jre
ENV DRMAA_LIBRARY_PATH /opt/drmaa/lib/libdrmaa.so
ENV PATH /opt/slurm/bin:/opt/slurm/sbin:/opt/drmaa/bin:$PATH
ENV LD_LIBRARY_PATH $LD_LIBRARY_PATH

VOLUME ["/var/lib/slurmd", "/var/spool/slurmd", "/var/log/slurm"]

EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]

The volumes look like this:

/var/log/munge:/var/log/munge:rw
/etc/munge:/etc/munge:rw
/var/run/munge:/var/run/munge:rw
/var/lib/munge:/var/lib/munge:rw
/var/log/slurmd.log:/work/slurmd.log:rw
/opt/slurm:/opt/slurm:rw
/opt/drmaa:/opt/drmaa:rw
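Because /etc/munge is bind-mounted from the host, the key file keeps the host's numeric owner UID, which may or may not match the munge user (UID 993) created in the Dockerfile. The sketch below prints the mode and owner of a key file so the two can be compared. It assumes GNU `stat` (as shipped with CentOS 7), and `check_key` is a hypothetical helper name. Treating anything other than mode 400 or 600 as too permissive is an assumption based on munged's usual refusal to use a key that group or others can access.

```shell
#!/bin/sh
# Sketch: report mode and owner UID of a secret key file.
# check_key is a hypothetical helper; it assumes GNU stat (-c format option).
check_key() {
    key="$1"
    if [ ! -f "$key" ]; then
        echo "missing: $key"
        return 2
    fi
    mode=$(stat -c '%a' "$key")   # octal permission bits, e.g. 600
    uid=$(stat -c '%u' "$key")    # numeric owner UID as seen in this namespace
    echo "mode=$mode uid=$uid"
    case "$mode" in
        400|600) return 0 ;;      # readable by the owner only
        *) echo "key is accessible to group or others"; return 1 ;;
    esac
}
```

For example, `check_key /etc/munge/munge.key` inside the container, followed by `id -u munge`, shows whether the mounted key is owned by the same UID that munged is expected to run as.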

The entrypoint is only used to start the munge service and the sshd service:

#!/bin/sh
ssh-keygen -A
munged
slurmd -c
exec /usr/sbin/sshd -D -e 
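The Dockerfile gives slurm and munge explicit UIDs, but creates jenkins with `adduser -m` and no fixed UID. A job credential is issued for the numeric UID of the submitting user, so a user whose UID is unknown or different on the controller is one possible cause of this error. That is an assumption, not something the question confirms. A minimal sketch for printing the identity to compare on both sides, with `ident_of` as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: print "uid:gid" for a user name so it can be compared across hosts.
# ident_of is a hypothetical helper built on the standard `id` utility.
ident_of() {
    user="$1"
    uid=$(id -u "$user" 2>/dev/null) || {
        echo "unknown user: $user"
        return 1
    }
    gid=$(id -g "$user")
    echo "$uid:$gid"
}
```

Running `ident_of jenkins` in the container and on the Slurm controller should print the same pair; if the controller reports an unknown user or a different UID, the identities do not line up.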

0 answers:

No answers yet.