slurmd fails to start

Asked: 2020-03-19 12:04:54

Tags: hpc slurm

I want to run a compute server on my laptop using Slurm. I am trying to run Slurm as described in the documentation (https://slurm.schedmd.com/archive/slurm-18.08.5/quickstart_admin.html):

systemd (optional): enable the appropriate services on each system:
Controller: systemctl enable slurmctld
Database: systemctl enable slurmdbd
Compute Nodes: systemctl enable slurmd 

slurmctld and slurmdbd have started:

● slurmctld.service - Slurm controller daemon
   Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-03-19 18:51:54 +07; 12s ago
     Docs: man:slurmctld(8)
  Process: 23372 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 23374 (slurmctld)
    Tasks: 11
   Memory: 1.8M
   CGroup: /system.slice/slurmctld.service
           └─23374 /usr/sbin/slurmctld

● slurmdbd.service - Slurm DBD accounting daemon
   Loaded: loaded (/lib/systemd/system/slurmdbd.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-03-19 18:52:28 +07; 10s ago
     Docs: man:slurmdbd(8)
  Process: 23409 ExecStart=/usr/sbin/slurmdbd $SLURMDBD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 23411 (slurmdbd)
    Tasks: 1
   Memory: 1.2M
   CGroup: /system.slice/slurmdbd.service
           └─23411 /usr/sbin/slurmdbd

But slurmd fails with an error on startup:

● slurmd.service - Slurm node daemon
   Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2020-03-19 18:52:56 +07; 1s ago
     Docs: man:slurmd(8)
  Process: 23428 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=1/FAILURE)
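
The status output above only reports the exit code, not the cause. A minimal sketch of how the underlying error message could be surfaced, assuming journalctl is available and the binary lives at /usr/sbin/slurmd as the unit file shows. The function name diagnose_slurmd is hypothetical, and the error may instead be in the file set by SlurmdLogFile in slurm.conf:

```shell
# Hypothetical helper: collect the usual evidence for a failing slurmd unit.
diagnose_slurmd() {
    # Last 50 journal entries for the unit; the startup error may be logged
    # here or in the file configured as SlurmdLogFile in slurm.conf.
    journalctl -u slurmd --no-pager -n 50

    # Print the hardware configuration as slurmd detects it, so it can be
    # compared with the node definition in slurm.conf.
    /usr/sbin/slurmd -C

    # Run the daemon in the foreground (-D) with extra verbosity (-vvv)
    # so a startup failure is printed directly to the terminal.
    sudo /usr/sbin/slurmd -D -vvv
}
```

Calling diagnose_slurmd on the compute node runs the three steps in order. If the foreground daemon starts successfully it keeps running until interrupted with Ctrl-C.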

I am also wondering why slurmctld shows 11 tasks running:

Tasks: 11
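
For context, the Tasks line in systemctl status is systemd's count of processes and threads in the unit's control group, not a count of Slurm jobs or job steps. A rough way to cross-check that number, assuming a Linux system with /proc mounted. The helper name count_tasks is hypothetical, and the PID to pass for slurmctld would be the Main PID from the status output:

```shell
# Hypothetical helper: count the kernel tasks (threads) of one process
# by listing its entries under /proc/<pid>/task.
count_tasks() {
    ls "/proc/$1/task" | wc -l
}

# Demonstration on the current shell, which always exists.
count_tasks "$$"
```

For a unit with a single multi-threaded process, the figure for its Main PID should be close to what systemd reports as Tasks.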

Maybe this helps: the value of NODELIST(REASON) is (Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions)
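
That reason is what a pending job shows when no usable node is available, which would be consistent with slurmd not running on the node. A rough sketch for inspecting the node state, assuming the Slurm client tools are on PATH. The name show_node_state is hypothetical, and the node name is supplied by the caller:

```shell
# Hypothetical helper: show why a node is unavailable to the scheduler.
show_node_state() {
    # List the reason recorded for each down, drained or failing node.
    sinfo -R

    # Full record for one node, including its State and Reason fields.
    scontrol show node "$1"
}
```

It would be called with the node name as defined in slurm.conf, for example show_node_state followed by that name.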

I start it with sudo:

sudo systemctl start slurmd

0 answers:

No answers