I want to use my laptop as a compute server running Slurm. I am trying to start Slurm as described in the documentation (https://slurm.schedmd.com/archive/slurm-18.08.5/quickstart_admin.html):
systemd (optional): enable the appropriate services on each system:
Controller: systemctl enable slurmctld
Database: systemctl enable slurmdbd
Compute Nodes: systemctl enable slurmd
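On a single machine, the quoted steps could be combined roughly as follows. This is a minimal sketch: it assumes the laptop plays all three roles (controller, database, compute node), the function name enable_slurm_services is hypothetical, and starting slurmdbd before slurmctld is an assumption about a sensible order rather than something the quoted documentation states.

```shell
#!/bin/sh
# Hypothetical helper: enable and start the three Slurm daemons on one host.
# Assumption: this machine is controller, database host, and compute node.
enable_slurm_services() {
    for svc in slurmdbd slurmctld slurmd; do
        # "enable" makes the unit start at boot; "start" launches it now.
        sudo systemctl enable "$svc" || return 1
        sudo systemctl start "$svc" || return 1
    done
}
```

Calling enable_slurm_services stops at the first unit that fails to enable or start, so the failing daemon is the last one it reached.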
slurmctld and slurmdbd started successfully:
● slurmctld.service - Slurm controller daemon
Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2020-03-19 18:51:54 +07; 12s ago
Docs: man:slurmctld(8)
Process: 23372 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 23374 (slurmctld)
Tasks: 11
Memory: 1.8M
CGroup: /system.slice/slurmctld.service
└─23374 /usr/sbin/slurmctld
● slurmdbd.service - Slurm DBD accounting daemon
Loaded: loaded (/lib/systemd/system/slurmdbd.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2020-03-19 18:52:28 +07; 10s ago
Docs: man:slurmdbd(8)
Process: 23409 ExecStart=/usr/sbin/slurmdbd $SLURMDBD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 23411 (slurmdbd)
Tasks: 1
Memory: 1.2M
CGroup: /system.slice/slurmdbd.service
└─23411 /usr/sbin/slurmdbd
But slurmd fails with an error on startup:
● slurmd.service - Slurm node daemon
Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2020-03-19 18:52:56 +07; 1s ago
Docs: man:slurmd(8)
Process: 23428 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=1/FAILURE)
I am also wondering why the slurmctld status shows 11 tasks:
Tasks: 11
Maybe you can help me. In addition, the value of NODELIST(REASON) is (Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions).
I start slurmd with sudo:
sudo systemctl start slurmd
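The status output for slurmd does not show why the daemon exited. Below is a minimal sketch of commands that may reveal the cause, assuming systemd's journal collects the unit's output. The helper names show_unit_log and debug_slurmd are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the last 50 journal lines for a systemd unit.
show_unit_log() {
    journalctl -u "$1" -n 50 --no-pager
}

# Hypothetical helper: run slurmd in the foreground (-D) with extra
# verbosity (-vvv) so startup errors are printed to the terminal.
debug_slurmd() {
    sudo slurmd -D -vvv
}
```

Running show_unit_log slurmd and then debug_slurmd should surface the startup error. Separately, sinfo -R lists the reason recorded for nodes that are down or drained, which may relate to the NODELIST(REASON) message.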