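I am working with HTCondor and submitting a job whose submit file ends with Queue 100, so the pool's central manager machine queues 100 subtasks. When I submit the task with the condor_submit command, I can see one of the pool's 3 machines become active. The problem is that when I run condor_q, only a single machine appears to be handling all 100 jobs. Yet when I run condor_status to check the state of every machine in the pool, all of them are in the Claimed state with Busy activity, which in theory is correct. As I said, though, when I look at the active queue, everything points to a single machine in the pool. If I run the same condor_q command on the other machines, no job queue is shown.
My question is whether this is normal and, if it is not, what I should do to correct it.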
This is the submit file:
Universe = vanilla
Executable = test
Arguments = 8888888888
Should_transfer_files = yes
Log = test.$(Process).log
Output = test.$(Process).out
Error = text.$(Process).error
Priority = 20
Queue 100
This is the 00debconf file on the pool's central manager machine:
# This is the DebConf-generated configuration for HTCondor
#
# DO NOT edit this file, as changes will be overwritten during package
# upgrades. Instead place custom configuration into either
# /etc/condor/condor_config.local or another file in /etc/condor/config.d Use
# the latter location if you need to overwrite/complement settings in the
# DebConf-generated configuration.
# which HTCondor daemons to run on this machine
DAEMON_LIST = COLLECTOR, NEGOTIATOR, SCHEDD, STARTD, MASTER
# who receives emails when something goes wrong
CONDOR_ADMIN = carlos@master
# how much memory should NOT be available to HTCondor
RESERVED_MEMORY =
# label to identify the local filesystem in a HTCondor pool
FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
# label to identify the user id of the system in a HTCondor pool
# (this need to be a fully qualified domain name)
UID_DOMAIN = $(FULL_HOSTNAME)
# which machine is the central manager of this HTCondor pool
CONDOR_HOST = 192.168.0.15
# what machines can access HTCondor daemons on this machine
ALLOW_WRITE = 192.168.0.*
ALLOW_NEGOTIATOR = $(CONDOR_HOST) $(IP_ADDRESS) 127.*
This is the pool's current queue:
carlos@master:~/Documentos/condor.jobs.d$ condor_q
-- Schedd: master : <192.168.0.15:9618?... @ 06/01/19 17:16:30
OWNER   BATCH_NAME  SUBMITTED   DONE  RUN  IDLE  TOTAL  JOB_IDS
carlos  CMD: test   6/1 16:48     86    3    11    100  13.86-99
14 jobs; 0 completed, 0 removed, 11 idle, 3 running, 0 held, 0 suspended
Thank you.