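I can't submit a job with the mem attribute. I'm new to this, and after two days of googling I'm asking for help here. Any suggestions would be greatly appreciated!
Here is what I did:
1. Submitted my script: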
qsub -S /bin/bash -A assembly -pe threads 16 -l mem=2GB -cwd -N "pBcR_correct_asm" -j y -o /dev/null runCorrection.sh
Unable to run job: unknown resource "mem".
Exiting.
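2. Using "h" in place of "host", as suggested in SGE unknown resource "nodes", had solved an earlier problem for me, so I tried "m" in place of "mem", but that did not work.
3. After googling, I learned that "h" is the shortcut for "hostname", defined under "/opt/gridengine/util/resources/centry/", which can be confirmed with "qconf -sc":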
qconf -sc
#name shortcut type relop requestable consumable default urgency
#----------------------------------------------------------------------------------------
arch a RESTRING == YES NO NONE 0
calendar c RESTRING == YES NO NONE 0
cpu cpu DOUBLE >= YES NO 0 0
display_win_gui dwg BOOL == YES NO 0 0
h_core h_core MEMORY <= YES NO 0 0
h_cpu h_cpu TIME <= YES NO 0:0:0 0
h_data h_data MEMORY <= YES NO 0 0
h_fsize h_fsize MEMORY <= YES NO 0 0
h_rss h_rss MEMORY <= YES NO 0 0
h_rt h_rt TIME <= YES NO 0:0:0 0
h_stack h_stack MEMORY <= YES NO 0 0
h_vmem h_vmem MEMORY <= YES NO 0 0
hostname h HOST == YES NO NONE 0
load_avg la DOUBLE >= NO NO 0 0
load_long ll DOUBLE >= NO NO 0 0
load_medium lm DOUBLE >= NO NO 0 0
load_short ls DOUBLE >= NO NO 0 0
m_core core INT <= YES NO 0 0
m_socket socket INT <= YES NO 0 0
m_topology topo RESTRING == YES NO NONE 0
m_topology_inuse utopo RESTRING == YES NO NONE 0
mem_free mf MEMORY <= YES NO 0 0
mem_total mt MEMORY <= YES NO 0 0
mem_used mu MEMORY >= YES NO 0 0
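4. So I used "mt" in place of "mem", but then it complained about the attribute. Judging from the output above, mem_total looks almost the same as "hostname", which worked before. After going through the SGE guide I thought a JSV might be the problem, but I could not find any script under the directory "/opt/gridengine/util/resources/jsv" that contains "Unable to run job: attribute ...". I guess I have to configure some files, but which files are they, and what should I do?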
qsub -S /bin/bash -A assembly -pe threads 16 -l mt=2GB -cwd -N "pBcR_correct_asm" -j y -o test.out runCorrection.sh
Unable to run job: attribute "mem_total" is not a memory value.
Exiting.
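The listing in step 3 can also be filtered instead of read by eye, which makes it easier to see that no complex is called "mem" and that "mt" resolves to mem_total. A minimal sketch, using a few rows copied from the output above as sample input; the function name `lookup_complex` is made up for illustration, and on a real cluster the same filter would be fed from `qconf -sc`:

```shell
#!/bin/sh
# Hypothetical helper: print name, shortcut and type of a complex
# whose full name or shortcut equals the given key.
lookup_complex() {
    awk -v key="$1" '
        /^#/ { next }                                   # skip header and comment lines
        $1 == key || $2 == key { print $1, $2, $3 }     # match on name or shortcut
    '
}

# Sample rows copied from the qconf -sc output above.
sample='#name shortcut type relop requestable consumable default urgency
h_vmem h_vmem MEMORY <= YES NO 0 0
hostname h HOST == YES NO NONE 0
mem_total mt MEMORY <= YES NO 0 0'

printf '%s\n' "$sample" | lookup_complex mt     # prints: mem_total mt MEMORY
printf '%s\n' "$sample" | lookup_complex mem    # prints nothing: no complex is named "mem"

# On a cluster (not run here): qconf -sc | lookup_complex h_vmem
```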
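Answer 0 (score: 1)
@Vince!
Thank you very much for your reply.
In the end I solved my problem by using "h_vmem=2g" ("2GB" gives an error), but I don't know where to find out how values for a complex of type MEMORY are supposed to be written.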
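The likely reason "2GB" is rejected is the trailing "B": as far as I understand the sge_types man page, a memory value is an integer optionally followed by a single multiplier letter (k, K, m, M, g or G), so "2g" and "2G" parse while "2GB" does not. A rough pre-check under that assumption; the function name is made up, and the pattern only covers decimal integers, not the hexadecimal or octal forms:

```shell
#!/bin/sh
# Hypothetical pre-check: succeeds if the argument looks like an SGE
# memory value (decimal integer plus an optional one-letter multiplier).
is_sge_mem_value() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+[kKmMgG]?$'
}

for v in 2g 2G 2048M 2GB; do
    if is_sge_mem_value "$v"; then
        echo "$v: ok"
    else
        echo "$v: not a memory value"
    fi
done

# A request that passes the check, based on the command from the question
# (not run here):
# qsub -S /bin/bash -A assembly -pe threads 16 -l h_vmem=2g -cwd -N "pBcR_correct_asm" -j y -o /dev/null runCorrection.sh
```

Lowercase and uppercase letters are not the same multiplier (powers of 1000 versus powers of 1024, if I read the man page correctly), so "2g" is slightly less than "2G".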
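The information below is no longer needed.
I read the site you pointed me to and set the complexes h_vmem and s_vmem to "consumable", but it did not work. I think I have to configure the queue's "complex_values", which is currently "NONE". However, I cannot open the page that might tell me how to configure it: http://gridscheduler.sourceforge.net/htmlman/htmlman5/sge_types.html?pathrev=V62u5_TAG. Am I right to configure the queue? Do I have to configure the hosts as well?
Any suggestions would be greatly appreciated!
Here is what I did:
1. Changed the consumable attribute to "YES" for h_vmem and s_vmem: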
qconf -sc
#name shortcut type relop requestable consumable default urgency
#----------------------------------------------------------------------------------------
arch a RESTRING == YES NO NONE 0
calendar c RESTRING == YES NO NONE 0
cpu cpu DOUBLE >= YES NO 0 0
display_win_gui dwg BOOL == YES NO 0 0
h_core h_core MEMORY <= YES NO 0 0
h_cpu h_cpu TIME <= YES NO 0:0:0 0
h_data h_data MEMORY <= YES NO 0 0
h_fsize h_fsize MEMORY <= YES NO 0 0
h_rss h_rss MEMORY <= YES NO 0 0
h_rt h_rt TIME <= YES NO 0:0:0 0
h_stack h_stack MEMORY <= YES NO 0 0
h_vmem h_vmem MEMORY <= YES YES 0 0
hostname h HOST == YES NO NONE 0
load_avg la DOUBLE >= NO NO 0 0
load_long ll DOUBLE >= NO NO 0 0
load_medium lm DOUBLE >= NO NO 0 0
load_short ls DOUBLE >= NO NO 0 0
m_core core INT <= YES NO 0 0
m_socket socket INT <= YES NO 0 0
m_topology topo RESTRING == YES NO NONE 0
m_topology_inuse utopo RESTRING == YES NO NONE 0
mem_free mf MEMORY <= YES NO 0 0
mem_total mt MEMORY <= YES NO 0 0
mem_used mu MEMORY >= YES NO 0 0
min_cpu_interval mci TIME <= NO NO 0:0:0 0
np_load_avg nla DOUBLE >= NO NO 0 0
np_load_long nll DOUBLE >= NO NO 0 0
np_load_medium nlm DOUBLE >= NO NO 0 0
np_load_short nls DOUBLE >= NO NO 0 0
num_proc p INT == YES NO 0 0
qname q RESTRING == YES NO NONE 0
rerun re BOOL == NO NO 0 0
s_core s_core MEMORY <= YES NO 0 0
s_cpu s_cpu TIME <= YES NO 0:0:0 0
s_data s_data MEMORY <= YES NO 0 0
s_fsize s_fsize MEMORY <= YES NO 0 0
s_rss s_rss MEMORY <= YES NO 0 0
s_rt s_rt TIME <= YES NO 0:0:0 0
s_stack s_stack MEMORY <= YES NO 0 0
s_vmem s_vmem MEMORY <= YES YES 0 0
seq_no seq INT == NO NO 0 0
slots s INT <= YES YES 1 1000
swap_free sf MEMORY <= YES NO 0 0
swap_rate sr MEMORY >= YES NO 0 0
swap_rsvd srsv MEMORY >= YES NO 0 0
swap_total st MEMORY <= YES NO 0 0
swap_used su MEMORY >= YES NO 0 0
tmpdir tmp RESTRING == NO NO NONE 0
virtual_free vf MEMORY <= YES NO 0 0
virtual_total vt MEMORY <= YES NO 0 0
virtual_used vu MEMORY >= YES NO 0 0
# >#< starts a comment but comments are not saved across edits --------
2. Submitted my job to the smp.q queue, and it complained about the same problem:
qsub -S /bin/bash -A assembly -q smp.q -pe newPe 16 -l h_vmem=2GB -cwd -N "pBcR_correct_asm" -j y -o runCorrection.sh
Unable to run job: attribute "h_vmem" is not a memory value.
Exiting.
3. Information for smp.q. I think "complex_values" should be changed and "h_vmem" can stay as it is:
qconf -sq smp.q
qname smp.q
hostlist @smp.q
seq_no 0
load_thresholds np_load_avg=1.75
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:05:00
priority 0
min_cpu_interval 00:05:00
processors UNDEFINED
qtype BATCH INTERACTIVE
ckpt_list NONE
pe_list make newPe
rerun FALSE
slots 160
tmpdir /tmp
shell /bin/csh
prolog NONE
epilog NONE
shell_start_mode posix_compliant
starter_method NONE
suspend_method NONE
resume_method NONE
terminate_method NONE
notify 00:00:60
owner_list NONE
user_lists NONE
xuser_lists NONE
subordinate_list NONE
complex_values NONE
projects NONE
xprojects NONE
calendar NONE
initial_state default
s_rt INFINITY
h_rt INFINITY
s_cpu INFINITY
h_cpu INFINITY
s_fsize INFINITY
h_fsize INFINITY
s_data INFINITY
h_data INFINITY
s_stack INFINITY
h_stack INFINITY
s_core INFINITY
h_core INFINITY
s_rss INFINITY
h_rss INFINITY
s_vmem INFINITY
h_vmem INFINITY
4. Information for a host in @smp.q:
qconf -sconf smp03.local
#smp03.local:
mailer /bin/mail
xterm /usr/bin/X11/xterm
execd_spool_dir /opt/gridengine/default/spool
5. Global configuration. Should I add h_vmem and s_vmem here?
qconf -sconf
#global:
execd_spool_dir /opt/gridengine/default/spool
mailer /bin/mail
xterm /usr/bin/X11/xterm
load_sensor none
prolog none
epilog none
shell_start_mode posix_compliant
login_shells sh,ksh,csh,tcsh
min_uid 0
min_gid 0
user_lists none
xuser_lists none
projects none
xprojects none
enforce_project false
enforce_user auto
load_report_time 00:00:40
max_unheard 00:05:00
reschedule_unknown 00:00:00
loglevel log_warning
administrator_mail none
set_token_cmd none
pag_cmd none
token_extend_time none
shepherd_cmd none
qmaster_params none
execd_params ENABLE_ADDGRP_KILL=TRUE H_MEMORYLOCKED=infinity
reporting_params accounting=true reporting=true \
flush_time=00:00:15 joblog=true sharelog=00:00:00
finished_jobs 100
gid_range 20000-20100
qlogin_command builtin
qlogin_daemon builtin
rlogin_command builtin
rlogin_daemon builtin
rsh_command builtin
rsh_daemon builtin
max_aj_instances 2000
max_aj_tasks 75000
max_u_jobs 0
max_jobs 0
max_advance_reservations 0
auto_user_oticket 0
auto_user_fshare 0
auto_user_default_project none
auto_user_delete_time 86400
delegated_file_staging false
reprioritize 0
jsv_url none
jsv_allowed_mod ac,h,i,e,o,j,M,N,p,w
Answer 1 (score: 0)
What you probably want is h_vmem. At least, that is the attribute I always use to specify how much memory I want a job to request.
See:
http://gridscheduler.sourceforge.net/htmlman/htmlman5/queue_conf.html?pathrev=V62u5_TAG
Specifically:
The resource limit parameters s_vmem and h_vmem are implemented by Sun Grid Engine as a job limit. They impose a limit on the amount of combined virtual memory consumed by all the processes in the job. If h_vmem is exceeded by a job running in the queue, it is aborted via a SIGKILL signal (see kill(1)). If s_vmem is exceeded, the job is sent a SIGXCPU signal which can be caught by the job. If you wish to allow a job to be "warned" so it can exit gracefully before it is killed then you should set the s_vmem limit to a lower value than h_vmem. For parallel processes, the limit is applied per slot which means that the limit is multiplied by the number of slots being used by the job before being applied.
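Because the limit is applied per slot, the request from the question (16 slots with h_vmem=2g) would add up to 32g for the whole job. The arithmetic, as a small sketch with made-up variable names:

```shell
#!/bin/sh
slots=16          # from "-pe threads 16" in the question
per_slot_gb=2     # from "-l h_vmem=2g"

# The per-slot limit is multiplied by the number of slots in use.
total_gb=$((slots * per_slot_gb))
echo "h_vmem per slot: ${per_slot_gb}g, total for the job: ${total_gb}g"
```

So the value passed to `-l h_vmem=` should be the memory needed per slot, not the total for the job.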
In addition, you may need to use qconf to set it up as a consumable.
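A sketch of what that might look like. `qconf -mc` opens the complex list in an editor, where the consumable column of the h_vmem row can be changed from NO to YES. A consumable also needs a capacity, which is commonly attached to each execution host through complex_values. The host name is taken from the question and the 64G capacity is a made-up value, so adjust both to the real machine. The commands are only printed here, not executed:

```shell
#!/bin/sh
host=smp03.local     # host name taken from the question
capacity=64G         # hypothetical: memory to make requestable on that host

# Step 1 (interactive): edit the h_vmem row so the consumable column reads YES.
echo "qconf -mc"

# Step 2: set the capacity of the consumable on the execution host.
cmd="qconf -mattr exechost complex_values h_vmem=${capacity} ${host}"
echo "$cmd"
```

Running the printed commands requires manager rights on the cluster; `qconf -se smp03.local` should then list h_vmem under complex_values.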