I am writing a Snakemake pipeline, and some of its rules run through qsub. One of these rules is called run_MSA, and its code is:
rule run_MSA:
    input:
        file=model_dir + "/{projectID}_MODEL_{remove_lower_t}_{remove_higher_t}/{projectID}_start.fasta"
    output:
        alignment=model_dir + "/{projectID}_MODEL_{remove_lower_t}_{remove_higher_t}/align1.phy"
    threads: cluster_config['run_MSA']['threads']
    log:
        "logs/quantizyme_model_MSA/{projectID}_filtering_{remove_lower_t}_{remove_higher_t}.log"
    shell:
        """
        clustalo \
            --threads {threads} \
            -i {input} \
            -o {output.alignment} \
            --outfmt=phy > {log} 2>&1
        """
The cluster.yaml file is:
__default__:
  time: 1:00:00
  threads: 20
run_MSA:
  time: 1:00:00
  threads: 18
run_MSA_subtree:
  time: 1:00:00
  threads: 10
subtreeing2:
  time: 1:00:00
  threads: 15
I have run into something strange: when I run the Snakemake pipeline with the -n (dry-run) option, as shown below, everything seems fine:
snakemake -nrp \
--config projectID=ref_AA2 remove_lower_t=100 remove_higher_t=1500 remove_seqs=TRUE subtrees=3 \
-j 100 --cluster-config cluster.yaml \
--cluster 'qsub -V -l h_rt={cluster.time} -pe smp {cluster.threads} -cwd -j y' \
-s quantizyme_model.2.snakefile
Building DAG of jobs...
Job counts:
    count   jobs
    1       all
    1       clustering
    1       compress_out_folder
    1       run_MSA
    3       run_MSA_subtree
    3       subtreeing1
    3       subtreeing2
    13
[Wed Oct 24 18:22:59 2018]
rule run_MSA:
    input: models/ref_AA2_MODEL_100_1500/ref_AA2_start.fasta
    output: models/ref_AA2_MODEL_100_1500/align1.phy
    log: logs/quantizyme_model_MSA/ref_AA2_filtering_100_1500.log
    jobid: 12
    reason: Missing output files: models/ref_AA2_MODEL_100_1500/align1.phy
    wildcards: projectID=ref_AA2, remove_lower_t=100, remove_higher_t=1500
    threads: 18
clustalo --threads 18 -i models/ref_AA2_MODEL_100_1500/ref_AA2_start.fasta -o models/ref_AA2_MODEL_100_1500/align1.phy --outfmt=phy > logs/quantizyme_model_MSA/ref_AA2_filtering_100_1500.log 2>&1
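For simplicity, I am only reporting the standard output for the rule run_MSA. However, when I run the same command line without the -n option, I get this error: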
KeyError in line 65 of /nfs4/my-gridfront/mykopat-proj3/mykopat-metatrans/Quanty_test2/quantizyme_snakemake_test_3/quantizyme_model.2.snakefile:
'run_MSA'
File "/nfs4/my-gridfront/mykopat-proj3/mykopat-metatrans/Quanty_test2/quantizyme_snakemake_test_3/quantizyme_model.2.snakefile", line 65, in <module>
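To make the failure concrete: the only place the key 'run_MSA' is looked up in the rule above is the expression cluster_config['run_MSA']['threads'], so line 65 of the Snakefile is presumably that threads line. The following is a minimal Python sketch, not Snakemake code, that assumes cluster_config behaves like a plain dict and that it may be empty in some execution context (both are assumptions on my part). The helper threads_for and the dicts loaded and empty are hypothetical names used only for illustration.

```python
def threads_for(rule_name, cluster_config, fallback=1):
    """Return a rule's thread count, falling back to __default__, then to a constant.

    Hypothetical helper: it only illustrates a defensive dict lookup.
    """
    rule_cfg = cluster_config.get(rule_name, {})
    default_cfg = cluster_config.get("__default__", {})
    return rule_cfg.get("threads", default_cfg.get("threads", fallback))


# Stand-in for the parsed cluster.yaml (assumed shape: rule name -> settings).
loaded = {
    "__default__": {"time": "1:00:00", "threads": 20},
    "run_MSA": {"time": "1:00:00", "threads": 18},
}
# Stand-in for a context in which no cluster config was loaded (assumption).
empty = {}

# Direct indexing, as in the rule, fails when the key is absent.
try:
    empty["run_MSA"]["threads"]
except KeyError as err:
    print("KeyError:", err)

# The defensive lookup works in both situations.
print(threads_for("run_MSA", loaded))
print(threads_for("run_MSA", empty))
```

If the dictionary really is empty during execution, a lookup of this kind would avoid the crash, although it would not explain why the same key is found during the dry run.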
So it seems to me that when Snakemake runs as a dry run, the key is read correctly, but during actual execution it is not. What am I missing?
Many thanks,
Domenico