I need to call copy number variants with GATK in a Snakemake workflow, and one step scatters each chromosome's interval list inside a checkpoint:
import os

rule all:
    input:
        'aggregated/chr1'

# step that gives non-zero exit code error
checkpoint scattering:
    input:
        interval = 'gcfiltered_{chr}.interval_list'
    output:
        directory('scatter_{chr}')
    shell:
        'mkdir -p {output} && '
        'gatk --java-options "-Xmx8G" IntervalListTools '
        '--INPUT {input.interval} '
        '--SUBDIVISION_MODE INTERVAL_COUNT '
        '--SCATTER_CONTENT 600 '
        '--OUTPUT {output}'

def aggregate_scatter(wildcards):
    # asking for the checkpoint output defers this function until scattering has run
    checkpoint_output = checkpoints.scattering.get(**wildcards).output[0]
    return expand('scatter_{chr}/{i}/scattered.interval_list',
                  chr=wildcards.chr,
                  i=glob_wildcards(os.path.join(checkpoint_output, '{i}/scattered.interval_list')).i)

# dummy rule to check if scattered files will be aggregated:
rule aggregate:
    input:
        aggregate_scatter
    output:
        "aggregated/{chr}"
    shell:
        "cat {input} > {output}"
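However, the scattering rule fails. Although the output is produced correctly, Snakemake reports an error (exited with non-zero exit code) and deletes the output. I tried appending || true, but that did not help.
When I run the same command outside Snakemake, it works fine and exits with status 0: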
mkdir -p scatter_chr1 && gatk --java-options "-Xmx8G" IntervalListTools --INPUT gcfiltered_chr1.interval_list --SUBDIVISION_MODE INTERVAL_COUNT --SCATTER_CONTENT 600 --OUTPUT scatter_chr1
echo $?
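One difference between the two runs that may be worth ruling out: by default Snakemake prefixes bash shell commands with `set -euo pipefail`, which an interactive shell normally does not use. A minimal sketch for re-running the command under that mode follows. `run_strict` is a hypothetical helper, and it assumes `bash` (and, for the example call, `gatk`) is on the PATH.

```python
import subprocess


def run_strict(command):
    """Run command in bash with 'set -euo pipefail' and return its exit code."""
    result = subprocess.run(["bash", "-c", "set -euo pipefail; " + command])
    return result.returncode


if __name__ == "__main__":
    # The same command as above, unchanged.
    cmd = (
        "mkdir -p scatter_chr1 && "
        'gatk --java-options "-Xmx8G" IntervalListTools '
        "--INPUT gcfiltered_chr1.interval_list "
        "--SUBDIVISION_MODE INTERVAL_COUNT "
        "--SCATTER_CONTENT 600 "
        "--OUTPUT scatter_chr1"
    )
    print("exit status:", run_strict(cmd))
```

If the status is still 0 here, strict mode is probably not what makes the rule fail inside Snakemake.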
I am using Snakemake 5.5.4 and GATK 4.1.2.0. Example input file (gcfiltered_chr1.interval_list):
@HD VN:1.6
@SQ SN:chr1 LN:122678785 UR:file:canFam3.fa M5:e4671b339daa96b7f11eb0b68fd999d8
chr1 100000 100999 + .
chr1 101000 101999 + .
chr1 102000 102999 + .
chr1 103000 103999 + .
chr1 104000 104999 + .
chr1 105000 105999 + .
chr1 106000 106999 + .
chr1 107000 107999 + .
chr1 108000 108999 + .
chr1 109000 109999 + .
chr1 110000 110999 + .
chr1 111000 111999 + .
chr1 112000 112999 + .
chr1 113000 113999 + .
chr1 114000 114999 + .
chr1 115000 115999 + .
chr1 116000 116999 + .
chr1 117000 117999 + .
chr1 118000 118999 + .
chr1 119000 119999 + .
chr1 120000 120999 + .
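To get a rough idea of how many scatter chunks to expect from a file like this, the intervals can be counted with a short script. This is a sketch under assumptions: `count_intervals` and `expected_chunks` are hypothetical helpers, header lines are taken to be those starting with `@`, and the estimate assumes that with `--SUBDIVISION_MODE INTERVAL_COUNT` the `--SCATTER_CONTENT` value is the target number of intervals per output file, so the actual tool may split differently.

```python
import math


def count_intervals(lines):
    """Count non-header, non-blank lines of an interval_list."""
    return sum(1 for line in lines if line.strip() and not line.startswith("@"))


def expected_chunks(n_intervals, scatter_content=600):
    """Rough chunk estimate: ceil(n_intervals / scatter_content)."""
    return math.ceil(n_intervals / scatter_content)


if __name__ == "__main__":
    with open("gcfiltered_chr1.interval_list") as handle:
        n = count_intervals(handle)
    print(n, "intervals, about", expected_chunks(n), "chunk(s)")
```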