How to solve a problem with samtools_index

Date: 2019-06-04 02:47:54

Tags: python shell bioinformatics snakemake qsub

I wrote a Snakemake workflow that runs from fastp to gatk_bqsr, and I ran into this problem in the index_bqsr rule:

Submitted job 440 with external jobid 'Your job 870 ("snakejob.mpileup.440.sh") has been submitted'.
Waiting at most 5 seconds for missing files.
MissingOutputException in line 179 of s01_prepare_analysis.py:
Missing files after 5 seconds:
gatk_bqsr/S54.bqsr.bam.bai

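This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.

I am sure that the input and output of this rule (index_bqsr) are correct, so what is going wrong?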
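
The "Waiting at most 5 seconds for missing files" message can be pictured as a polling loop: after a job finishes, the declared outputs are checked repeatedly until they appear or the wait expires. The following is a minimal sketch of that idea, not Snakemake's actual implementation; the function name `wait_for_files` and its parameters are hypothetical and chosen only for illustration.

```python
import os
import time


def wait_for_files(paths, latency_wait=5.0, poll_interval=0.5):
    """Poll until every path exists or latency_wait seconds have passed.

    Returns the list of paths that are still missing at the end.
    Hypothetical helper for illustration; not part of Snakemake.
    """
    deadline = time.monotonic() + latency_wait
    missing = [p for p in paths if not os.path.exists(p)]
    while missing and time.monotonic() < deadline:
        time.sleep(poll_interval)
        # Re-check only the files that had not shown up yet.
        missing = [p for p in missing if not os.path.exists(p)]
    return missing
```

Calling `wait_for_files(["gatk_bqsr/S54.bqsr.bam.bai"], latency_wait=5)` returns an empty list if the index becomes visible within 5 seconds and the list of missing paths otherwise. On a cluster where jobs are submitted through qsub, a file written on a compute node may become visible on the submitting host only after a delay, which is the situation a longer wait is meant to cover.
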

This is on CentOS 7 with Python 3.5:

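  1. I reran the workflow without changing anything, and everything worked.
  2. I deleted gatk_bqsr/S54.bqsr.bam.bai and reran the workflow, and hit the same error.
  3. I extracted the relevant code blocks and tested them. I could not reproduce the error.

The relevant rules, with their line numbers in s01_prepare_analysis.py: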

    164 rule gatk_bqsr2:
    165     input:
    166         "bwa/{sample}.rmdup.bam",
    167         "gatk_bqsr/{sample}.recal.table"
    168     output:
    169         "gatk_bqsr/{sample}.bqsr.bam"
    170     log:
    171         "log/gatk_bqsr/{sample}.bqsr.log"
    172     params:
    173         "--java-options \"-Xmx15g -Djava.io.tmpdir=tmp\""
    174     shell:
    175         "{GATK} {params} ApplyBQSR "
    176         "--bqsr-recal-file {input[1]} -R {REFERENCE} "
    177         "-I {input[0]} -O {output} 1>{log} 2>&1"
    178
    179 rule index_bqsr:
    180     input:
    181         "gatk_bqsr/{sample}.bqsr.bam"
    182     output:
    183         "gatk_bqsr/{sample}.bqsr.bam.bai"
    184     shell:
    185         "{SAMtools} index {input}"
    186
    187 rule bedtools_rawbam:
    188     input:
    189         "bwa/{sample}.sort.bam",
    190         "gatk_bqsr/{sample}.bqsr.bam.bai"
    191     output:
    192         "bedtools/{sample}.sort.bedgraph"
    193     log:
    194         "log/bedtools/{sample}.sort.bedgraph.log"
    195     shell:
    196         "{BEDtools} genomecov -ibam {input[0]} -bga > "
    197         "{output} 1>{log} 2>&1"

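
A side note on the bedtools_rawbam rule above: its shell command redirects stdout twice (`> {output}` followed by `1>{log}`). This is probably unrelated to the missing index file, but in a POSIX shell the later redirection wins, so the first file is created empty. The sketch below reproduces that behavior with `echo` standing in for bedtools; the file names `out.bedgraph` and `out.log` and the helper `run_double_redirect` are placeholders for illustration, and `sh` is assumed to be available.

```python
import os
import subprocess
import tempfile


def run_double_redirect(workdir):
    """Run a command that redirects stdout twice; return both files' contents."""
    # Same shape as '> {output} 1>{log} 2>&1'; echo stands in for bedtools.
    subprocess.run(
        ["sh", "-c", "echo coverage > out.bedgraph 1> out.log 2>&1"],
        cwd=workdir,
        check=True,
    )
    contents = {}
    for name in ("out.bedgraph", "out.log"):
        with open(os.path.join(workdir, name)) as handle:
            contents[name] = handle.read()
    return contents


with tempfile.TemporaryDirectory() as tmp:
    # out.bedgraph ends up empty; the text lands in out.log.
    print(run_double_redirect(tmp))
```

If the coverage is meant to end up in `{output}`, one option would be to send only stderr to the log, for example `> {output} 2> {log}`.
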
  4. The original error message is:
[Mon Jun  3 21:13:43 2019]
rule index_bqsr:
    input: gatk_bqsr/S54.bqsr.bam
    output: gatk_bqsr/S54.bqsr.bam.bai
    jobid: 541
    wildcards: sample=S54

Waiting at most 5 seconds for missing files.
MissingOutputException in line 179 of s01_prepare_analysis.py:
Missing files after 5 seconds:
gatk_bqsr/S54.bqsr.bam.bai

This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.

  5. I extracted the relevant code blocks and tested them, and everything worked. The test code is:
    SAMtools = "/home/my/anaconda2/bin/samtools"

    SAMPLE = ["S54",]

    rule all:
        input:
            expand("log/bedtools/{sample}.sort.bedgraph.log",sample=SAMPLE)

    rule gatk_bqsr2:
        input:
            "bwa/{sample}.rmdup.bam",
            "gatk_bqsr/{sample}.recal.table"
        output:
            "gatk_bqsr/{sample}.bqsr.bam"
        log:
            "log/gatk_bqsr/{sample}.bqsr.log"
        params:
            "--java-options \"-Xmx15g -Djava.io.tmpdir=tmp\""
        shell:
            "echo \"{GATK} {params} ApplyBQSR "
            "--bqsr-recal-file {input[1]} -R {REFERENCE} "
            "-I {input[0]} -O {output}\""

    rule index_bqsr:
        input:
            "gatk_bqsr/{sample}.bqsr.bam"
        output:
            "gatk_bqsr/{sample}.bqsr.bam.bai"
        shell:
            "{SAMtools} index {input}"

    rule bedtools_rawbam:
        input:
            "bwa/{sample}.sort.bam",
            "gatk_bqsr/{sample}.bqsr.bam.bai"
        output:
            "log/bedtools/{sample}.sort.bedgraph.log"
        log:
            "log/bedtools/{sample}.sort.bedgraph.log"
        shell:
            "echo \"bedtools genomecov -ibam {input[0]} -bga > "
            "{output}\" > {log}"

Adding the DAG here:

[Image: DAG of the workflow]

0 answers:

No answers yet.