snakemake always reports "MissingOutputException in line 44" with "Missing files after 5 seconds"

Time: 2019-04-27 06:48:17

Tags: pipeline snakemake rna-seq

Snakemake always reports the same error in my RNA-seq pipeline:

MissingOutputException in line 44 of /root/s/r/snakemake/my_rnaseq_data/Snakefile:
Missing files after 5 seconds:
03_align/wt2.bam
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.

Here is my Snakefile:

SBT=["wt1","wt2","epcr1","epcr2"]

rule all:
    input:
        expand("02_clean/{nico}_1.paired.fq", nico=SBT),
        expand("02_clean/{nico}_2.paired.fq", nico=SBT),
        expand("03_align/{nico}.bam", nico=SBT)

rule trim:
    input:
        "01_raw/{nico}_1.fastq",
        "01_raw/{nico}_2.fastq"
    output:
        "02_clean/{nico}_1.paired.fq.gz",
        "02_clean/{nico}_1.unpaired.fq.gz",
        "02_clean/{nico}_2.paired.fq.gz",
        "02_clean/{nico}_2.unpaired.fq.gz",
    shell:
        "java -jar /software/Trimmomatic-0.36/trimmomatic-0.36.jar PE -threads 16 {input[0]} {input[1]} {output[0]} {output[1]} {output[2]} {output[3]} ILLUMINACLIP:/software/Trimmomatic-0.36/adapters/TruSeq3-PE-2.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 &"

rule gzip:
    input:
        "02_clean/{nico}_1.paired.fq.gz",
        "02_clean/{nico}_2.paired.fq.gz"
    output:
        "02_clean/{nico}_1.paired.fq",
        "02_clean/{nico}_2.paired.fq"
    run:
        shell("gzip -d {input[0]} > {output[0]}")
        shell("gzip -d {input[1]} > {output[1]}")

rule map:
    input:
        "02_clean/{nico}_1.paired.fq",
        "02_clean/{nico}_2.paired.fq"
    output:
        "03_align/{nico}.sam"
    log:
        "logs/map/{nico}.log"
    threads: 40
    shell:
        "hisat2 -p 20 --dta -x /root/s/r/p/A_th/WT-Al_VS_WT-CK/index/tair10 -1 {input[0]} -2 {input[1]} -S {output} >{log} 2>&1 &"

rule sort2bam:
    input:
        "03_align/{nico}.sam"
    output:
        "03_align/{nico}.bam"
    threads:30
    shell:
        "samtools sort -@ 20 -m 20G -o {output} {input} &"

Everything worked fine until I added the "rule sort2bam" part.

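A dry run completes without problems. But when I actually execute the pipeline, it reports the error described above. Surprisingly, the task that Snakemake reports as failed keeps running in the background. Each invocation only ever starts one task, like this: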

rule sort2bam:
    input: 03_align/epcr1.sam
    output: 03_align/epcr1.bam
    jobid: 11
    wildcards: nico=epcr1

Waiting at most 5 seconds for missing files.
MissingOutputException in line 45 of /root/s/r/snakemake/my_rnaseq_data/Snakefile:
Missing files after 5 seconds:
03_align/epcr1.bam
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
[Sat Apr 27 06:10:22 2019]
rule sort2bam:
    input: 03_align/wt1.sam
    output: 03_align/wt1.bam
    jobid: 9
    wildcards: nico=wt1

Waiting at most 5 seconds for missing files.
MissingOutputException in line 45 of /root/s/r/snakemake/my_rnaseq_data/Snakefile:
Missing files after 5 seconds:
03_align/wt1.bam
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

[Sat Apr 27 06:23:13 2019]
rule sort2bam:
    input: 03_align/wt2.sam
    output: 03_align/wt2.bam
    jobid: 6
    wildcards: nico=wt2

Waiting at most 5 seconds for missing files.
MissingOutputException in line 44 of /root/s/r/snakemake/my_rnaseq_data/Snakefile:
Missing files after 5 seconds:
03_align/wt2.bam
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

I don't know what is wrong with my code. Any ideas? Thanks in advance!

2 answers:

Answer 0 (score: 2)

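As you have figured out, the & is the problem. The control operator & makes your command run in the background in a subshell, so the shell returns immediately and Snakemake concludes that the job is complete when in fact it is still running. Snakemake then checks for the output file, does not find it, and raises MissingOutputException. In your case the & does not seem to be needed at all.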

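From man bash, on the use of & (borrowed from this answer):

  If a command is terminated by the control operator &, the shell
  executes the command in the background in a subshell. The shell does
  not wait for the command to finish, and the return status is 0.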
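
The behaviour quoted from man bash can be reproduced outside Snakemake. The following is only a minimal sketch, assuming a POSIX shell with sleep and touch available; the helper run_and_check and the file names are hypothetical and merely stand in for a rule's shell command and its declared output:

```python
import os
import shlex
import subprocess
import tempfile
import time


def run_and_check(command, out_path):
    """Run a shell command; return (exit status, whether out_path exists right after)."""
    result = subprocess.run(command, shell=True)
    return result.returncode, os.path.exists(out_path)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical output files standing in for something like 03_align/wt2.bam.
    bg_out = os.path.join(tmp, "background.txt")
    fg_out = os.path.join(tmp, "foreground.txt")

    # Trailing "&": the shell returns at once with status 0, before the file exists.
    bg = run_and_check(f"(sleep 1; touch {shlex.quote(bg_out)}) &", bg_out)

    # No "&": the shell waits, so the file exists when the command returns.
    fg = run_and_check(f"sleep 1; touch {shlex.quote(fg_out)}", fg_out)

    time.sleep(2)  # let the background job finish before the directory is removed
    bg_eventually = os.path.exists(bg_out)

print("background:", bg)
print("foreground:", fg)
print("background output appeared later:", bg_eventually)
```

In the background case the command reports success while its output is still missing, which is the situation Snakemake is in when it starts waiting for 03_align/wt2.bam. Note that the shell commands of the trim and map rules in the Snakefile above also end in &, so the same reasoning would apply to them.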

Answer 1 (score: 1)

I know how to fix it, but I don't know why it works! Just delete the "&" at the end of

samtools sort -@ 20 -m 20G -o {output} {input} &