snakemake: MissingOutputException inside docker

Date: 2018-11-05 10:00:53

Tags: docker snakemake

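I am trying to run a pipeline with snakemake inside docker. I am having a problem with the sortmerna tool, which should produce the {sample}_merged_sorted_mRNA and {sample}_merged_sorted outputs from the control_merged.fq and treated_merged.fq input files.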

Here is my Snakefile:

    SAMPLES = ["control", "treated"]
    for smp in SAMPLES:
        print("Sample " + smp + " will be processed")

    rule final:
        input:
            expand('/output/{sample}_merged.fq', sample=SAMPLES),
            expand('/output/{sample}_merged_sorted', sample=SAMPLES),
            expand('/output/{sample}_merged_sorted_mRNA', sample=SAMPLES),

    rule sortmerna:
        input:
            '/output/{sample}_merged.fq',
        output:
            merged_file='/output/{sample}_merged_sorted_mRNA',
            merged_sorted='/output/{sample}_merged_sorted',
        message: """---SORTING---"""
        shell:
            '''
            sortmerna --ref /usr/share/sortmerna/rRNA_databases/silva-bac-23s-id98.fasta,/usr/share/sortmerna/rRNA_databases/index/silva-bac-23s-id98: --reads {input} --paired_in -a 16 --log --fastx --aligned {output.merged_file} --other {output.merged_sorted} -v
            '''

When I run it, I get:

    Waiting at most 5 seconds for missing files.
    MissingOutputException in line 57 of /input/Snakefile:
    Missing files after 5 seconds:
    /output/control_merged_sorted_mRNA
    /output/control_merged_sorted

    This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.

    Shutting down, this might take some time.
    Exiting because a job execution failed. Look above for error message
    Complete log: /input/.snakemake/log/2018-11-05T091643.911334.snakemake.log

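I tried increasing the latency with --latency-wait, but got the same result. Interestingly, two output files, control_merged_sorted_mRNA.fq and control_merged_sorted.fq, are generated, but the program still fails and exits. The snakemake version is 5.3.0. Any help?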

1 answer:

Answer 0: (score: 0)

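snakemake fails because the outputs declared by the rule sortmerna are never generated under those names. This is not a latency problem but a problem with your output names.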
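To see why a longer wait cannot help, here is a simplified Python illustration of a "wait for missing files" check. It is a sketch of the general idea only, not Snakemake's actual implementation, and the function name `wait_for_outputs` and its parameters are hypothetical.

```python
import os
import time


def wait_for_outputs(paths, latency_wait=5.0, poll_interval=0.1):
    """Poll until every path exists or latency_wait seconds have elapsed.

    Returns the list of paths that are still missing at the end.
    """
    deadline = time.monotonic() + latency_wait
    missing = [p for p in paths if not os.path.exists(p)]
    while missing and time.monotonic() < deadline:
        time.sleep(poll_interval)
        # Re-check only the paths that were missing on the previous pass.
        missing = [p for p in missing if not os.path.exists(p)]
    return missing
```

A file written as /output/control_merged_sorted.fq never satisfies a check for /output/control_merged_sorted, so the path stays in the returned list no matter how large `latency_wait` is.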

Your rule sortmerna declares as output:

    /output/control_merged_sorted_mRNA
    /output/control_merged_sorted

but the program you are using (I know nothing about sortmerna) is apparently producing:

    /output/control_merged_sorted_mRNA.fq
    /output/control_merged_sorted.fq
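One way to confirm this kind of mismatch is to look, for each declared output that is missing, for files on disk whose name is the declared path plus an extra suffix. The helper below is a minimal sketch; `find_suffixed_variants` is a hypothetical name and it is not part of snakemake or sortmerna.

```python
import glob
import os


def find_suffixed_variants(expected_paths):
    """For each expected path that does not exist, list the existing files
    named '<expected path>.<something>'.

    Returns a dict mapping each missing path to a sorted list of candidates.
    """
    report = {}
    for path in expected_paths:
        if os.path.exists(path):
            continue  # declared output is present, nothing to report
        # glob.escape keeps special characters in the path from being
        # interpreted as wildcards; only the trailing '.*' is a pattern.
        report[path] = sorted(glob.glob(glob.escape(path) + ".*"))
    return report
```

Called with the two paths from the error message, a result such as `{'/output/control_merged_sorted': ['/output/control_merged_sorted.fq'], ...}` would indicate that the tool appended a suffix to the name it was given.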
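Check how the program treats the values you pass to --aligned and --other on the command line: either they are the real names of the files it writes, or they are only base names to which the program appends the suffix .fq. If it is the latter, I suggest you use: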

    rule final:
        input:
            expand('/output/{sample}_merged.fq', sample=SAMPLES),
            # targets must match the file names declared as output of rule sortmerna
            expand('/output/{sample}_merged_sorted.fq', sample=SAMPLES),
            expand('/output/{sample}_merged_sorted_mRNA.fq', sample=SAMPLES),

    rule sortmerna:
        input:
            '/output/{sample}_merged.fq',
        output:
            merged_file='/output/{sample}_merged_sorted_mRNA.fq',
            merged_sorted='/output/{sample}_merged_sorted.fq'
        params:
            # base names passed to the tool, assuming it appends .fq itself
            merged_file_basename='/output/{sample}_merged_sorted_mRNA',
            merged_sorted_basename='/output/{sample}_merged_sorted'
        message: """---SORTING---"""
        shell:
            """
            sortmerna --ref /usr/share/sortmerna/rRNA_databases/silva-bac-23s-id98.fasta,/usr/share/sortmerna/rRNA_databases/index/silva-bac-23s-id98: --reads {input} --paired_in -a 16 --log --fastx --aligned {params.merged_file_basename} --other {params.merged_sorted_basename} -v
            """