Snakemake refuses to unpack an input function when rule A is a dependency of rule B, but accepts it when rule A is the final rule

Time: 2019-03-05 16:45:04

Tags: python-3.x bioinformatics snakemake

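I have a Snakemake workflow for a metagenomics project. At some point in the workflow, I map DNA sequencing reads (either single-end or paired-end) to metagenome assemblies made by the same workflow. Following the Snakemake manual, I wrote an input function so that a single rule maps both single-end and paired-end reads, like this: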

import os.path
def get_binning_reads(wildcards):
    pathpe=("data/sequencing_binning_signals/" + wildcards.binningsignal + ".trimmed_paired.R1.fastq.gz")
    pathse=("data/sequencing_binning_signals/" + wildcards.binningsignal + ".trimmed.fastq.gz")
    if os.path.isfile(pathpe) == True :
      return {'reads' : expand("data/sequencing_binning_signals/{binningsignal}.trimmed_paired.R{PE}.fastq.gz", PE=[1,2],binningsignal=wildcards.binningsignal) }
    elif os.path.isfile(pathse) == True :
      return {'reads' : expand("data/sequencing_binning_signals/{binningsignal}.trimmed.fastq.gz", binningsignal=wildcards.binningsignal) }

rule backmap_bwa_mem:
  input:
    unpack(get_binning_reads),
    index=expand("data/assembly_{{assemblytype}}/{{hostcode}}/scaffolds_bwa_index/scaffolds.{ext}",ext=['bwt','pac','ann','sa','amb'])
  params:
    lambda w: expand("data/assembly_{assemblytype}/{hostcode}/scaffolds_bwa_index/scaffolds",assemblytype=w.assemblytype,hostcode=w.hostcode)
  output:
    "data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.bam"
  threads: 100
  log:
    stdout="logs/bwa_backmap_samtools_{assemblytype}_{hostcode}.stdout",
    samstderr="logs/bwa_backmap_samtools_{assemblytype}_{hostcode}.stdout",
    stderr="logs/bwa_backmap_{assemblytype}_{hostcode}.stderr"
  shell:
    "bwa mem -t {threads} {params} {input.reads} 2> {log.stderr} | samtools view -@ 12 -b -o {output}  2> {log.samstderr} > {log.stdout}"

When I make an arbitrary "rule all" like this, the workflow runs successfully.

rule allbackmapped:
  input:
    expand("data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.bam",       binningsignal=BINNINGSIGNALS,assemblytype=ASSEMBLYTYPES,hostcode=HOSTCODES)

However, when a subsequent rule like this one needs the files created by this rule:

rule backmap_samtools_sort:
  input:
    "data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.bam"
  output:
    "data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.sorted.bam"
  threads: 6
  resources:
    mem_mb=5000
  shell:
    "samtools sort -@ {threads} -m {mem_mb}M -o {output} {input}"

rule allsorted:
  input:
    expand("data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.sorted.bam",binningsignal=BINNINGSIGNALS,assemblytype=ASSEMBLYTYPES,hostcode=HOSTCODES)

the workflow exits with this error:


WorkflowError in line 416 of /stor/azolla_metagenome/Azolla_genus_metagenome/Snakefile: Can only use unpack() on list and dicts
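
To make the error message concrete, here is a minimal sketch of the input function outside Snakemake. It assumes the message is raised when the function returns something that is neither a list nor a dict. The `base` parameter, the `SimpleNamespace` stand-in for the wildcards object, and the sample names `sampleA` and `sampleA.sorted` are hypothetical and only serve the demonstration; `expand()` is replaced by plain Python so the snippet runs without Snakemake.

```python
import os.path
import tempfile
from types import SimpleNamespace

def get_binning_reads(wildcards, base):
    """Simplified copy of the input function; `base` is a hypothetical
    parameter added so the check can run in a temporary directory."""
    name = wildcards.binningsignal
    pathpe = os.path.join(base, name + ".trimmed_paired.R1.fastq.gz")
    pathse = os.path.join(base, name + ".trimmed.fastq.gz")
    if os.path.isfile(pathpe):
        return {"reads": [os.path.join(base, f"{name}.trimmed_paired.R{pe}.fastq.gz")
                          for pe in (1, 2)]}
    elif os.path.isfile(pathse):
        return {"reads": [pathse]}
    # Neither file exists: the function falls through and returns None.

with tempfile.TemporaryDirectory() as base:
    # Create one single-end read file for the hypothetical sample "sampleA".
    open(os.path.join(base, "sampleA.trimmed.fastq.gz"), "w").close()
    found = get_binning_reads(SimpleNamespace(binningsignal="sampleA"), base)
    missing = get_binning_reads(SimpleNamespace(binningsignal="sampleA.sorted"), base)

print(type(found).__name__)  # dict
print(missing)               # None
```

If this is what happens in the workflow, raising an explicit exception at the end of the function that names the wildcard value would probably give a more informative message than the generic unpack() error.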

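To me, this error suggests that something is wrong with the input function of the former rule. However, it does seem to run successfully when no subsequent processing is queued.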
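
One possible reason for the difference, offered as a hypothesis rather than a confirmed diagnosis: Snakemake wildcards match the regular expression `.+` by default, so the output pattern of backmap_bwa_mem can also match a requested `.sorted.bam` file. The sketch below imitates that matching with plain `re`. The helper `pattern_to_regex` is hypothetical (it is not Snakemake's own implementation), and the values `spades`, `hostX` and `sampleA` are made up.

```python
import re

def pattern_to_regex(pattern):
    """Hypothetical helper: turn 'dir/{name}.bam' into a regex in which
    every wildcard matches '.+', as Snakemake wildcards do by default."""
    # re.split with a capturing group alternates: literal, name, literal, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)          # literal text
        else:
            regex += f"(?P<{part}>.+)"        # wildcard
    return re.compile(regex + "$")

# Output pattern of rule backmap_bwa_mem, copied from the question.
bam_output = "data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.bam"
# A target that rule allsorted would request (made-up wildcard values).
target = "data/assembly_spades_binningsignals/hostX/sampleA.sorted.bam"

match = pattern_to_regex(bam_output).match(target)
print(match.groupdict()["binningsignal"])  # sampleA.sorted
```

With `binningsignal` set to `sampleA.sorted`, neither read file would exist, which would be consistent with the input function returning nothing. If that is indeed the cause, restricting the wildcard with Snakemake's `wildcard_constraints` directive may avoid the ambiguous match.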

The whole project is hosted on GitHub, including the entire Snakefile and a GitHub issue.

0 answers:

No answers yet.