我有一个用于宏基因组学项目的snakemake工作流程。在工作流程中的某个时刻,我将DNA测序读数(单端或成对末端)映射到由同一工作流程制成的元基因组组件。我使输入函数符合Snakemake手册,以一条规则映射单端和成对的末端读数。像这样
import os.path
def get_binning_reads(wildcards):
pathpe=("data/sequencing_binning_signals/" + wildcards.binningsignal + ".trimmed_paired.R1.fastq.gz")
pathse=("data/sequencing_binning_signals/" + wildcards.binningsignal + ".trimmed.fastq.gz")
if os.path.isfile(pathpe) == True :
return {'reads' : expand("data/sequencing_binning_signals/{binningsignal}.trimmed_paired.R{PE}.fastq.gz", PE=[1,2],binningsignal=wildcards.binningsignal) }
elif os.path.isfile(pathse) == True :
return {'reads' : expand("data/sequencing_binning_signals/{binningsignal}.trimmed.fastq.gz", binningsignal=wildcards.binningsignal) }
rule backmap_bwa_mem:
input:
unpack(get_binning_reads),
index=expand("data/assembly_{{assemblytype}}/{{hostcode}}/scaffolds_bwa_index/scaffolds.{ext}",ext=['bwt','pac','ann','sa','amb'])
params:
lambda w: expand("data/assembly_{assemblytype}/{hostcode}/scaffolds_bwa_index/scaffolds",assemblytype=w.assemblytype,hostcode=w.hostcode)
output:
"data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.bam"
threads: 100
log:
stdout="logs/bwa_backmap_samtools_{assemblytype}_{hostcode}.stdout",
samstderr="logs/bwa_backmap_samtools_{assemblytype}_{hostcode}.stdout",
stderr="logs/bwa_backmap_{assemblytype}_{hostcode}.stderr"
shell:
"bwa mem -t {threads} {params} {input.reads} 2> {log.stderr} | samtools view -@ 12 -b -o {output} 2> {log.samstderr} > {log.stdout}"
当我像这样做出任意“全部规则”时,工作流成功运行。
rule allbackmapped:
input:
expand("data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.bam", binningsignal=BINNINGSIGNALS,assemblytype=ASSEMBLYTYPES,hostcode=HOSTCODES)
但是,当像这样的后续规则需要此规则创建的文件时:
rule backmap_samtools_sort:
input:
"data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.bam"
output:
"data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.sorted.bam"
threads: 6
resources:
mem_mb=5000
shell:
"samtools sort -@ {threads} -m {mem_mb}M -o {output} {input}"
rule allsorted:
input:
expand("data/assembly_{assemblytype}_binningsignals/{hostcode}/{binningsignal}.sorted.bam",binningsignal=BINNINGSIGNALS,assemblytype=ASSEMBLYTYPES,hostcode=HOSTCODES)
工作流程因此错误而关闭
第416行的WorkflowError / stor / azolla_metagenome / Azolla_genus_metagenome / Snakefile:只能 在列表和字典上使用unpack()
对我来说,此错误表明前一条规则的输入功能有问题。但是,当没有后续处理排队时,它似乎无法成功运行。
整个项目都托管在github上。整个Snakefile和一个github issue。