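I have this rule in my Snakemake file. When I launch it, the input files are populated from all of the entries in my YAML file. I would like each bwa job to be populated from a single key of units instead. Here are the rule, the YAML file (incomplete), and the dry-run result.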
rule bwa_mem:
    input:
        dt=expand("trim/{sample}/", sample=config['units']),
        forward_paired=expand("trim/{sample}/{sample}_forward_paired.fq.gz", sample=config['units']),
        reverse_paired=expand("trim/{sample}/{sample}_reverse_paired.fq.gz", sample=config['units']),
        forward_unpaired=expand("trim/{sample}/{sample}_forward_unpaired.fq.gz", sample=config['units']),
        reverse_unpaired=expand("trim/{sample}/{sample}_reverse_unpaired.fq.gz", sample=config['units']),
    output:
        temp("mapped_reads/sam/{unit}.sam")
    params:
        genome=config["reference"]['genome_fasta']
    log:
        "mapped_reads/log/{unit}_bwa_mem.log"
    benchmark:
        "benchmarks/bwa/mem/{unit}.txt"
    threads: 8
    shell:
        '/illumina/software/PROG2/bwa-0.7.15/bwa mem '\
        '-t {threads} {params.genome} {input.forward_paired} {input.reverse_paired} {input.forward_unpaired} {input.reverse_unpaired} 2> {log} > {output}'
This is the YAML config file:
'samples':
  '432':
    - '432_L001'
    - '432_L002'
  '433':
    - '433_L002'
    - '433_L001'
  '434':
    - '434_L001'
    - '434_L002'
  '435':
    - '435_L002'
    - '435_L001'
....
'units':
  '432_L001':
    - '/illumina/runs/FASTQ/RAW/432_CGATGT_L001_R1_001.fastq.gz'
    - '/illumina/runs/FASTQ/RAW/432_CGATGT_L001_R2_001.fastq.gz'
  '432_L002':
    - '/illumina/runs/FASTQ/RAW/432_CGATGT_L002_R1_001.fastq.gz'
    - '/illumina/runs/FASTQ/RAW/432_CGATGT_L002_R2_001.fastq.gz'
  '433_L001':
    - '/illumina/runs/FASTQ/RAW/433_CAGATC_L001_R1_001.fastq.gz'
    - '/illumina/runs/FASTQ/RAW/433_CAGATC_L001_R2_001.fastq.gz'
  '433_L002':
    - '/illumina/runs/FASTQ/RAW/433_CAGATC_L002_R1_001.fastq.gz'
    - '/illumina/runs/FASTQ/RAW/433_CAGATC_L002_R2_001.fastq.gz'
  '434_L001':
    - '/illumina/runs/FASTQ/RAW/434_GTGAAA_L001_R1_001.fastq.gz'
    - '/illumina/runs/FASTQ/RAW/434_GTGAAA_L001_R2_001.fastq.gz'
  '434_L002':
    - '/illumina/runs/FASTQ/RAW/434_GTGAAA_L002_R1_001.fastq.gz'
    - '/illumina/runs/FASTQ/RAW/434_GTGAAA_L002_R2_001.fastq.gz'
  '435_L001':
    - '/illumina/runs/FASTQ/RAW/435_ACAGTG_L001_R1_001.fastq.gz'
    - '/illumina/runs/FASTQ/RAW/435_ACAGTG_L001_R2_001.fastq.gz'
When I try a dry run, the bwa rule gives this result:
rule bwa_mem:
    input: trim/432_L001/432_L001_reverse_unpaired.fq.gz,
        trim/432_L002/432_L002_reverse_unpaired.fq.gz,
        trim/433_L001/433_L001_reverse_unpaired.fq.gz,
        trim/433_L002/433_L002_reverse_unpaired.fq.gz,
        trim/434_L001/434_L001_reverse_unpaired.fq.gz,
        trim/434_L002/434_L002_reverse_unpaired.fq.gz,
        trim/435_L001/435_L001_reverse_unpaired.fq.gz,
        trim/435_L002/435_L002_reverse_unpaired.fq.gz,
        trim/436_L001/436_L001_reverse_unpaired.fq.gz,
        trim/436_L002/436_L002_reverse_unpaired.fq.gz,
        trim/437_L001/437_L001_reverse_unpaired.fq.gz,
        trim/437_L002/437_L002_reverse_unpaired.fq.gz,
        trim/438_L003/438_L003_reverse_unpaired.fq.gz,
        trim/438_L004/438_L004_reverse_unpaired.fq.gz,
        trim/lane1_L001/lane1_L001_reverse_paired.fq.gz,
        trim/lane2_L002/lane2_L002_reverse_paired.fq.gz,
        trim/lane8_L008/
    output: mapped_reads/sam/441_L004.sam
    log: mapped_reads/log/441_L004_bwa_mem.log
    jobid: 208
    benchmark: benchmarks/bwa/mem/441_L004.txt
    wildcards: unit=441_L004
For every element of units, the job reports all of the input files... What am I doing wrong?
Answer 0 (score: 2)
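What you are doing here is using the expand function to declare all of these files as inputs of the rule. In other words, you are performing an aggregation: every bwa_mem job receives the trimmed files of every unit in config['units']. What you actually want is only the set of input files for the specific unit being mapped. You can achieve this simply by not using expand for the input files; there is no reason to use it here. Write each input as a plain pattern that uses the same wildcard as the output, for example "trim/{unit}/{unit}_forward_paired.fq.gz", and Snakemake fills in {unit} for each job from the requested output file.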
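To make the difference concrete, here is a minimal plain-Python sketch. It does not import Snakemake: expand_one and resolve are hypothetical helpers that only approximate what expand() and per-job wildcard substitution do for a single wildcard, and units is a trimmed-down stand-in for config['units'] from the question.

```python
# Trimmed-down stand-in for config['units'] (two entries copied from the YAML above).
units = {
    "432_L001": [
        "/illumina/runs/FASTQ/RAW/432_CGATGT_L001_R1_001.fastq.gz",
        "/illumina/runs/FASTQ/RAW/432_CGATGT_L001_R2_001.fastq.gz",
    ],
    "432_L002": [
        "/illumina/runs/FASTQ/RAW/432_CGATGT_L002_R1_001.fastq.gz",
        "/illumina/runs/FASTQ/RAW/432_CGATGT_L002_R2_001.fastq.gz",
    ],
}


def expand_one(pattern, **wildcards):
    """Hypothetical, simplified imitation of expand() for one wildcard:
    returns one formatted path per value (iterating a dict yields its keys)."""
    (name, values), = wildcards.items()
    return [pattern.format(**{name: value}) for value in values]


def resolve(pattern, **wildcards):
    """Hypothetical helper: roughly how a plain input pattern is filled in
    once the wildcards of the requested output file are known."""
    return pattern.format(**wildcards)


# As in the question: expand() over all units, so every job gets every file.
aggregated = expand_one("trim/{sample}/{sample}_forward_paired.fq.gz", sample=units)

# As suggested in the answer: a plain pattern, filled in for one job only.
per_job = resolve("trim/{unit}/{unit}_forward_paired.fq.gz", unit="432_L001")

print(aggregated)  # one path per unit in the config
print(per_job)     # a single path, for the unit of this job
```

The first result is a list that grows with the number of units, which matches the long input list in the dry run; the second is the single file that belongs to the unit named in the output path.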
I strongly recommend working through the entire official Snakemake tutorial, which also covers this kind of problem: http://snakemake.readthedocs.io/en/stable/tutorial/tutorial.html