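I am using expand and glob_wildcards in my Snakemake script. My libraries directory contains two subdirectories, each holding the forward and reverse fastq files, laid out as follows: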
libraries:
- MMETSP1:
- SRR19_1
- SRR19_2
- MMETSP2:
- SRR20_1
- SRR20_2
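I am using the zip option of expand to get all the possible combinations in both directories. What I actually want, though, is just a comma-separated list of the forward files (e.g. SRR19_1,SRR20_1), and the same for the reverse files (e.g. SRR19_2,SRR20_2).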
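To make the target concrete, here is a minimal plain-Python sketch of the two strings I am after. It does not call Snakemake at all; the lists are hypothetical placeholders that mirror the layout above, and paired_paths is a helper name made up for this illustration.

```python
# Hypothetical values mirroring the directory layout above: one run per library.
libraries = ["MMETSP1", "MMETSP2"]
reads = ["SRR19", "SRR20"]


def paired_paths(libraries, reads, mate):
    """Pair each library with its own run (position by position) for one mate."""
    return [f"{lib}/{run}_{mate}.fastq.gz" for lib, run in zip(libraries, reads)]


# One comma-separated string per read direction.
forward = ",".join(paired_paths(libraries, reads, 1))
reverse = ",".join(paired_paths(libraries, reads, 2))

print(forward)
print(reverse)
```

The point is only that each library is matched with its own run, and that the result is a single string rather than a list.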
When I remove the zip option, I get this error:
MissingInputException in line 159 of /projet/fr2424/sib/gniang/script/snakemake/scripts/dntap_test.py:
Missing input files for rule trinity:
/projet/fr2424/sib/gniang/script/snakemake/libraries/MMESTP1/SRR1300320_1.fastq.gz
/projet/fr2424/sib/gniang/script/snakemake/libraries/MMESTP1/SRR1300320_2.fastq.gz
/projet/fr2424/sib/gniang/script/snakemake/libraries/MMETSP2/SRR1300319_2.fastq.gz
/projet/fr2424/sib/gniang/script/snakemake/libraries/MMETSP2/SRR1300319_1.fastq.gz
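As far as I understand, expand without zip combines its keyword lists as a Cartesian product, which would explain why the missing paths pair each library with the other library's run. The sketch below reproduces that effect with itertools.product in plain Python. The lists are my assumption of what glob_wildcards returns for the abbreviated layout above (one entry per matched file); they are not real output.

```python
from itertools import product

# Assumed glob_wildcards result: one entry per matched file, so each
# library and run name appears once per mate.
LIBRARY = ["MMETSP1", "MMETSP1", "MMETSP2", "MMETSP2"]
FASTQ = ["SRR19", "SRR19", "SRR20", "SRR20"]

# Paths that exist on disk: library and run taken position by position.
existing = {f"{lib}/{run}_1.fastq.gz" for lib, run in zip(LIBRARY, FASTQ)}

# Paths requested when every library is combined with every run.
crossed = {f"{lib}/{run}_1.fastq.gz" for lib, run in product(LIBRARY, FASTQ)}

# The difference is the set of files that would be reported as missing.
missing = sorted(crossed - existing)
print(missing)
```

In this toy layout the two missing forward files are exactly the swapped library/run pairs, the same pattern as in the error message.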
I am using this script:
#!/usr/bin/python

# Imports
import os
import glob
import sys

# Get current working directory
dir_path = os.getcwd()

# User-defined output directory
OUT_DIR = config["directories"]["outdir"]
# User-defined input directory
LIBRARY_DIR = config["directories"]["libraries"]
# Relative output directories
# (TRIMMOMATIC_DIR is used below; its definition is not shown in this excerpt)
TRINITY_DIR = OUT_DIR + "trinity_out"

# Wildcards for input files: library, run, and read direction (1 or 2)
(LIBRARY, FASTQ, SENS) = glob_wildcards(LIBRARY_DIR + "{mmetsp}/{reads}_{type}.fastq.gz")

# ALL
rule all:
    input:
        trinity_out = TRINITY_DIR + "/Trinity.fasta",
        forward = directory(TRIMMOMATIC_DIR + "/forward.trimmomatic.paired.fastq"),
        reverse = directory(TRIMMOMATIC_DIR + "/reverse.trimmomatic.paired.fastq")

# TRINITY: This rule is used to de novo assemble filtered FASTQ files into
# contigs.
rule trinity:
    input:
        # No zip here: this is the version that raises the MissingInputException
        forward = expand(LIBRARY_DIR + "{mmetsp}" + "/" + "{reads}_1.fastq.gz", mmetsp=LIBRARY, reads=FASTQ),
        reverse = expand(LIBRARY_DIR + "{mmetsp}" + "/" + "{reads}_2.fastq.gz", mmetsp=LIBRARY, reads=FASTQ),
    output:
        trinity_out = TRINITY_DIR + "/Trinity.fasta",
        forward = directory(TRIMMOMATIC_DIR + "/forward.trimmomatic.paired.fastq"),
        reverse = directory(TRIMMOMATIC_DIR + "/reverse.trimmomatic.paired.fastq"),
    log:
        OUT_DIR + "logs/trinity/trinity.log"
    params:
        max_memory = config["trinity_params"]["max_memory"],
        trinity_dir = directory(TRINITY_DIR),
        trimmomatic_out = directory(TRIMMOMATIC_DIR),
        trimmomatic_params = config["trimmomatic_params"],
        illumina_adaptor = config["samples"]["illumina_adaptor"],
    threads:
        config["threads"]["trinity"]
    run:
        shell("""
            {trinity} \
            --seqType fq \
            --left {input.forward} \
            --right {input.reverse} \
            --trimmomatic \
            --quality_trimming_params "ILLUMINACLIP:{params.illumina_adaptor}:2:30:10 {params.trimmomatic_params}" \
            --normalize_reads \
            --normalize_by_read_set \
            --output {params.trinity_dir} \
            --CPU {threads} \
            --max_memory {params.max_memory} > {log}
            mkdir -p {params.trimmomatic_out}
            mkdir -p {params.trimmomatic_out}/forward.trimmomatic.paired.fastq
            mkdir -p {params.trimmomatic_out}/reverse.trimmomatic.paired.fastq
            mkdir -p {params.trimmomatic_out}/forward.trimmomatic.unpaired.fastq
            mkdir -p {params.trimmomatic_out}/reverse.trimmomatic.unpaired.fastq
            mv {params.trinity_dir}/*P.qtrim.gz {output.forward}
            mv {params.trinity_dir}/*P.qtrim.gz {output.reverse}
            mv {params.trinity_dir}/*U.qtrim.fq {params.trimmomatic_out}/forward.trimmomatic.unpaired.fastq
            mv {params.trinity_dir}/*U.qtrim.fq {params.trimmomatic_out}/reverse.trimmomatic.unpaired.fastq
        """)
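One more thing I noticed while debugging: because the pattern also contains the {type} wildcard, I believe every library/run pair comes back twice from glob_wildcards, once per read direction. Below is a rough plain-Python sketch of how the pairs could be reduced to unique ones before being passed on. The lists are again assumed values rather than real output, and UNIQUE_LIBRARY and UNIQUE_FASTQ are names I made up for this example.

```python
# Assumed glob_wildcards result for the layout above (one entry per file).
LIBRARY = ["MMETSP1", "MMETSP1", "MMETSP2", "MMETSP2"]
FASTQ = ["SRR19", "SRR19", "SRR20", "SRR20"]
SENS = ["1", "2", "1", "2"]

# Keep each (library, run) pair once; sorting gives a reproducible order,
# since the order in which files are discovered may vary.
pairs = sorted(set(zip(LIBRARY, FASTQ)))

# Split back into two aligned lists that can be walked position by position.
UNIQUE_LIBRARY = [lib for lib, _ in pairs]
UNIQUE_FASTQ = [run for _, run in pairs]

print(pairs)
```

I am not sure whether this is the intended way to do it in Snakemake, or whether there is a cleaner built-in mechanism.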
Can anyone help me?