Snakemake: zip option in expand does not work

Asked: 2018-07-31 09:24:42

Tags: python snakemake

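I am using expand and glob_wildcards in my Snakemake script. My libraries directory contains two subdirectories, each holding forward and reverse FASTQ files, structured as follows: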

libraries:
    - MMETSP1:
        - SRR19_1
        - SRR19_2
    - MMETSP2:
        - SRR20_1
        - SRR20_2

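I am using the zip option of expand to get all possible combinations across the two directories. But what I want is just a comma-separated list of the forward files (e.g. SRR19_1,SRR20_1), and likewise for the reverse files (e.g. SRR19_2,SRR20_2).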
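
To make the two combination modes concrete, here is a minimal plain-Python sketch. It does not call Snakemake; it assumes that expand combines wildcard values like itertools.product by default and like the built-in zip when zip is passed. The file names are the shortened ones from the tree above, the PATTERN string and the fill helper are hypothetical, and the duplicated list entries mimic glob_wildcards reporting one entry per file found.

```python
from itertools import product

# Values as glob_wildcards would report them: one entry per file found,
# so each (library, run) pair appears once for _1 and once for _2.
LIBRARY = ["MMETSP1", "MMETSP1", "MMETSP2", "MMETSP2"]
FASTQ = ["SRR19", "SRR19", "SRR20", "SRR20"]

# Hypothetical path pattern, used only for this illustration.
PATTERN = "libraries/{mmetsp}/{reads}_{sens}.fastq.gz"


def fill(pairs, sens):
    """Format PATTERN for each unique (mmetsp, reads) pair, keeping order."""
    unique_pairs = list(dict.fromkeys(pairs))  # drop duplicates, keep order
    return [PATTERN.format(mmetsp=m, reads=r, sens=sens) for m, r in unique_pairs]


# Product behaviour: every library is combined with every run.
all_combinations = fill(product(LIBRARY, FASTQ), sens=1)

# zip behaviour: the i-th library is paired with the i-th run only.
forward = fill(zip(LIBRARY, FASTQ), sens=1)
reverse = fill(zip(LIBRARY, FASTQ), sens=2)

# Comma-separated lists, as wanted for the forward and reverse reads.
forward_csv = ",".join(forward)
reverse_csv = ",".join(reverse)
print(forward_csv)
print(reverse_csv)
```

In a Snakefile the zip mode is requested by passing zip as the second argument, for example expand(pattern, zip, mmetsp=LIBRARY, reads=FASTQ). Because each pair is reported twice here, removing duplicates before joining keeps each file from being listed twice.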

When I remove the zip option, I get this error:

MissingInputException in line 159 of /projet/fr2424/sib/gniang/script/snakemake/scripts/dntap_test.py:
Missing input files for rule trinity:
/projet/fr2424/sib/gniang/script/snakemake/libraries/MMESTP1/SRR1300320_1.fastq.gz
/projet/fr2424/sib/gniang/script/snakemake/libraries/MMESTP1/SRR1300320_2.fastq.gz
/projet/fr2424/sib/gniang/script/snakemake/libraries/MMETSP2/SRR1300319_2.fastq.gz
/projet/fr2424/sib/gniang/script/snakemake/libraries/MMETSP2/SRR1300319_1.fastq.gz
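
The four paths in this message look like what a cross product of library names and run names would request. The sketch below reproduces that pattern in plain Python. The on-disk layout in it is an assumption inferred from the message (SRR1300319 under MMESTP1, SRR1300320 under MMETSP2), not something the post states, and the paths are shortened to be relative to the libraries directory.

```python
from itertools import product

# Assumed on-disk layout (inferred from the error message, not confirmed).
existing = {
    "MMESTP1/SRR1300319_1.fastq.gz",
    "MMESTP1/SRR1300319_2.fastq.gz",
    "MMETSP2/SRR1300320_1.fastq.gz",
    "MMETSP2/SRR1300320_2.fastq.gz",
}

libraries = ["MMESTP1", "MMETSP2"]
runs = ["SRR1300319", "SRR1300320"]

# Cross product: each library is combined with each run and each read direction.
requested = [
    f"{lib}/{run}_{sens}.fastq.gz"
    for lib, run in product(libraries, runs)
    for sens in (1, 2)
]

# Paths that were requested but do not exist in the assumed layout.
missing = sorted(path for path in requested if path not in existing)
for path in missing:
    print(path)
```

Under that assumption the missing paths are exactly the mismatched library/run combinations, which is the same set of four files named in the error.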

I am using this script:

#!/usr/bin/python

#  Imports
import os
import glob
import sys

# Get current working directory
dir_path = os.getcwd()

# User defined output directory
OUT_DIR = config["directories"]["outdir"]

#  User defined input directory
LIBRARY_DIR = config["directories"]["libraries"]

# Relative output directories
TRINITY_DIR = OUT_DIR + "trinity_out"

# Wildcards for input files
(LIBRARY, FASTQ, SENS) = glob_wildcards(LIBRARY_DIR + "{mmetsp}/{reads}_{type}.fastq.gz")


# ALL
rule all:
    input:
        trinity_out = TRINITY_DIR + "/Trinity.fasta",
        forward = directory(TRIMMOMATIC_DIR + "/forward.trimmomatic.paired.fastq"),
        reverse = directory(TRIMMOMATIC_DIR + "/reverse.trimmomatic.paired.fastq")

# TRINITY: This rule is used to de novo assemble filtered FASTQ files into
# contigs.
rule trinity:
    input:
        forward = expand(LIBRARY_DIR + "{mmetsp}" + "/" + "{reads}_1.fastq.gz", mmetsp=LIBRARY, reads=FASTQ),
        reverse = expand(LIBRARY_DIR + "{mmetsp}" + "/" + "{reads}_2.fastq.gz", mmetsp=LIBRARY, reads=FASTQ),
    output:
        trinity_out = TRINITY_DIR + "/Trinity.fasta",
        forward = directory(TRIMMOMATIC_DIR + "/forward.trimmomatic.paired.fastq"),
        reverse = directory(TRIMMOMATIC_DIR + "/reverse.trimmomatic.paired.fastq"),
    log:
        OUT_DIR + "logs/trinity/trinity.log"
    params:
        max_memory = config["trinity_params"]["max_memory"],
        trinity_dir = directory(TRINITY_DIR),
        trimmomatic_out = directory(TRIMMOMATIC_DIR),
        trimmomatic_params = config["trimmomatic_params"],
        illumina_adaptor = config["samples"]["illumina_adaptor"],
    threads:
        config["threads"]["trinity"]
    run:

        shell("""

        {trinity} \
        --seqType fq \
        --left {input.forward}  \
        --right {input.reverse} \
        --trimmomatic \
        --quality_trimming_params "ILLUMINACLIP:{params.illumina_adaptor}:2:30:10 {params.trimmomatic_params}" \
        --normalize_reads \
        --normalize_by_read_set \
        --output {params.trinity_dir} \
        --CPU {threads} \
        --max_memory {params.max_memory} > {log}

        mkdir -p {params.trimmomatic_out}
        mkdir -p {params.trimmomatic_out}/forward.trimmomatic.paired.fastq
        mkdir -p {params.trimmomatic_out}/reverse.trimmomatic.paired.fastq
        mkdir -p {params.trimmomatic_out}/forward.trimmomatic.unpaired.fastq
        mkdir -p {params.trimmomatic_out}/reverse.trimmomatic.unpaired.fastq

        mv {params.trinity_dir}/*P.qtrim.gz {output.forward}
        mv {params.trinity_dir}/*P.qtrim.gz {output.reverse}
        mv {params.trinity_dir}/*U.qtrim.fq {params.trimmomatic_out}/forward.trimmomatic.unpaired.fastq
        mv {params.trinity_dir}/*U.qtrim.fq {params.trimmomatic_out}/reverse.trimmomatic.unpaired.fastq
        """)

Can anyone help me?
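
A note on the shell command in the script above: a list-valued {input.forward} is rendered with spaces between the file names, while Trinity's --left and --right options are documented as taking comma-separated lists. The joining step can be sketched in plain Python as follows. The helper name trinity_read_args is hypothetical and the paths are the shortened placeholders from the directory tree.

```python
import shlex


def trinity_read_args(forward, reverse):
    """Return the --left/--right part of a Trinity command as one string."""
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse lists must have the same length")
    left = ",".join(forward)
    right = ",".join(reverse)
    # shlex.quote guards against spaces or shell metacharacters in paths.
    return f"--left {shlex.quote(left)} --right {shlex.quote(right)}"


args = trinity_read_args(
    ["libraries/MMETSP1/SRR19_1.fastq.gz", "libraries/MMETSP2/SRR20_1.fastq.gz"],
    ["libraries/MMETSP1/SRR19_2.fastq.gz", "libraries/MMETSP2/SRR20_2.fastq.gz"],
)
print(args)
```

Inside a rule, one possible way to get the same effect is to compute the joined strings in params with a function of input, for example lambda wildcards, input: ",".join(input.forward), and to reference that parameter in the shell command instead of {input.forward}.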

0 answers:

No answers