Snakemake substitution problem

Date: 2018-10-24 10:46:06

Tags: snakemake

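I am new to Snakemake and have a problem with the code below, which should process 9 FASTQ files one after another and run FastQC on each.

smp should take the following values: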

UG1_S12 UG2_S13 UG3_S14 UR1_S1 UR2_S2 UR3_S3 UY1_S6 UY2_S7 UY3_S8

Detecting the samples works when I run:

SAMPLES, = glob_wildcards("reads/merged_s{smp}_L001.fastq.gz")
NB_SAMPLES = len(SAMPLES)

for smp in SAMPLES:
  message("Sample " + smp + " will be processed")
message("N= " + str(NB_SAMPLES))

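The problem is the substitution of {smp}: the job starts with UY2_S7, but in the mv commands {smp} is replaced by UY3_S8.

How can I make sure the same substitution is used in both sub-commands of the same rule?

My current code (inspired by):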

SAMPLES, = glob_wildcards("reads/merged_s{smp}_L001.fastq.gz")

rule all:
    input:
        expand("reads/merged_s{smp}_L001.fastq.gz", smp=SAMPLES),
        "results/multiqc.html"

rule fastqc:
    """
    Run FastQC on each FASTQ file.
    """
    input:
        "reads/merged_s{smp}_L001.fastq.gz"
    output:
        "results/{smp}_fastqc.html",
        "intermediate/{smp}_fastqc.zip"
    version: "1.0"
    shadow: "minimal"
    threads: 8
    shell:
        """
        # Run fastQC and save the output to the current directory
        fastqc {input} -t {threads} -q -d . -o .

        # Move the files which are used in the workflow
        mv merged_s{smp}_L001_fastqc.html {output[0]}
        mv merged_s{smp}_L001_fastqc.zip {output[1]}
        """

Error:

Error in rule fastqc:
    jobid: 0
    output: results/UY2_S7_fastqc.html, intermediate/UY2_S7_fastqc.zip

RuleException:
CalledProcessError in line 60 of Snakefile:
Command ' set -euo pipefail;  
        # Run fastQC and save the output to the current directory
        fastqc reads/merged_sUY2_S7_L001.fastq.gz -t 8 -q -d . -o .

        # Move the files which are used in the workflow
        mv merged_sUY3_S8_L001_fastqc.html results/UY2_S7_fastqc.html
        mv merged_sUY3_S8_L001_fastqc.zip intermediate/UY2_S7_fastqc.zip ' returned non-zero exit status 130.
  File "Snakefile", line 60, in __rule_fastqc
  File "/opt/biotools/miniconda2/envs/snakemake-tutorial/lib/python3.6/concurrent/futures/thread.py", line 56, in run

1 answer:

Answer 0 (score: 2)

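If you want to use a wildcard in a shell command, you have to write it as {wildcards.smp}.
What is probably happening is that the bare {smp} in the shell command picks up the value left over from the last iteration of the for loop above, which is why the mv commands refer to UY3_S8 while the job itself is for UY2_S7. So change: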

shell:
    """
    # Run fastQC and save the output to the current directory
    fastqc {input} -t {threads} -q -d . -o .

    # Move the files which are used in the workflow
    mv merged_s{smp}_L001_fastqc.html {output[0]}
    mv merged_s{smp}_L001_fastqc.zip {output[1]}
    """

to:

shell:
    """
    # Run fastQC and save the output to the current directory
    fastqc {input} -t {threads} -q -d . -o .

    # Move the files which are used in the workflow
    mv merged_s{wildcards.smp}_L001_fastqc.html {output[0]}
    mv merged_s{wildcards.smp}_L001_fastqc.zip {output[1]}
    """

I have not checked the rest of the code.
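The likely mechanism can be reproduced in plain Python, without Snakemake. The sketch below is only an illustration under stated assumptions. The `render` function and the `Wildcards` class are hypothetical stand-ins, not Snakemake's API. It assumes the shell string is formatted against the Snakefile's variables plus the per-job values, which is consistent with the error message in the question.

```python
# Plain-Python sketch of why a bare {smp} can pick up the wrong sample.
# `render` and `Wildcards` are hypothetical stand-ins, NOT Snakemake's API.

SAMPLES = ["UY2_S7", "UY3_S8"]

# A top-level loop like the one in the question leaves `smp` defined after
# it finishes, bound to the last sample in the list.
for smp in SAMPLES:
    pass

# Assumption: the Snakefile's variables are visible when the shell string
# is formatted. Here that namespace is modelled as an explicit dict.
namespace = {"SAMPLES": SAMPLES, "smp": smp}


class Wildcards:
    """Minimal stand-in for the per-job wildcards object."""

    def __init__(self, **values):
        self.__dict__.update(values)


def render(template, variables, **job_values):
    """Format `template` with job values layered over the variable namespace."""
    return template.format(**{**variables, **job_values})


job = Wildcards(smp="UY2_S7")  # the job that is actually running

# Bare {smp} resolves to the leftover loop variable, not to the job's sample.
leaked = render("mv merged_s{smp}_L001_fastqc.html", namespace, wildcards=job)

# {wildcards.smp} resolves to the value belonging to this job.
fixed = render("mv merged_s{wildcards.smp}_L001_fastqc.html", namespace, wildcards=job)

print(leaked)
print(fixed)
```

If this is indeed the cause, renaming the loop variable or deleting it after the loop (`del smp`) would turn the silent mix-up into an explicit error about an unknown name, but using {wildcards.smp} is the actual fix.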