snakemake: the last rule is lost after changing the input lines of rule all

Time: 2019-09-23 10:21:55

Tags: snakemake

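Below is my snakemake code. If I do not comment out lines 28 and 29 of the code (the first and second lines under rule all -> input), I cannot get the last rule, varscan_somatic. The dry-run output is as follows: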

Job counts:
        count   jobs
        1       all
        25      mpileup_analysis
        25      normal_mpileup
        25      tumor_mpileup
        76
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.

But if I do comment out lines 28 and 29 (the first and second lines under rule all -> input), I do get the last rule, varscan_somatic. The dry-run output is then:

Job counts:
        count   jobs
        1       all
        25      mpileup_analysis
        25      normal_mpileup
        25      tumor_mpileup
        25      varscan_somatic
        101
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.

I do not understand why this happens. Can anyone give me some advice? Thank you very much for your help.

import re
import os

mydict = dict()
with open("config.txt") as HD:
    for line in HD:
        line = line.rstrip()
        if line.startswith("#"):
            continue
        value,field = re.split(r"\s*=\s*",line)
        mydict[value] = field

VarScan    = mydict['VarScan']
SAMtools   = mydict['SAMtools']
REFERENCE  = mydict['REFERENCE']
PL_MPPANA  = mydict['MPPANA']

chrlist = [i for i in os.listdir("call_region") if i.endswith(".region.bed")]
def replace(str1):
    str2 = str1.replace(".region.bed","")
    return str2
chrlist = map(replace,chrlist)

configfile:"paired_test.yaml"

rule all:
    input:
#        expand("varscan_somatic/{sample}/{chrid}.normal.mpileup.analysis",sample=config['samples'],chrid=chrlist),
#        expand("varscan_somatic/{sample}/{chrid}.tumor.mpileup.analysis",sample=config['samples'],chrid=chrlist),
        expand("varscan_somatic/{sample}/{chrid}.snp.vcf",sample=config['samples'],chrid=chrlist),
        expand("varscan_somatic/{sample}/{chrid}.indel.vcf",sample=config['samples'],chrid=chrlist)

rule normal_mpileup:
    input:
        bam=lambda wc:config['samples'][wc.sample][3],
        bed="call_region/{chrid}.region.bed"
    output:
        "varscan_somatic/{sample}/{chrid}.normal.mpileup"
    log:
        "log/varscan_somatic/{sample}/{chrid}.normal.mpileup.log"
    shell:
        "{SAMtools} mpileup -f {REFERENCE} -l {input.bed} "
        "{input.bam} -d1000 -Q10 -q10 -o {output} "
        "1>{log} 2>&1"

rule tumor_mpileup:
    input:
        bam=lambda wc:config['samples'][wc.sample][1],
        bed="call_region/{chrid}.region.bed"
    output:
        "varscan_somatic/{sample}/{chrid}.tumor.mpileup"
    log:
        "log/varscan_somatic/{sample}/{chrid}.tumor.mpileup.log"
    shell:
        "{SAMtools} mpileup -f {REFERENCE} -l {input.bed} "
        "{input.bam} -d1000 -Q10 -q10 -o {output} "
        "1>{log} 2>&1"

rule mpileup_analysis:
    input:
        tumor="varscan_somatic/{sample}/{chrid}.tumor.mpileup",
        norml="varscan_somatic/{sample}/{chrid}.normal.mpileup"
    output:
        tumor="varscan_somatic/{sample}/{chrid}.tumor.mpileup.analysis",
        norml="varscan_somatic/{sample}/{chrid}.normal.mpileup.analysis"
    log:
        "log/varscan_somatic/{sample}/{chrid}.mpileup_analysis.log"
    shell:
        "{PL_MPPANA} {input.tumor} {output.tumor} 1>{log} 2>&1 "
        "&& {PL_MPPANA} {input.norml} {output.norml} 1>>{log} 2>&1"

rule varscan_somatic:
    input:
        tumor="varscan_somatic/{sample}/{chrid}.tumor.mpileup",
        norml="varscan_somatic/{sample}/{chrid}.normal.mpileup",
        temp1="varscan_somatic/{sample}/{chrid}.tumor.mpileup.analysis",
        temp2="varscan_somatic/{sample}/{chrid}.normal.mpileup.analysis"
    output:
        "varscan_somatic/{sample}/{chrid}.snp",
        "varscan_somatic/{sample}/{chrid}.indel"
    log:
        "log/varscan_somatic/{sample}/{chrid}.varscan_somatic.log"
    params:
        "varscan_somatic/{sample}/{chrid}",
        "--validation 1 --output-vcf 1"
    shell:
        "{VarScan} somatic {input.tumor} {input.norml} {params} 1>{log} 2>&1"
>>>config.txt
VarScan = /path/my/varscan
REFERENCE = /path/my/hg19.fa
SAMtools = /path/my/samtools
MPPANA = /path/my/mppana.pl
>>> paired.yaml
samples:
    S01: ['S01','S01.bqsr.bam','S02','S02.bqsr.bam']

Below is the snakemake summary with those lines not commented out:

snakemake -s bin/varscan_somatic_paired.py -np --forceall --summary
Building DAG of jobs...
output_file     date    rule    version log-file(s)     status  plan
varscan_somatic/S001/chr21.tumor.mpileup.analysis       Thu Sep 26 02:34:56 2019        mpileup_analysis        -       log/varscan_somatic/S001/chr21.mpileup_analysis.log ok      update pending
varscan_somatic/S001/chr21.normal.mpileup.analysis      Thu Sep 26 02:34:56 2019        mpileup_analysis        -       log/varscan_somatic/S001/chr21.mpileup_analysis.log ok      update pending
varscan_somatic/S001/chr10.tumor.mpileup.analysis       Wed Sep 25 22:56:59 2019        mpileup_analysis        -       log/varscan_somatic/S001/chr10.mpileup_analysis.log ok      update pending

Below is the snakemake summary with those lines commented out:

Building DAG of jobs...
output_file     date    rule    version log-file(s)     status  plan
varscan_somatic/S001/chr19.snp.vcf      Wed Sep 25 22:14:13 2019        varscan_somatic -       log/varscan_somatic/S001/chr19.varscan_somatic.log ok       update pending
varscan_somatic/S001/chr19.indel.vcf    Wed Sep 25 22:14:13 2019        varscan_somatic -       log/varscan_somatic/S001/chr19.varscan_somatic.log ok       update pending
varscan_somatic/S001/chr14.snp.vcf      Thu Sep 26 01:22:17 2019        varscan_somatic -       log/varscan_somatic/S001/chr14.varscan_somatic.log ok       update pending

1 answer:

Answer 0 (score: 0)

This code is wrong. Before the fix:

def replace(str1):
    str2 = str1.replace(".region.bed","")
    return str2
chrlist = map(replace,chrlist)

After the fix:

def replace(str1):
    str2 = str1.replace(".region.bed","")
    return str2
chrlist = list(map(replace,chrlist))

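Then everything works. In Python 3, map() returns a one-shot iterator rather than a list, so the first expand() call that iterates over chrlist exhausts it and every later expand() call receives an empty sequence. Wrapping the result in list() makes chrlist reusable across all expand() calls in rule all.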
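
The effect can be reproduced outside Snakemake with a small sketch. Note that expand_like below is a hypothetical stand-in written only for this illustration (it is not Snakemake's actual expand() implementation), and the file names are made-up sample values:

```python
from itertools import product


def expand_like(pattern, **wildcards):
    """Hypothetical stand-in for expand(): fill the pattern with every
    combination of the given wildcard values."""
    keys = list(wildcards)
    combos = product(*(wildcards[k] for k in keys))
    return [pattern.format(**dict(zip(keys, combo))) for combo in combos]


beds = ["chr1.region.bed", "chr2.region.bed"]

# Python 3: map() returns a lazy iterator that can be consumed only once.
chr_iter = map(lambda name: name.replace(".region.bed", ""), beds)
first = expand_like("{sample}/{chrid}.tumor.mpileup.analysis",
                    sample=["S01"], chrid=chr_iter)
second = expand_like("{sample}/{chrid}.snp.vcf",
                     sample=["S01"], chrid=chr_iter)
print(first)   # two targets
print(second)  # empty: the iterator was already exhausted

# A real list can be iterated as many times as needed.
chr_list = [name.replace(".region.bed", "") for name in beds]
third = expand_like("{sample}/{chrid}.tumor.mpileup.analysis",
                    sample=["S01"], chrid=chr_list)
fourth = expand_like("{sample}/{chrid}.snp.vcf",
                     sample=["S01"], chrid=chr_list)
print(third)   # two targets
print(fourth)  # two targets
```

This matches the dry-run counts in the question: whichever expand() line comes first in rule all gets all 25 regions, and the lines after it contribute no targets. A list comprehension, as used for chr_list here, is an equivalent alternative to list(map(...)).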