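How to code a paired analysis in Snakemake

Date: 2017-09-19 21:56:51

Tags: bioinformatics snakemake

I want to run GATK recalibration on paired samples (tumor and normal). I need to parse the data with pandas. This is what I came up with: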

expand("mapped_reads/merged_samples/{sample[1][tumor]}/{sample[1][tumor]}_{sample[1][normal]}.bam", sample=read_table(config["conditions"], ",").iterrows())

This is the conditions file:

432,433
434,435

I wrote this rule:

rule gatk_RealignerTargetCreator:
    input:
        "mapped_reads/merged_samples/{tumor}.sorted.dup.reca.bam",
        "mapped_reads/merged_samples/{normal}.sorted.dup.reca.bam",
    output:
        "mapped_reads/merged_samples/{tumor}/{tumor}_{normal}.realign.intervals"
    params:
        genome=config['reference']['genome_fasta'],
        mills= config['mills'],
        ph1_indels= config['know_phy'],
    log:
        "mapped_reads/merged_samples/logs/{tumor}_{normal}.realign_info.log"
    threads: 8
    shell:
        "gatk -T RealignerTargetCreator -R {params.genome} {params.custom} "
        "-nt {threads} "
        "-I {input[0]} -I {input[1]} -known {params.ph1_indels} "
        "-o {output} >& {log}"

I get this error:

InputFunctionException in line 17 of /home/maurizio/Desktop/TEST_exome/rules/samfiles.rules:
KeyError: '432/432_433'
Wildcards:
sample=432/432_433

This is samfiles.rules:

rule samtools_merge_bam:
    """
    Merge bam files for multiple units into one for the given sample.
    If the sample has only one unit, files will be copied.
    """
    input:
        lambda wildcards: expand("mapped_reads/bam/{unit}_sorted.bam",unit=config["samples"][wildcards.sample])
    output:
        "mapped_reads/merged_samples/{sample}.bam"
    benchmark:
        "benchmarks/samtools/merge/{sample}.txt"
    run:
        if len(input) > 1:
            shell("/illumina/software/PROG2/samtools-1.3.1/samtools merge {output} {input}")
        else:
            shell("cp {input} {output} && touch -h {output}")

1 Answer:

Answer 0 (score: 1):

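I can only guess, since you don't show all the relevant rules, but I would say the error occurs because the rule samtools_merge_bam also applies to some later bam file that follows the pattern {tumor}/{tumor}_{normal}... The {sample} wildcard then takes the value 432/432_433, which is not a key in config["samples"], hence the KeyError.

As a solution, you have to resolve this ambiguity (see the Snakemake tutorial). For example, you can constrain the wildcards of samtools_merge_bam so that they cannot contain any slashes:

wildcard_constraints:
    sample="[^/]+"

You can put the constraint globally or in the rule.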
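To see why the constraint removes the ambiguity, here is a minimal Python sketch of the matching step. It is not Snakemake's actual implementation: `pattern_to_regex` is a hypothetical helper that only mimics the documented behavior that a wildcard matches the regex `.+` unless a constraint is given. The requested path is the one the `expand(...)` call in the question would produce for the pair 432,433.

```python
import re


def pattern_to_regex(pattern, constraints=None):
    """Compile a Snakemake-style output pattern into a regex.

    Hypothetical helper for illustration only: every {name} becomes a
    named group using constraints[name] if present, otherwise ".+".
    """
    constraints = constraints or {}
    parts = []
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        # Literal text between wildcards is escaped so "." stays a dot.
        parts.append(re.escape(pattern[pos:m.start()]))
        name = m.group(1)
        parts.append("(?P<%s>%s)" % (name, constraints.get(name, ".+")))
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


# Output pattern of samtools_merge_bam, taken from the question.
MERGE_OUTPUT = "mapped_reads/merged_samples/{sample}.bam"
# Paired file requested for tumor 432 and normal 433.
requested = "mapped_reads/merged_samples/432/432_433.bam"

# Without a constraint the rule matches and {sample} swallows the slash.
unconstrained = pattern_to_regex(MERGE_OUTPUT).fullmatch(requested)

# With sample="[^/]+" the paired file no longer matches the rule ...
strict = pattern_to_regex(MERGE_OUTPUT, {"sample": "[^/]+"})
constrained = strict.fullmatch(requested)

# ... while a plain per-sample file still does.
plain = strict.fullmatch("mapped_reads/merged_samples/432.bam")

print(unconstrained.group("sample"))
print(constrained)
print(plain.group("sample"))
```

In this sketch the unconstrained match binds `sample` to `432/432_433`, the same value shown under `Wildcards:` in the error message, whereas the constrained pattern rejects the paired file and still accepts `432.bam`.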