Snakemake: reading input from a file

Time: 2019-07-24 20:32:53

Tags: snakemake

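I am trying to use a file that is written during the run as the input to another rule, but it always gives me the error FileNotFoundError: [Errno 2] No such file or directory:

Is there a way to fix this, or another way to implement the same logic?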

def vc_list(wildcards):
    my_list = []
    with open(wildcards.mydir+"/file_B.txt", 'r') as data_in:
        for line in data_in:
            my_list.append(line.strip())
    return(my_list)

# rule A will process file_A.txt and give me file_B.txt
rule A:
    input: "{mydir}/file_A.txt"
    output: "{mydir}/file_B.txt"
    shell: "seq 1 5 > {output}"  # assume that `seq 1 5` is the output from processing the file

rule B:
    input: "{value}"
    output: "{value}.vc"
    shell: "pythoncode.py {input} {output}"

# rule C will process file_B.txt to get the list of values used to expand its input, then rely on rule B to produce those files
rule C:
    input:
        processed_file = rules.A.output, #"{mydir}/file_B.txt", 
        my_list = lambda wildcards: expand("{mydir}/{value}.vc", mydir=wildcards.mydir, value=vc_list(wildcards))
    output: "{mydir}/done.txt"
    shell: "touch {output}"
# I always get the error that "{mydir}/file_B.txt" does not exist

The current error:

test_loop.snakefile: FileNotFoundError: [Errno 2] No such file or directory: 'read_file/file_B.txt' Wildcards: mydir=read_file

Thanks

2 answers:

Answer 0 (score: 0)

Your script fails during the pipeline construction phase, before the workflow has even started.

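Rules A and B hold no surprises: Snakemake reads their input and output sections and finds nothing wrong with them. It then starts reading rule C, whose input section calls the vc_list() function, and that function tries to read the file 'read_file/file_B.txt' before the workflow has started. At that point rule A has not run yet, so the file cannot be found and the error is raised.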
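
The timing problem can be reproduced outside Snakemake. The following is a minimal plain-Python sketch, not Snakemake's actual scheduling code: `vc_list` mirrors the question's function (taking the directory directly instead of a wildcards object), while `demo` and the temporary directory are hypothetical stand-ins for the planning phase and for rule A.

```python
import os
import tempfile


def vc_list(mydir):
    """Same logic as the question's input function: read values from file_B.txt."""
    with open(os.path.join(mydir, "file_B.txt")) as data_in:
        return [line.strip() for line in data_in]


def demo():
    """Call vc_list() before and after the file exists (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as mydir:
        # Phase 1: inputs are resolved while the workflow is still being
        # planned, so nothing has produced file_B.txt yet.
        try:
            vc_list(mydir)
            planning_error = None
        except FileNotFoundError as err:
            planning_error = err

        # Phase 2: stand-in for rule A's `seq 1 5 > file_B.txt`.
        with open(os.path.join(mydir, "file_B.txt"), "w") as out:
            out.write("\n".join(str(i) for i in range(1, 6)) + "\n")

        # Only now can the list of values be read.
        return planning_error, vc_list(mydir)


planning_error, values = demo()
print(type(planning_error).__name__)
print(values)
```

The first call fails for the same reason the Snakefile does: the function runs before anything has written the file. Any fix has to delay the read until after rule A has finished.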

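As for what to do, you first need to clarify the task. Most likely you are trying to use dynamic information in a rule's input. In that case you need to use either dynamic files or checkpoints.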

Answer 1 (score: 0)

The answer to my question is to use checkpoints, since dynamic is going to be deprecated. The logic can be changed as follows:

    rule:
        input: 'done.txt'

    checkpoint A:
        output: 'B.txt'
        shell: 'seq 1 2 > {output}'


    rule N:
        input: "genome.fa"
        output: '{num}.bam'
        shell: "touch {output}"

    rule B:
        input: '{num}.bam'
        output: '{num}.vc'
        shell: "touch {output}"


    def aggregate_input(wildcards):
        with open(checkpoints.A.get(**wildcards).output[0], 'r') as f:
            return [num.rstrip() + '.vc' for num in f]

    rule C:
        input: aggregate_input
        output: touch('done.txt')
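
The difference from the question's version is that the file is reached through `checkpoints.A.get(...)`, which Snakemake resolves only once checkpoint A has finished, so the `open()` call is deferred until `B.txt` exists. The mapping that `aggregate_input` then performs can be sketched in plain Python. The helper name `targets_from_listing` and the in-memory listing are assumptions for illustration, and the sketch skips blank lines, which the answer's code does not.

```python
import io


def targets_from_listing(handle, suffix=".vc"):
    """Turn each non-empty line of the listing into a target file name."""
    return [line.rstrip() + suffix for line in handle if line.strip()]


# Stand-in for B.txt as written by `seq 1 2`.
listing = io.StringIO("1\n2\n")
targets = targets_from_listing(listing)
print(targets)
```

Each returned name matches rule B's `'{num}.vc'` output pattern, which is how rule C ends up requesting one rule B job (and, through it, one rule N job) per line of `B.txt`.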

Credit goes to 林志强