Snakemake: deciding which rules to run during execution

Date: 2020-10-10 00:58:03

Tags: snakemake

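I am working on a bioinformatics pipeline that must run different rules, and produce different outputs, depending on the contents of an input file: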

def foo(file):
    '''
    Read the file and return a boolean value based on its contents.
    '''
    # Code to read file here...
    return result  # placeholder: the boolean computed by the code above

rule check_input:
  input: "input.txt"
  run:
     bool = foo("input.txt")

rule bool_is_True:
  input: "input.txt"
  output: "out1.txt"
  run:
    # Some code to generate out1.txt. This rule is supposed to run only if foo("input.txt") is true

rule bool_is_False:
  input: "input.txt"
  output: "out2.txt"
  run:
    # Some code to generate out2.txt. This rule is supposed to run only if foo("input.txt") is False

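How should I write the rules to handle this case? Also, how should I write my first rule if the output files are not known until the check_input rule has run?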

Thanks!

2 answers:

Answer 0: (score: 1)

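Yes, Snakemake has to know which files to produce before it executes any rule. I therefore suggest using a function that reads the so-called "input file" and defines the outputs of the workflow accordingly.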

For example:

def getTargetsFromInput():
    targets = list()
    ## read file and add target files to targets
    return targets

rule all:
    input: getTargetsFromInput()

...

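You can define the path to the input file with the --config argument on the snakemake command line, or use a structured input file (YAML, JSON) directly and load it in the Snakefile with the configfile keyword: https://snakemake.readthedocs.io/en/stable/snakefiles/configuration.html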
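A minimal sketch of the config-driven variant, in plain Python so it can be tried outside Snakemake. The helper name `targets_from_config`, the config key `input_file`, and the rule that a first line of `out1` selects `out1.txt` are all assumptions made for illustration; inside a Snakefile, the `config` dict would be filled by `configfile:` or `--config`.

```python
import os
import tempfile


def targets_from_config(config):
    """Return the list of target files for `rule all`.

    `config` stands in for Snakemake's global `config` dict. The key
    "input_file" is a hypothetical name, not a Snakemake built-in.
    """
    path = config.get("input_file", "input.txt")
    with open(path) as handle:
        first_line = handle.readline().strip()
    # Assumed decision rule: a first line of "out1" selects out1.txt,
    # anything else selects out2.txt.
    return ["out1.txt"] if first_line == "out1" else ["out2.txt"]


# In a Snakefile this could be used roughly as follows:
#   configfile: "config.yaml"   # or: snakemake --config input_file=input.txt
#   rule all:
#       input: targets_from_config(config)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "input.txt")
        with open(path, "w") as handle:
            handle.write("out1\n")
        print(targets_from_config({"input_file": path}))
```

Passing the path through `config` keeps the Snakefile unchanged when the input file moves; only the command line or the config file needs editing.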

Answer 1: (score: 1)

Thanks, Eric. I got it working with this:

def getTargetsFromInput(file):
    with open(file) as f:
        line = f.readline()
        if line.strip() == "out1":
            return "out1.txt"
        else:
            return "out2.txt"

rule all:
    input: getTargetsFromInput("input.txt")

rule out1:
    input: "input.txt"
    output: "out1.txt"
    shell: "echo 'out1' > {output}"

rule out2:
    input: "input.txt"
    output: "out2.txt"
    shell: "echo 'out2' > {output}"
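
Note that getTargetsFromInput("input.txt") is called while the Snakefile is being parsed, so input.txt must already exist before Snakemake starts. The selection logic can be checked on its own in plain Python. The sketch below copies the function and runs it against temporary files; the `check` helper and the file contents are made up for this test.

```python
import os
import tempfile


def getTargetsFromInput(file):
    # Same logic as in the answer: the first line selects the target.
    with open(file) as f:
        line = f.readline()
    if line.strip() == "out1":
        return "out1.txt"
    return "out2.txt"


def check(contents):
    """Write `contents` to a temporary file and return the chosen target."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "input.txt")
        with open(path, "w") as f:
            f.write(contents)
        return getTargetsFromInput(path)


if __name__ == "__main__":
    print(check("out1\n"))      # first line is "out1"
    print(check("anything\n"))  # any other first line
```

An empty file also falls through to the second branch, because only an exact first line of "out1" selects out1.txt.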