Repeat a Snakemake rule until some condition is met

Time: 2019-12-30 18:51:09

Tags: loops snakemake checkpoint

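I want to use Snakemake for a process in which a step has to be repeated until some condition is met. The number of repetitions cannot be determined in advance. It could be 1, or 6, or any other number.

My gut feeling is that Snakemake can't do this, what with the Directed Acyclic Graph and all...

I was hoping, though, that checkpoints might help, since they trigger a re-evaluation of the DAG, but I can't quite figure out exactly how that would work.

Is it possible to have a loop in a Snakefile?

Thanks!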


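Adding some comments on what is actually happening in the excellent answer below. Hopefully this helps others, and me, when I inevitably revisit this question.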

all:  call function all_input to determine rule's input requirements.
all_input:  file "succes.txt" doesn't exist.  do checkpoint keep_trying with i == 1.     
keep_trying:  output "round_1" doesn't exist.  do run section.  random() decides to touch output[0], which is "round_1".

snakemake reevaluates graph after checkpoint is complete

all:  call function all_input to determine rule's input requirements.
all_input:  file "succes.txt" doesn't exist.  do checkpoint keep_trying with i == 2.
keep_trying:   output "round_2" doesn't exist.  do run section.  random() decides to touch output[0], which is "round_2".

snakemake reevaluates graph after checkpoint is complete

all:  call function all_input to determine rule's input requirements.
all_input:  file "succes.txt" doesn't exist.  do checkpoint keep_trying with i == 3.
keep_trying:  output "round_3" doesn't exist.  do run section.  random() decides to touch "succes.txt".

snakemake reevaluates graph after checkpoint is complete

all:  call function all_input to determine rule's input requirements.
all_input:  file "succes.txt" exists.  return "succes.txt" to rule all.
all:  input requirement is "succes.txt", which is now satisfied.
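
The sequence traced above can be mimicked in plain Python, which may make the control flow easier to follow. This is only a sketch and not how Snakemake is implemented: `CheckpointPending`, `run_until_success`, the `state` dictionary and the `max_rounds` limit are hypothetical names introduced for illustration, and the loop stands in for the DAG re-evaluation that Snakemake performs after each checkpoint.

```python
import random
import tempfile
from pathlib import Path


class CheckpointPending(Exception):
    """Hypothetical stand-in for the signal that a checkpoint still has to run."""

    def __init__(self, i):
        super().__init__(f"round_{i} not produced yet")
        self.i = i


def all_input(workdir, state):
    """Mimic the input function: ask for another round until succes.txt exists."""
    if not (workdir / "succes.txt").exists():
        state["tries"] += 1
        raise CheckpointPending(state["tries"])
    return "succes.txt"


def keep_trying(workdir, i, rng):
    """Mimic the checkpoint body: always write round_{i}, sometimes succes.txt."""
    if rng.random() > 0.9:
        (workdir / "succes.txt").touch()
    (workdir / f"round_{i}").touch()


def run_until_success(workdir, rng, max_rounds=1000):
    """Re-evaluate all_input after every round; return the number of rounds run."""
    state = {"tries": 0}
    for _ in range(max_rounds):
        try:
            all_input(workdir, state)
            return state["tries"]
        except CheckpointPending as pending:
            keep_trying(workdir, pending.i, rng)
    raise RuntimeError("no success within max_rounds")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rounds = run_until_success(Path(tmp), random.Random())
        print(f"succeeded after {rounds} round(s)")
```

The point of the sketch is that there is never a cycle in the graph: each round is a new, distinct target (round_1, round_2, ...), and the "loop" only exists because the input function is asked again after every round.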

1 answer:

Answer 0 (score: 1)

You are right, you need checkpoints! Here is a small example that does what you want:

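import os
from pathlib import Path


# Number of rounds requested so far; incremented each time all_input
# finds that succes.txt is still missing.
tries = 0
def all_input(wildcards):
    global tries
    if not os.path.exists("succes.txt"):
        # Not done yet: request the next round from the checkpoint.
        # While round_{tries} has not been produced, get() raises an
        # exception, so Snakemake runs the checkpoint and then
        # re-evaluates this function.
        tries += 1
        checkpoints.keep_trying.get(i=tries)
    else:
        # Condition met: succes.txt is all that rule all needs.
        return "succes.txt"


rule all:
    input:
        all_input


checkpoint keep_trying:
    output:
        "round_{i}"
    run:
        import random
        # Roughly 10% of the rounds also create succes.txt.
        if random.random() > 0.9:
            Path('succes.txt').touch()
        # The round's own output is always created.
        Path(output[0]).touch()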

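Here we say that rule all needs as input whatever the function all_input returns. This function checks whether the file succes.txt already exists. If it does not, it triggers a run of the checkpoint keep_trying, which might create the succes.txt file (10% chance). If succes.txt does exist, it becomes the input of rule all, and snakemake exits successfully.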
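
To get a feel for how many rounds to expect with that 10% chance, here is a rough back-of-the-envelope sketch. It assumes every round succeeds independently with the same probability, which holds for the `random.random() > 0.9` toy condition but probably not for a real convergence criterion. The helper names `prob_done_within` and `expected_rounds` are made up for this illustration.

```python
P_SUCCESS = 0.1  # chance that one round creates succes.txt (random.random() > 0.9)


def prob_done_within(n, p=P_SUCCESS):
    """Probability that at least one of the first n rounds succeeds."""
    return 1 - (1 - p) ** n


def expected_rounds(p=P_SUCCESS):
    """Mean of a geometric distribution: average number of rounds until success."""
    return 1 / p


if __name__ == "__main__":
    for n in (1, 6, 10, 50):
        print(n, round(prob_done_within(n), 3))
    print("expected rounds:", expected_rounds())
```

So on average the example needs about ten rounds, but any single run can finish after one round or take many more, which is the "number not known in advance" situation from the question.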