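I want to repeat a step in a Snakemake workflow until some condition is met. The number of repetitions cannot be known in advance: it could be 1, or 6, or any other number.
My gut feeling is that Snakemake cannot do this, what with the directed *acyclic* graph and all...
Still, I was hoping that checkpoints might help, since they trigger a re-evaluation of the DAG, but I can't quite work out how that would function.
Is a loop possible in a Snakefile?
Thanks!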
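Adding some commentary on what actually happens in the excellent answer below. Hopefully it helps others, and myself when I inevitably revisit this question.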
all: call function all_input to determine the rule's input requirements.
all_input: file "succes.txt" doesn't exist. Request checkpoint keep_trying with i == 1.
keep_trying: output "round_1" doesn't exist. Execute the run section. random() does not hit this time, so only output[0], which is "round_1", is touched.
Snakemake re-evaluates the graph after the checkpoint completes.
all: call function all_input to determine the rule's input requirements.
all_input: file "succes.txt" doesn't exist. Request checkpoint keep_trying with i == 2.
keep_trying: output "round_2" doesn't exist. Execute the run section. random() does not hit this time, so only output[0], which is "round_2", is touched.
Snakemake re-evaluates the graph after the checkpoint completes.
all: call function all_input to determine the rule's input requirements.
all_input: file "succes.txt" doesn't exist. Request checkpoint keep_trying with i == 3.
keep_trying: output "round_3" doesn't exist. Execute the run section. random() hits, so "succes.txt" is touched, along with output[0], which is "round_3".
Snakemake re-evaluates the graph after the checkpoint completes.
all: call function all_input to determine the rule's input requirements.
all_input: file "succes.txt" exists. Return "succes.txt" to rule all.
all: input requirement is "succes.txt", which is now satisfied.
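The trace above can be mimicked in plain Python to see the control flow without Snakemake in the picture. This is only a rough sketch: `simulate`, the `files` set standing in for the working directory, and the `max_rounds` safety cap are all hypothetical names introduced for illustration, and the `while` loop merely imitates Snakemake calling `all_input` again after each checkpoint.

```python
import random

def simulate(seed=0, max_rounds=1000):
    """Plain-Python imitation of the trace; not Snakemake code."""
    rng = random.Random(seed)
    files = set()   # hypothetical stand-in for the working directory
    tries = 0
    # all_input: keep requesting rounds until the final target exists
    while "succes.txt" not in files:
        if tries >= max_rounds:   # safety cap, an assumption of this sketch
            raise RuntimeError("no success within max_rounds")
        tries += 1
        # keep_trying with i == tries: 10% chance of creating the target
        if rng.random() > 0.9:
            files.add("succes.txt")
        # the round's own output is always created
        files.add(f"round_{tries}")
    return tries, files

if __name__ == "__main__":
    tries, files = simulate(seed=1)
    print(f"finished after {tries} round(s)")
```

Each pass through the loop corresponds to one "re-evaluate the graph" step in the trace, and the loop ends as soon as "succes.txt" appears.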
Answer 0 (score: 1)
You are right, you need checkpoints! Here is a small example that does what you want:
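import os
from pathlib import Path

tries = 0  # number of rounds requested so far

def all_input(wildcards):
    global tries
    if not os.path.exists("succes.txt"):
        # Not done yet: request the next round. While that round has not
        # run, get() raises an exception, so Snakemake runs the checkpoint
        # and then re-evaluates this function.
        tries += 1
        checkpoints.keep_trying.get(i=tries)
    else:
        return "succes.txt"

rule all:
    input:
        all_input

checkpoint keep_trying:
    output:
        "round_{i}"
    run:
        import random
        # 10% chance of creating the file that ends the loop
        if random.random() > 0.9:
            Path('succes.txt').touch()
        # the round's own output is always created
        Path(output[0]).touch()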
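Here we say that rule all needs as input whatever the function all_input returns. This function checks whether the file succes.txt already exists. If it does not, it triggers a run of the checkpoint keep_trying, which may create succes.txt (a 10% chance on each round). If succes.txt does exist, it becomes the input of rule all, and Snakemake exits successfully.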
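To get a feel for how many rounds the 10% chance implies, here is a small back-of-the-envelope calculation. It assumes the rounds are independent, so the round count follows a geometric distribution; `prob_done_within` and `expected_rounds` are hypothetical helper names, not part of the answer's Snakefile.

```python
def prob_done_within(n, p=0.1):
    """Probability that succes.txt exists after at most n rounds."""
    return 1 - (1 - p) ** n

def expected_rounds(p=0.1):
    """Mean number of rounds until the first success."""
    return 1 / p

if __name__ == "__main__":
    for n in (1, 6, 10, 50):
        print(n, round(prob_done_within(n), 3))
    print("expected rounds:", expected_rounds())
```

Under that assumption the example needs 10 rounds on average, with a little under a 50% chance of finishing within 6 rounds.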