Question

我正在尝试在循环中使用Snakemake规则，以便该规则将上一次迭代的输出作为输入。那有可能吗，如果可以，我该怎么做？

这是我的例子

设置测试数据

mkdir -p test
echo "SampleA" > test/SampleA.txt
echo "SampleB" > test/SampleB.txt

Snakemake

SAMPLES = ["SampleA", "SampleB"]

rule all:
    input:
        # Output of the final loop
        expand("loop3/{sample}.txt", sample = SAMPLES)


#### LOOP ####
for i in list(range(1, 4)):
    # Setup prefix for input
    if i == 1:
        prefix = "test"
    else:
        prefix = "loop%s" % str(i-1)

    # Setup prefix for output
    opref =  "loop%s" % str(i)

    # Rule
    rule loop_rule:
        input:
            prefix+"/{sample}.txt"
        output:
            prefix+"/{sample}.txt"
            #expand("loop{i}/{sample}.txt", i = i, sample = wildcards.sample)
        params:
            add=prefix
        shell:
            "awk '{{print $0, {params.add}}}' {input} > {output}"

尝试运行该示例将产生错误CreateRuleException in line 26 of /Users/fabiangrammes/Desktop/Projects/snake_loop/Snakefile: The name loop_rule is already used by another rule。如果有人发现使该功能正常工作的选择，那就太好了！

谢谢！

Answer 1

我的理解是，您的规则在运行之前已转换为python代码，并且在此过程中按顺序运行了Snakefile中存在的所有原始python代码。可以将其视为您的snakemake规则被评估为python函数。

但是有一个约束，即任何规则只能对一个函数求值一次。

您可以具有if / else表达式，并根据配置值等差异地评估规则（一次），但是不能多次评估规则。

我不太确定如何重写Snakefile来实现所需的功能。有没有一个真实的示例可以给出需要循环构造的地方？

---编辑

对于固定数量的迭代，可以使用输入函数多次运行规则。（不过，我会提醒您不要这样做，请格外小心，以免发生无限循环）

try{
   connectToDatabase();
   }catch(error){
   waitForHalfMinute();
   connectToDatabase();
}

Answer 2

我认为这是使用递归编程的绝佳机会。而不是为每个迭代明确包含条件，而是编写一条从迭代Map到(n-1)过渡的规则。因此，遵循以下原则：

正如@RussHyde所说，您需要积极主动地确保不会触发无限循环。为此，我们确保SAMPLES = ["SampleA", "SampleB"] rule all: input: expand("loop3/{sample}.txt", sample=SAMPLES) def recurse_sample(wcs): n = int(wcs.n) if n == 1: return "test/%s.txt" % wcs.sample elif n > 1: return "loop%d/%s.txt" % (n-1, wcs.sample) else: raise ValueError("loop numbers must be 1 or greater: received %s" % wcs.n) rule loop_n: input: recurse_sample output: "loop{n}/{sample}.txt" wildcard_constraints: sample="[^/]+", n="[0-9]+" shell: """ awk -v loop='loop{wildcards.n}' '{{print $0, loop}}' {input} > {output} """中涵盖所有案例，并使用recurse_sample确保匹配准确。

Snakemake在循环中使用规则

2 个答案: