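I am trying to use snakemake to download files from a website and then concatenate them. However, I always get this error:
Waiting at most 5 seconds for missing files.
MissingOutputException in line 24 of /path/to/Snakefile:
Why doesn't snakemake just wait for the files to download before trying to continue? It would be inconvenient to put all of the reads in different directories, and I don't want to bother with a config file, since this is a one-off Snakefile.
Thanks!
Here is my script: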
import os

rule all:
    input:
        "ONT/yeastONT_combined.fastq.gz",
        "trimmed/ERR1938684_1.trim.final.fastq.gz",
        "trimmed/ERR1938684_2.trim.final.fastq.gz",
        "trimmed/ERR1938684_1.trim.unpaired.fastq.gz",
        "trimmed/ERR1938684_2.trim.unpaired.fastq.gz"

rule getONTfwd:
    input:
    output:
        "ONT/ERR1883385_1.fastq.gz",
        "ONT/ERR1883386_1.fastq.gz",
        "ONT/ERR1883387_1.fastq.gz",
        "ONT/ERR1883393_1.fastq.gz",
        "ONT/ERR1883395_1.fastq.gz",
        "ONT/ERR1883396_1.fastq.gz"
    shell:
        """cd ONT \
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/005/ERR1883385/ERR1883385_1.fastq.gz' \
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/006/ERR1883386/ERR1883386_1.fastq.gz' \
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/007/ERR1883387/ERR1883387_1.fastq.gz' \
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/003/ERR1883393/ERR1883393_1.fastq.gz' \
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/005/ERR1883395/ERR1883395_1.fastq.gz' \
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/006/ERR1883396/ERR1883396_1.fastq.gz' \
        sleep 300 \
        cd .."""

rule combine_ONT:
    input:
        f1 = "ONT/ERR1883385_1.fastq.gz",
        f2 = "ONT/ERR1883386_1.fastq.gz",
        f3 = "ONT/ERR1883387_1.fastq.gz",
        f4 = "ONT/ERR1883393_1.fastq.gz",
        f5 = "ONT/ERR1883395_1.fastq.gz",
        f6 = "ONT/ERR1883396_1.fastq.gz"
    output:
        "ONT/yeastONT_combined.fastq.gz"
    shell:
        """cat {input.f1} {input.f2} {input.f3} {input.f4} {input.f5} {input.f6} > {output}"""
Answer 0 (score: 1)
There is a syntax error in the shell command of rule getONTfwd: you escape every newline with a backslash (\), so the whole block is treated as one single command instead of a sequence of commands. Either remove the trailing \ characters, or put a semicolon before each one (that is, ; \) to separate the commands.
Also, if you are using sleep 300 only as buffer time for the downloads to finish, it is not needed, because wget exits only after the file from the URL has been downloaded. And, as mentioned in the comment by Johannes, the rule for the trimmed/*.fastq.gz files is missing from your example script.
Here is an edited version of your example that should work as expected:
import os

rule all:
    input:
        "ONT/yeastONT_combined.fastq.gz",
        # The trimmed/* targets are commented out because no rule in this
        # example produces them.
        # "trimmed/ERR1938684_1.trim.final.fastq.gz",
        # "trimmed/ERR1938684_2.trim.final.fastq.gz",
        # "trimmed/ERR1938684_1.trim.unpaired.fastq.gz",
        # "trimmed/ERR1938684_2.trim.unpaired.fastq.gz"

rule getONTfwd:
    output:
        "ONT/ERR1883385_1.fastq.gz",
        "ONT/ERR1883386_1.fastq.gz",
        "ONT/ERR1883387_1.fastq.gz",
        "ONT/ERR1883393_1.fastq.gz",
        "ONT/ERR1883395_1.fastq.gz",
        "ONT/ERR1883396_1.fastq.gz"
    # No trailing backslashes: each line runs as its own shell command.
    shell:
        """cd ONT
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/005/ERR1883385/ERR1883385_1.fastq.gz'
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/006/ERR1883386/ERR1883386_1.fastq.gz'
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/007/ERR1883387/ERR1883387_1.fastq.gz'
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/003/ERR1883393/ERR1883393_1.fastq.gz'
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/005/ERR1883395/ERR1883395_1.fastq.gz'
        wget 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR188/006/ERR1883396/ERR1883396_1.fastq.gz'
        cd .."""

rule combine_ONT:
    input:
        f1 = "ONT/ERR1883385_1.fastq.gz",
        f2 = "ONT/ERR1883386_1.fastq.gz",
        f3 = "ONT/ERR1883387_1.fastq.gz",
        f4 = "ONT/ERR1883393_1.fastq.gz",
        f5 = "ONT/ERR1883395_1.fastq.gz",
        f6 = "ONT/ERR1883396_1.fastq.gz"
    output:
        "ONT/yeastONT_combined.fastq.gz"
    shell:
        """cat {input.f1} {input.f2} {input.f3} {input.f4} {input.f5} {input.f6} > {output}"""
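To make the effect of the trailing backslashes concrete, here is a small Python sketch. It assumes that Snakemake reads the shell block as an ordinary Python string literal, where a backslash followed by a newline is dropped (the shell would join such lines in the same way if the characters reached it). The example.org URLs, the variable names, and the count_commands helper are placeholders for illustration and are not taken from the question.

```python
# Assumption: placeholder URLs stand in for the real FTP links.

# Version 1: every line ends in a backslash, as in the question.
escaped = """cd ONT \
wget 'ftp://example.org/a.fastq.gz' \
wget 'ftp://example.org/b.fastq.gz' \
cd .."""

# Version 2: backslashes removed, so each command keeps its own line.
separate = """cd ONT
wget 'ftp://example.org/a.fastq.gz'
wget 'ftp://example.org/b.fastq.gz'
cd .."""

# Version 3: backslashes kept, but a semicolon ends each command first.
with_semicolons = """cd ONT ; \
wget 'ftp://example.org/a.fastq.gz' ; \
wget 'ftp://example.org/b.fastq.gz' ; \
cd .."""


def count_commands(script: str) -> int:
    """Naively count commands, treating newlines and semicolons as separators.

    This ignores shell quoting rules; it is only good enough for these strings.
    """
    pieces = script.replace(";", "\n").splitlines()
    return sum(1 for piece in pieces if piece.strip())


if __name__ == "__main__":
    for name, script in [
        ("escaped", escaped),
        ("separate", separate),
        ("with_semicolons", with_semicolons),
    ]:
        print(name, count_commands(script))
```

The escaped version collapses into a single line, so the shell sees one cd command with many extra arguments and no wget is ever run. The other two versions each yield four separate commands.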
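The point about sleep 300 can be checked in a similar way. A foreground command such as wget returns control only once it has finished, so the next line of a shell block cannot start early. The sketch below uses a short-lived Python child process as a stand-in for wget (an assumption, made so the example runs without network access); the helper name run_and_time is hypothetical.

```python
import subprocess
import sys
import time


def run_and_time(seconds: float) -> float:
    """Run a child process that takes `seconds` to finish; return the time waited."""
    # Stand-in for a slow download: the child simply sleeps, then exits.
    child = [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    start = time.monotonic()
    # subprocess.run blocks until the child exits; check=True raises on failure.
    subprocess.run(child, check=True)
    return time.monotonic() - start


if __name__ == "__main__":
    waited = run_and_time(0.5)
    print(f"child finished after {waited:.2f} s")
```

The caller regains control only after the child has exited. In the same way, the line after a wget in a shell block runs only once that download has returned, so an extra sleep adds delay without adding safety.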