I have the following rule:
rule run_example:
    input:
        counts=config['output_dir'] + "/Counts/skin.txt"
    params:
        chrom=lambda wildcards: gene_chrom()[wildcards.SigGene]
    output:
        config['out_refbias'] + "/{SigGene}.txt"
    script:
        "Scripts/run_example.R"
with SigGene=["gene1", "gene2"]
I define the following function:
import pandas as pd

def gene_chrom(File=config['output_dir'] + "/genes2test.txt", sep=" "):
    """Make a dictionary with gene as keys and chromosome as values, from a
    file whose first column is gene_id and second column is CHROM."""
    # The file is read again every time the function is called
    data = pd.read_csv(File, sep=sep)
    keys = list(data['gene_id'])
    # Chromosomes are stored as strings (e.g. "1", "X")
    values = [str(x) for x in data['CHROM']]
    dic = dict(zip(keys, values))
    return dic
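To show what the function returns, here is a minimal self-contained sketch. The file contents are made up for illustration, and the `config`-based default path is dropped so that the snippet runs outside the Snakefile; an in-memory buffer stands in for genes2test.txt.

```python
import io

import pandas as pd


def gene_chrom(File, sep=" "):
    """Same logic as above, without the config-based default path."""
    data = pd.read_csv(File, sep=sep)
    keys = list(data['gene_id'])
    values = [str(x) for x in data['CHROM']]
    return dict(zip(keys, values))


# Hypothetical contents standing in for genes2test.txt
example = io.StringIO("gene_id CHROM\ngene1 1\ngene2 X\n")

dic = gene_chrom(example)
print(dic)  # {'gene1': '1', 'gene2': 'X'}
```

In the rule, `gene_chrom()[wildcards.SigGene]` then looks up the chromosome for the gene named by the wildcard, so the file is read once for each evaluation of the `chrom` parameter.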
I submit the rule to a cluster to run the jobs in parallel. For some jobs, I get the following error message:
FileNotFoundError in line 67 of Snakefile: [Tue Jun 23 09:47:16 2020] [Errno 2] File b'/scratch/genes2test.txt' does not exist: b'/scratch/genes2test.txt'
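The file exists and is shared by all instances of the rule. Most jobs are able to read the file and run to completion, but some fail with the error message above.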