Modifying a Python script to run on multiple input files

Time: 2017-08-28 15:24:29

Tags: python filenames

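I am new to Python. I have a Python script that processes one specific file (input1.txt) and produces one output (output1.fasta), but I would like to run it on multiple files, for example input2.txt, input3.txt, and so on, and produce the corresponding outputs: output2.fasta, output3.fasta.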

I tried adding the glob function, but I don't know how to handle the output file names.

from Bio import SeqIO

fasta_file = "sequences.txt" 
wanted_file = "input1.txt" 
result_file = "output1.fasta" 

wanted = set()
with open(wanted_file) as f:
    for line in f:
        line = line.strip()
        if line != "":
            wanted.add(line)
fasta_sequences = SeqIO.parse(open(fasta_file),'fasta')
with open(result_file, "w") as f:
    for seq in fasta_sequences:
        if seq.id in wanted:
            SeqIO.write([seq], f, "fasta")

The error message is: NameError: name 'result_file' is not defined

1 answer:

Answer 0: (score: 3)

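Your glob is picking up your "sequences" file along with the inputs, because *.txt also matches sequences.txt. If the "fasta" file is always the same and you only want to iterate over the input files, then you need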

for filename in glob.glob('input*.txt'):
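
To see why the narrower pattern matters, here is a minimal sketch using the standard-library fnmatch module, which glob relies on for pattern matching. The list of file names is hypothetical and only stands in for a directory listing.

```python
import fnmatch

# Hypothetical directory contents, used only for illustration.
names = ["sequences.txt", "input1.txt", "input2.txt", "output1.fasta"]

# "*.txt" matches every .txt file, including the FASTA source file.
all_txt = fnmatch.filter(names, "*.txt")

# "input*.txt" matches only the files that list the wanted IDs.
inputs_only = fnmatch.filter(names, "input*.txt")

print(all_txt)      # ['sequences.txt', 'input1.txt', 'input2.txt']
print(inputs_only)  # ['input1.txt', 'input2.txt']
```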

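Also, to loop over the whole process, you probably want to put it in a function. If the output file name always corresponds to the input file name, you can build it dynamically.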

import glob

from Bio import SeqIO

def create_fasta_outputs(fasta_file, wanted_file):
    # Derive the output name from the input name, e.g. input2.txt -> output2.fasta
    result_file = wanted_file.replace("input", "output").replace(".txt", ".fasta")

    # Collect the non-empty sequence IDs listed in the input file
    wanted = set()
    with open(wanted_file) as f:
        for line in f:
            line = line.strip()
            if line != "":
                wanted.add(line)

    # Write only the records whose ID is in the wanted set
    fasta_sequences = SeqIO.parse(fasta_file, "fasta")
    with open(result_file, "w") as f:
        for seq in fasta_sequences:
            if seq.id in wanted:
                SeqIO.write([seq], f, "fasta")

fasta_file = "sequences.txt"
for wanted_file in glob.glob("input*.txt"):
    create_fasta_outputs(fasta_file, wanted_file)
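
One caveat with chaining str.replace is that it rewrites every occurrence of "input" and ".txt", including any in a directory name. If the input files might live in such a directory, a slightly more defensive variant could look like the sketch below. The helper name output_path_for is hypothetical and not part of the original answer.

```python
import re
from pathlib import Path

def output_path_for(wanted_file):
    """Map e.g. 'input2.txt' to 'output2.fasta' in the same directory."""
    path = Path(wanted_file)
    # Replace only a leading "input" in the file stem, not elsewhere in the path.
    new_stem = re.sub(r"^input", "output", path.stem)
    return path.with_name(new_stem + ".fasta")

print(output_path_for("input2.txt"))
print(output_path_for("input_data/input3.txt"))
```

Inside create_fasta_outputs, the result could be assigned to result_file directly, since open() accepts Path objects.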