Question

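I am writing a Python script (version 2.7) that converts every input file (.nexus format) in a specified directory to .fasta format. The Biopython function SeqIO.convert handles the conversion perfectly for individually specified files, but when I try to automate the process over a directory with os.walk, I cannot pass each input file's path to SeqIO.convert correctly. Where am I going wrong? Do I need to use join() from the os.path module and pass the full path to SeqIO.convert?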

    #Import modules
    import sys
    import re
    import os
    import fileinput

    from Bio import SeqIO

    #Specify directory of interest
    PSGDirectory = "/Users/InputDirectory"
    #Create a class that will run the SeqIO.convert function repeatedly
    def process(filename):
      count = SeqIO.convert("files", "nexus", "files.fa", "fasta", alphabet= IUPAC.ambiguous_dna)
    #Make sure os.walk works correctly
    for path, dirs, files in os.walk(PSGDirectory):
       print path
       print dirs
       print files

    #Now recursively do the count command on each file inside PSGDirectory
    for files in os.walk(PSGDirectory):
       print("Converted %i records" % count)
       process(files)

When I run the script, I get the following error message:

    Traceback (most recent call last):
      File "nexus_to_fasta.psg", line 45, in <module>
        print("Converted %i records" % count)
    NameError: name 'count' is not defined

This conversation was very helpful, but I don't know where to insert the join() call. Here is an example of one of my nexus files. Thanks for your help!

Answer 1

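There are a few things going on here.

First, your process function does not return count, so the name never exists outside the function. You probably want: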

    def process(filename):
        # assuming SeqIO.convert actually returns the number you want
        return SeqIO.convert("files", "nexus", "files.fa", "fasta", alphabet=IUPAC.ambiguous_dna)
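
The NameError is a scoping issue: count is assigned inside process, so it is a local name that is gone once the function finishes. A minimal sketch of the difference follows; fake_convert is a hypothetical stand-in for SeqIO.convert (not part of Biopython) so the snippet runs without any sequence files.

```python
def fake_convert(in_file):
    # Hypothetical stand-in for SeqIO.convert: pretend every file holds 3 records.
    return 3


def process_without_return(filename):
    count = fake_convert(filename)  # local name, discarded when the function ends


def process_with_return(filename):
    return fake_convert(filename)  # hand the record count back to the caller


lost = process_without_return("example.nexus")  # None, because nothing is returned
count = process_with_return("example.nexus")    # the caller now holds the value
print("Converted %i records" % count)
```

Note that the print has to come after the call to process, not before it as in the question's loop.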

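Also, when you write for files in os.walk(PSGDirectory), you are iterating over the 3-tuples that os.walk returns, not over individual files. You want something like this (note the use of os.path.join):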

    for root, dirs, files in os.walk(PSGDirectory):
        for filename in files:
            # files holds bare names, so join each one with its directory
            fullpath = os.path.join(root, filename)
            print(process(fullpath))
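
To see what os.walk actually yields, here is a small self-contained sketch that builds a throwaway directory tree and collects full paths. The file names a.nexus and sub/b.nexus are made up for illustration.

```python
import os
import tempfile

# Build a small throwaway tree: <top>/a.nexus and <top>/sub/b.nexus
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, "sub"))
for relative in ("a.nexus", os.path.join("sub", "b.nexus")):
    open(os.path.join(top, relative), "w").close()

# Each iteration yields (directory, subdirectory names, bare file names).
found = []
for root, dirs, files in os.walk(top):
    for filename in files:
        # filename alone is relative to root, so join them before opening it.
        found.append(os.path.join(root, filename))

print(sorted(os.path.relpath(path, top) for path in found))
```

Passing only filename to a function that opens the file would work only for files in the current working directory, which is why the join is needed.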

Update

So I looked at the documentation for SeqIO.convert, and it expects to be called with the following arguments:

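in_file - input handle or filename
in_format - input file format, lower case string
out_file - output handle or filename
out_format - output file format, lower case string
alphabet - optional alphabet to assume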

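in_file is the name of the file to convert, whereas originally you were calling SeqIO.convert with the literal string "files" for every input.

So your process function should look like this: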

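    from Bio.Alphabet import IUPAC  # not among the question's imports, but IUPAC.ambiguous_dna needs it

    def process(filename):
        return SeqIO.convert(filename, "nexus", filename + '.fa', "fasta", alphabet=IUPAC.ambiguous_dna)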
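
Putting the pieces together, a rough sketch of the whole loop might look like the following. The helper names (fasta_path, nexus_files, convert_tree, biopython_convert) are made up for this example, and the extension filter is an assumption about how the input files are named. The converter is passed in as a callable so the directory logic can be tried without Biopython installed. The alphabet argument is left out here: as far as I know, recent Biopython releases removed Bio.Alphabet, so add it back only if your installed version still supports it.

```python
import os


def fasta_path(nexus_path):
    # "dir/gene1.nexus" -> "dir/gene1.fa" rather than "dir/gene1.nexus.fa"
    base, _ext = os.path.splitext(nexus_path)
    return base + ".fa"


def nexus_files(top, extensions=(".nexus",)):
    # Yield the full path of every file under top whose name ends in one of
    # the given extensions (assumed naming convention for the inputs).
    for root, _dirs, files in os.walk(top):
        for filename in sorted(files):
            if filename.lower().endswith(extensions):
                yield os.path.join(root, filename)


def convert_tree(top, convert):
    # convert is any callable taking (in_path, out_path) and returning the
    # number of records written; the counts are summed over all files.
    total = 0
    for in_path in nexus_files(top):
        total += convert(in_path, fasta_path(in_path))
    return total


def biopython_convert(in_path, out_path):
    # Imported lazily so the helpers above can be used without Biopython.
    from Bio import SeqIO
    return SeqIO.convert(in_path, "nexus", out_path, "fasta")
```

With Biopython installed, something like print("Converted %i records" % convert_tree(PSGDirectory, biopython_convert)) would then report the total. Filtering on the extension also keeps the freshly written .fa files from being picked up as inputs if the script is run a second time.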
