MemoryError when reading a FASTA alignment with BioPython

Time: 2015-01-08 17:22:26

Tags: python cluster-computing biopython

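I am using BioPython to read an alignment of 120 sequences (about 20 kb each) on my workplace's compute cluster. My code runs perfectly on my own computer (Mac OS X Mavericks, run time < 1 second), and it also works when I use Python interactively on the cluster. However, when I run the longer program that contains this code (shown below), I get a MemoryError.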

from Bio import AlignIO
alignment = AlignIO.read("my_aligment.fasta","fasta")

The traceback looks like this:

Traceback (most recent call last):
  File "./my_program.py", line 42, in <module>
    alignment = AlignIO.read(align, "fasta")
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/AlignIO/__init__.py", line 423, in read
    first = next(iterator)
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/AlignIO/__init__.py", line 370, in parse
    for a in i:
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/AlignIO/__init__.py", line 270, in _SeqIO_to_alignment_iterator
    records = list(SeqIO.parse(handle, format, alphabet))
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/SeqIO/__init__.py", line 586, in parse
    for r in i:
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/SeqIO/FastaIO.py", line 121, in FastaIterator
    for title, sequence in SimpleFastaParser(handle):
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/SeqIO/FastaIO.py", line 55, in SimpleFastaParser
    line = handle.readline()
MemoryError

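Maybe I am running into a memory limit on the server, but I thought the whole point of using AlignIO was that it is very memory-efficient. I am confused because I can run that same line of code successfully on the cluster when I use Python interactively.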
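Since the same line works interactively but fails inside the longer program, one thing that might be worth comparing is the memory limit each environment sees. The following is only a minimal diagnostic sketch using the standard library `resource` module (Unix only). The helper names `format_limit` and `describe_environment` are made up for illustration, and whether the cluster actually imposes a different limit on non-interactive runs is an assumption, not something confirmed here.

```python
import os
import resource
import sys


def format_limit(value):
    """Render a byte count (or an unlimited rlimit) as a readable string."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "%.1f MiB" % (value / (1024.0 * 1024.0))


def describe_environment(path=None):
    """Collect the address-space limits and, optionally, the input file size.

    `path` is a hypothetical argument: the alignment file to inspect.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    info = {
        "python": sys.version.split()[0],
        "address_space_soft": format_limit(soft),
        "address_space_hard": format_limit(hard),
    }
    if path is not None and os.path.exists(path):
        # Confirms the program is opening the file (and size) that is expected.
        info["file_size"] = format_limit(os.path.getsize(path))
    return info


if __name__ == "__main__":
    # Assumed usage: pass the alignment path as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else None
    for key, value in sorted(describe_environment(target).items()):
        print("%s: %s" % (key, value))
```

Running this once in the interactive session and once from inside the failing program, with the same path that is passed to AlignIO.read, would show whether the two environments differ in their limits or in the file they actually open.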

Other relevant information: Python version 2.7.1, Biopython version 1.64

Thanks for any help or suggestions!

0 answers:

No answers