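I am using BioPython to read an alignment of 120 sequences (each about 20 kb) on my workplace's computing cluster. My code runs perfectly on my own computer (Mac OS X Mavericks, run time < 1 second), and it also works when I use Python interactively on the cluster. However, when I run the longer program that contains this code (below), I get a MemoryError.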
from Bio import AlignIO
alignment = AlignIO.read("my_aligment.fasta","fasta")
The traceback looks like this:
Traceback (most recent call last):
  File "./my_program.py", line 42, in <module>
    alignment = AlignIO.read(align, "fasta")
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/AlignIO/__init__.py", line 423, in read
    first = next(iterator)
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/AlignIO/__init__.py", line 370, in parse
    for a in i:
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/AlignIO/__init__.py", line 270, in _SeqIO_to_alignment_iterator
    records = list(SeqIO.parse(handle, format, alphabet))
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/SeqIO/__init__.py", line 586, in parse
    for r in i:
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/SeqIO/FastaIO.py", line 121, in FastaIterator
    for title, sequence in SimpleFastaParser(handle):
  File "/workplace/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/Bio/SeqIO/FastaIO.py", line 55, in SimpleFastaParser
    line = handle.readline()
MemoryError
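Since the traceback ends in handle.readline(), one thing that might be worth ruling out is that the file the program actually opens contains an unexpectedly long line (or is not the file I think it is). The following is only a rough sketch of such a check, not part of my program: the helper name longest_line_length is made up for illustration, and it reads the file in fixed-size chunks so that the check itself does not need to hold a whole line in memory.

```python
import os


def longest_line_length(path, chunk_size=1 << 20):
    """Return (file_size_bytes, longest_line_bytes) for the file at `path`.

    The file is read in fixed-size binary chunks, so a single very long
    line never has to be loaded into memory in one piece.
    """
    longest = 0
    current = 0  # length of the line currently being scanned
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            parts = chunk.split(b"\n")
            # parts[0] continues the line carried over from the previous chunk
            current += len(parts[0])
            if len(parts) > 1:
                # at least one newline was seen, so the carried line is complete
                longest = max(longest, current)
                for part in parts[1:-1]:
                    longest = max(longest, len(part))
                # parts[-1] is the start of a line that may continue
                current = len(parts[-1])
    # account for a final line that has no trailing newline
    longest = max(longest, current)
    return os.path.getsize(path), longest
```

Calling it on the path that the program passes to AlignIO.read (the variable align in the traceback) would show whether the file size and the longest line are in the range I expect for 120 sequences of about 20 kb.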
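Maybe I am hitting a memory limit on the server, but I thought the whole point of using AlignIO was that it is very memory-efficient. I am confused because the same line of code runs successfully on the cluster when I use Python interactively.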
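To compare the two situations, it may help to print the memory limits the Python process sees, once from the interactive session and once from inside the longer program. This is a minimal sketch using the standard-library resource module (Unix only); it assumes, without my having confirmed it, that the two environments could be running under different limits. The helper names format_limit and memory_limits are made up for illustration.

```python
from __future__ import print_function

import resource


def format_limit(value):
    """Render a limit in MiB, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "%.1f MiB" % (value / (1024.0 * 1024.0))


def memory_limits():
    """Return {limit_name: (soft, hard)} for the memory-related limits.

    Limits that the platform does not define are skipped.
    """
    names = ("RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_STACK")
    limits = {}
    for name in names:
        const = getattr(resource, name, None)
        if const is None:
            continue
        soft, hard = resource.getrlimit(const)
        limits[name] = (format_limit(soft), format_limit(hard))
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in sorted(memory_limits().items()):
        print("%s: soft=%s hard=%s" % (name, soft, hard))
```

If the values differ between the interactive session and the program run, that would point to the environment rather than to AlignIO itself.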
Other relevant information: Python version 2.7.1, Biopython version 1.64
Thanks for any help or advice!