Python: NLTK with WordNet gives a MemoryError when computing synonyms

Asked: 2016-04-08 19:05:09

Tags: python nltk wordnet

I am trying to run this code, in which I look up the synsets of two words and compute the similarity between them. The Python code raises a MemoryError, as shown below:

Code:

from itertools import product
from nltk.corpus import wordnet as wn

def wordSim(word1, word2):
    # word1 and word2 are (word, synsets) pairs produced by genSynsets
    maxscore = 0.0
    word1_synsets = word1[1]
    word2_synsets = word2[1]
    for k, j in product(word1_synsets, word2_synsets):
        score = k.wup_similarity(j)  # Wu-Palmer Similarity; may be None
        if score is not None and score > maxscore:
            maxscore = score
    if maxscore >= 0.85:
        return True

def genSynsets(wordList):
    # Pair each word with its list of WordNet synsets
    synsetList = map(lambda x: [x, wn.synsets(x.decode('utf-8'))], wordList)
    return synsetList
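
For reference, a minimal sketch of how these functions might be driven, assuming the Python 2 environment shown in the traceback (the word list is purely illustrative; the actual calling code is not shown in the question):

words = ['car', 'automobile', 'banana']  # hypothetical input
wordsWithSynsets = genSynsets(words)

# Compare every distinct pair of (word, synsets) entries
for w1, w2 in product(wordsWithSynsets, wordsWithSynsets):
    if w1[0] != w2[0] and wordSim(w1, w2):
        print('%s and %s are considered similar' % (w1[0], w2[0]))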

Error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/global/python/2.7.5/lib/python2.7/site-packages/nltk/corpus/util.py", line 99, in __getattr__
    self.__load()
  File "/global/python/2.7.5/lib/python2.7/site-packages/nltk/corpus/util.py", line 67, in __load
    corpus = self.__reader_cls(root, *self.__args, **self.__kwargs)
  File "/global/python/2.7.5/lib/python2.7/site-packages/nltk/corpus/reader/wordnet.py", line 1045, in __init__
    self._load_lemma_pos_offset_map()
  File "/global/python/2.7.5/lib/python2.7/site-packages/nltk/corpus/reader/wordnet.py", line 1137, in _load_lemma_pos_offset_map
    self._lemma_pos_offset_map[lemma][pos] = synset_offsets
MemoryError

1 Answer:

Answer 0 (score: 0)

From the Python docs:

exception MemoryError: Raised when an operation runs out of memory...

If you believe you still have plenty of free memory, chances are you are running a 32-bit Python and hitting its 2 or 3 GB per-process limit. If possible, use a 64-bit Python. See Why Python Memory Error with list append() lots of RAM left
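
As a quick check (a minimal sketch, not part of the original answer), the interpreter's pointer size tells you whether you are running a 32-bit or 64-bit build:

import struct
import sys

# A 32-bit build reports a 4-byte pointer (sys.maxsize == 2**31 - 1);
# a 64-bit build reports 8 bytes.
print('%d-bit Python' % (struct.calcsize('P') * 8))
print('sys.maxsize = %d' % sys.maxsize)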