gensim lemmatize error: generator raised StopIteration

Time: 2019-12-11 08:17:32

Tags: python nlp gensim lemmatization

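I'm trying to run a simple piece of code to lemmatize a string, but I get an error about iteration. I found some solutions that suggest reinstalling web.py, but that did not work for me.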

Python code

from gensim.utils import lemmatize
lemmatize("gone")

The error is

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
I:\Anaconda\lib\site-packages\pattern\text\__init__.py in _read(path, encoding, comment)
    608             yield line
--> 609     raise StopIteration
    610 

StopIteration: 

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
<ipython-input-4-9daceee1900f> in <module>
      1 from gensim.utils import lemmatize
----> 2 lemmatize("gone")

-------------------------------------------------------------------------------------

I:\Anaconda\lib\site-packages\pattern\text\__init__.py in <genexpr>(.0)
    623     def load(self):
    624         # Arnold NNP x
--> 625         dict.update(self, (x.split(" ")[:2] for x in _read(self._path) if len(x.split(" ")) > 1))
    626 
    627 #--- FREQUENCY -------------------------------------------------------------------------------------

RuntimeError: generator raised StopIteration

1 answer:

Answer 0: (score: 1)

The error message is misleading: it occurs when there is nothing that can be properly lemmatized.

By default, lemmatize() only accepts the word tags NN|VB|JJ|RB. Pass in a regular expression that matches any string to change this:

>>> import re
>>> from gensim.utils import lemmatize
>>> lemmatize("gone", allowed_tags=re.compile('.*'))
[b'go/VB']
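
A side note on the traceback in the question: its last line, RuntimeError: generator raised StopIteration, is what Python 3.7 and later produce under PEP 479 whenever a generator body executes raise StopIteration, which is what line 609 of pattern's _read does. The sketch below reproduces that behavior using only the standard library. It assumes Python 3.7 or newer, and read_lines_old_style, read_lines_fixed and try_consume are hypothetical names made up for illustration, not part of pattern or gensim.

```python
def read_lines_old_style(lines):
    """Hypothetical stand-in for a generator that ends with `raise StopIteration`."""
    for line in lines:
        yield line
    # Pre-3.7 idiom for ending a generator. Under PEP 479 (Python 3.7+),
    # this is converted into RuntimeError at the point of iteration.
    raise StopIteration


def read_lines_fixed(lines):
    """Same generator, ended with a plain return instead."""
    for line in lines:
        yield line
    return


def try_consume(gen_func, lines):
    """Exhaust the generator; return its items or a description of the error."""
    try:
        return list(gen_func(lines))
    except RuntimeError as exc:
        return "RuntimeError: " + str(exc)


old_result = try_consume(read_lines_old_style, ["Arnold NNP", "x y"])
fixed_result = try_consume(read_lines_fixed, ["Arnold NNP", "x y"])
print(old_result)
print(fixed_result)
```

On Python 3.7+ the first call should report the same RuntimeError text as the traceback, while the second returns both lines. Whether this is the path being hit in a given installation depends on the installed pattern version, so treat it as a possible explanation rather than a confirmed diagnosis.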