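I created a DataFrame named result. Each cell contains a long text that includes names, and I need to identify those names. I used the following code: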
import nltk
for col in result:
    for i in range(result.shape[0]):
        print(result[col][i])
        if not result[col][i]:
            continue
        else:
            entities = nltk.ne_chunk(result[col][i])
            for entity in entities:
                print(entity)
                if type(entity) == nltk.tree.Tree:
                    print(entity)
The error I get is:
ERROR:root:message
Traceback (most recent call last):
  File "<ipython-input-99-8b2514840dbb>", line 10, in <module>
    entities = nltk.ne_chunk(result[col][1])
  File "C:\Users\II00083764\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\chunk\__init__.py", line 186, in ne_chunk
    return chunker.parse(tagged_tokens)
  File "C:\Users\II00083764\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\chunk\named_entity.py", line 128, in parse
    tagged = self._tagger.tag(tokens)
  File "C:\Users\II00083764\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\tag\sequential.py", line 64, in tag
    tags.append(self.tag_one(tokens, i, tags))
  File "C:\Users\II00083764\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\tag\sequential.py", line 84, in tag_one
    tag = tagger.choose_tag(tokens, index, history)
  File "C:\Users\II00083764\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\tag\sequential.py", line 652, in choose_tag
    featureset = self.feature_detector(tokens, index, history)
  File "C:\Users\II00083764\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\tag\sequential.py", line 699, in feature_detector
    return self._feature_detector(tokens, index, history)
  File "C:\Users\II00083764\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\chunk\named_entity.py", line 59, in _feature_detector
    pos = simplify_pos(tokens[index][1])
IndexError: string index out of range
Can anyone help me?
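
A possible explanation, offered as a sketch rather than a confirmed diagnosis: nltk.ne_chunk expects a list of POS-tagged tokens, that is (word, tag) pairs, while the loop passes it a raw string. The failing expression in the traceback, tokens[index][1], would then index into a single character. The snippet below first reproduces that mechanism in plain Python, then shows one way to tokenize and tag before chunking. The helper names second_field and extract_entities and the sample sentence are hypothetical, and the NLTK part assumes the tokenizer, tagger, and NE chunker data have already been fetched with nltk.download.

```python
def second_field(tokens, index):
    """Mimic the failing expression tokens[index][1] from the traceback."""
    return tokens[index][1]


# A raw string indexes as single characters, so [1] on one character is out of range.
raw = "John lives in Paris"  # hypothetical sample text
try:
    second_field(raw, 0)
    raw_fails = False
except IndexError:
    raw_fails = True

# A POS-tagged list holds (word, tag) pairs, so [1] returns the tag.
tagged = [("John", "NNP"), ("lives", "VBZ"), ("in", "IN"), ("Paris", "NNP")]
first_tag = second_field(tagged, 0)


def extract_entities(text):
    """Return (label, phrase) pairs for the named entities found in text."""
    # Skip empty cells and non-string values such as NaN or None.
    if not isinstance(text, str) or not text.strip():
        return []
    import nltk  # imported here so the guard above works without NLTK installed

    # Assumption: the required NLTK data packages are already downloaded.
    tagged_tokens = nltk.pos_tag(nltk.word_tokenize(text))
    tree = nltk.ne_chunk(tagged_tokens)
    # Named entities appear as subtrees; ordinary tokens stay as plain tuples.
    return [
        (subtree.label(), " ".join(word for word, _ in subtree.leaves()))
        for subtree in tree
        if isinstance(subtree, nltk.Tree)
    ]
```

Under these assumptions, the original loop could call extract_entities(result[col][i]) in place of nltk.ne_chunk(result[col][i]). The isinstance check also covers NaN cells, which "if not result[col][i]" would let through because NaN is truthy.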