我正在尝试根据词性对字符串进行词形排列,但在最后阶段,我遇到了错误。我的代码:
import nltk
from nltk.stem import *
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import wordnet
wordnet_lemmatizer = WordNetLemmatizer()
text = word_tokenize('People who help the blinging lights are the way of the future and are heading properly to their goals')
tagged = nltk.pos_tag(text)
def get_wordnet_pos(treebank_tag):
if treebank_tag.startswith('J'):
return wordnet.ADJ
elif treebank_tag.startswith('V'):
return wordnet.VERB
elif treebank_tag.startswith('N'):
return wordnet.NOUN
elif treebank_tag.startswith('R'):
return wordnet.ADV
else:
return ''
for word in tagged: print(wordnet_lemmatizer.lemmatize(word,pos='v'), end=" ")
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-40-afb22c78f770> in <module>()
----> 1 for word in tagged: print(wordnet_lemmatizer.lemmatize(word,pos='v'), end=" ")
E:\Miniconda3\envs\uol1\lib\site-packages\nltk\stem\wordnet.py in lemmatize(self, word, pos)
38
39 def lemmatize(self, word, pos=NOUN):
---> 40 lemmas = wordnet._morphy(word, pos)
41 return min(lemmas, key=len) if lemmas else word
42
E:\Miniconda3\envs\uol1\lib\site-packages\nltk\corpus\reader\wordnet.py in _morphy(self, form, pos)
1710
1711 # 1. Apply rules once to the input to get y1, y2, y3, etc.
-> 1712 forms = apply_rules([form])
1713
1714 # 2. Return all that are in the database (and check the original too)
E:\Miniconda3\envs\uol1\lib\site-packages\nltk\corpus\reader\wordnet.py in apply_rules(forms)
1690 def apply_rules(forms):
1691 return [form[:-len(old)] + new
-> 1692 for form in forms
1693 for old, new in substitutions
1694 if form.endswith(old)]
E:\Miniconda3\envs\uol1\lib\site-packages\nltk\corpus\reader\wordnet.py in <listcomp>(.0)
1692 for form in forms
1693 for old, new in substitutions
-> 1694 if form.endswith(old)]
1695
1696 def filter_forms(forms):
我希望能够同时根据每个单词的词性对该字符串进行词形变换。请帮忙。
答案 0 :(得分:1)
首先,尽量不要将顶级,绝对和相对导入混合起来:
python setup.py bdist_rpm
这会更好:
import nltk
from nltk.stem import *
from nltk import pos_tag, word_tokenize
(见Absolute vs. explicit relative import of Python module)
您获得的错误很可能是因为您输入from nltk import sent_tokenize, word_tokenize
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet as wn
的输出作为pos_tag
的输入,即:
WordNetLemmatizer.lemmatize()
>>> from nltk import pos_tag
>>> from nltk.stem import WordNetLemmatizer
>>> wnl = WordNetLemmatizer()
>>> sent = 'People who help the blinging lights are the way of the future and are heading properly to their goals'.split()
>>> pos_tag(sent)
[('People', 'NNS'), ('who', 'WP'), ('help', 'VBP'), ('the', 'DT'), ('blinging', 'NN'), ('lights', 'NNS'), ('are', 'VBP'), ('the', 'DT'), ('way', 'NN'), ('of', 'IN'), ('the', 'DT'), ('future', 'NN'), ('and', 'CC'), ('are', 'VBP'), ('heading', 'VBG'), ('properly', 'RB'), ('to', 'TO'), ('their', 'PRP$'), ('goals', 'NNS')]
>>> pos_tag(sent)[0]
('People', 'NNS')
>>> first_word = pos_tag(sent)[0]
>>> wnl.lemmatize(first_word)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/nltk/stem/wordnet.py", line 40, in lemmatize
lemmas = wordnet._morphy(word, pos)
File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/reader/wordnet.py", line 1712, in _morphy
forms = apply_rules([form])
File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/reader/wordnet.py", line 1694, in apply_rules
if form.endswith(old)]
AttributeError: 'tuple' object has no attribute 'endswith'
的输入应该是WordNetLemmatizer.lemmatize()
而不是元组,所以如果你这样做:
str
或者,如果您喜欢简单的出路:
>>> tagged_sent = pos_tag(sent)
>>> def penn2morphy(penntag, returnNone=False):
... morphy_tag = {'NN':wn.NOUN, 'JJ':wn.ADJ,
... 'VB':wn.VERB, 'RB':wn.ADV}
... try:
... return morphy_tag[penntag[:2]]
... except:
... return None if returnNone else ''
...
>>> for word, tag in tagged_sent:
... wntag = penn2morphy(tag)
... if wntag:
... print wnl.lemmatize(word, pos=wntag)
... else:
... print word
...
People
who
help
the
blinging
light
be
the
way
of
the
future
and
be
head
properly
to
their
goal
然后:
pip install pywsd