Is there a way to correctly remove tense or plural from a word?

Time: 2015-04-12 14:06:05

Tags: python nltk

Is it possible to use nltk to change running, helping, cooks, finds and happily into run, help, cook, find and happy?

2 answers:

Answer 0 (score: 4)

NLTK implements several stemming algorithms. It looks like the Lancaster stemming algorithm will work for you.

>>> from nltk.stem.lancaster import LancasterStemmer
>>> st = LancasterStemmer()
>>> st.stem('happily')
'happy'
>>> st.stem('cooks')
'cook'
>>> st.stem('helping')
'help'
>>> st.stem('running')
'run'
>>> st.stem('finds')
'find'
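
Since NLTK ships more than one stemmer, it may help to run the same words through several of them before settling on one. The following is a minimal sketch, not part of the original answer; the helper name `stem_all` and the choice of stemmers to compare are assumptions made for illustration.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer

# The words from the question.
words = ["running", "helping", "cooks", "finds", "happily"]

# Three rule-based stemmers bundled with NLTK; none of them needs corpus data.
stemmers = {
    "porter": PorterStemmer(),
    "lancaster": LancasterStemmer(),
    "snowball": SnowballStemmer("english"),
}


def stem_all(words, stemmers):
    """Hypothetical helper: return {stemmer_name: [stem, ...]} for the given words."""
    return {name: [s.stem(w) for w in words] for name, s in stemmers.items()}


results = stem_all(words, stemmers)
for name, stems in results.items():
    print(name, stems)
```

Stemmers strip suffixes by rule, so the results can differ between algorithms and are not guaranteed to be dictionary words. If real words are required, a lemmatizer is usually the safer choice.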

Answer 1 (score: 3)

>>> from nltk.stem import WordNetLemmatizer
>>> wnl = WordNetLemmatizer()
>>> ls = ['running', 'helping', 'cooks', 'finds']
>>> [wnl.lemmatize(i) for i in ls]
['running', 'helping', u'cook', u'find']
>>> ls = [('running', 'v'), ('helping', 'v'), ('cooks', 'v'), ('finds','v')]
>>> [wnl.lemmatize(word, pos) for word, pos in ls]
[u'run', u'help', u'cook', u'find']
>>> ls = [('running', 'n'), ('helping', 'n'), ('cooks', 'n'), ('finds','n')]
>>> [wnl.lemmatize(word, pos) for word, pos in ls]
['running', 'helping', u'cook', u'find']
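
The session above suggests that `lemmatize` treats a word as a noun unless a part of speech is passed in, so 'running' and 'helping' only reduce to 'run' and 'help' when tagged 'v'. One way to avoid hard-coding the tags is to derive them from `nltk.pos_tag`. The sketch below is an illustration, not part of the original answer: the helpers `penn_to_wordnet` and `lemmatize_tokens` are hypothetical names, and the code assumes the WordNet corpus and the default POS tagger model have already been installed with `nltk.download`.

```python
def penn_to_wordnet(tag):
    """Hypothetical helper: map a Penn Treebank tag to a WordNet POS letter."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"      # fall back to noun, matching the lemmatizer's default


def lemmatize_tokens(tokens):
    """Tag each token, then lemmatize it with the matching WordNet POS."""
    # Imported here so the mapping above can be used without NLTK data installed.
    from nltk import pos_tag
    from nltk.stem import WordNetLemmatizer

    wnl = WordNetLemmatizer()
    return [wnl.lemmatize(tok, penn_to_wordnet(tag)) for tok, tag in pos_tag(tokens)]
```

A call such as `lemmatize_tokens("he is running and helping".split())` tags the words in context first. The tagger tends to work better on full sentences than on isolated words, so the quality of the lemmas depends on the quality of the tags.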

See Porter Stemming of fried