I have a list of sentences.
list = ["I'm hoping to go jogging", "I haven't eaten in a while","where is everybody going"]
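I want to lemmatize the list above and replace each original word with its lemma.
How can I do this with spaCy?
I know I can print the lemmas in a loop, but what I want is to replace the original words with their lemmatized forms.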
Answer 0 (score: 1)
It sounds like you are looking for this:
import spacy
# spaCy 1.x import path; spaCy 2+ loads an installed English model with spacy.load() instead
from spacy.en import English

parser = English()

# Named `sentences` rather than `list` to avoid shadowing the built-in
sentences = ["I'm hoping to go jogging", "I haven't eaten in a while", "where is everybody going",
             "Hello, how are you? I'm doing good."]

lemmatized_list = []
for sentence in sentences:
    tokens = parser(sentence)
    lemmas = []
    for tok in tokens:
        # Skip punctuation; keep the lowercased surface form for pronouns,
        # whose lemma is the placeholder "-PRON-"
        if not tok.is_punct:
            lemmas.append(tok.lemma_.lower().strip() if tok.lemma_ != "-PRON-" else tok.lower_)
    # Join the lemmas back into a single space-separated phrase
    lemmatized_list.append(" ".join(lemmas))

print(lemmatized_list)
>>> ['i be hop to go jogging', "i haven't eat in a while", 'where be everybody go', 'hello how be you i be do good']
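The same loop can be wrapped in a reusable function. This is only a minimal sketch, not part of the original answer: the name `lemmatize_sentences` is hypothetical, and it assumes `nlp` is any callable that returns tokens exposing `lemma_`, `lower_` and `is_punct`, which a loaded spaCy pipeline does. The exact lemmas you get will depend on your spaCy version and model.

```python
def lemmatize_sentences(nlp, sentences):
    """Return each sentence with every word replaced by its lemma.

    `nlp` is assumed to be a callable (for example a loaded spaCy pipeline)
    that turns a string into an iterable of tokens exposing `lemma_`,
    `lower_` and `is_punct`.
    """
    lemmatized = []
    for doc in map(nlp, sentences):
        lemmas = [
            # Some spaCy versions use the placeholder "-PRON-" as the lemma
            # of pronouns; fall back to the lowercased surface form then.
            tok.lower_ if tok.lemma_ == "-PRON-" else tok.lemma_.lower().strip()
            for tok in doc
            if not tok.is_punct  # drop punctuation tokens
        ]
        lemmatized.append(" ".join(lemmas))
    return lemmatized


# Usage (assumes spaCy and the "en_core_web_sm" model are installed):
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   print(lemmatize_sentences(nlp, ["where is everybody going"]))
```

Passing the pipeline in as an argument keeps the function independent of how the model is loaded, so it should work the same whichever English model you have installed.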