How to apply stemming to a Pandas DataFrame column
I am using this function for lemmatization, and it works perfectly on a single string:
xx='kenichan dived times ball managed save 50 rest'
def make_to_base(x):
    x_list = []
    doc = nlp(x)
    for token in doc:
        lemma = str(token.lemma_)
        if lemma == '-PRON-' or lemma == 'be':
            lemma = token.text
        x_list.append(lemma)
    print(" ".join(x_list))
make_to_base(xx)
But when I apply this function to my pandas DataFrame column, it neither works nor raises any error:
x = list(df['text']) #my df column
x = str(x)  # converting to a string, otherwise it raises an error
make_to_base(x)
I tried different approaches, but nothing worked. For example:
df["texts"] = df.text.apply(lambda x: make_to_base(x))
make_to_base(df['text'])
My dataset looks like this:
df['text'].head()
Out[17]:
0 Hope you are having a good week. Just checking in
1 K..give back my thanks.
2 Am also doing in cbe only. But have to pay.
3 complimentary 4 STAR Ibiza Holiday or £10,000 ...
4 okmail: Dear Dave this is your final notice to...
Name: text, dtype: object
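Answer 0 (score: 1)
You need to actually return the value computed inside make_to_base. As written, the function only prints the result and implicitly returns None, so apply has nothing to store in the new column. Use: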
def make_to_base(x):
    x_list = []
    for token in nlp(x):
        lemma = str(token.lemma_)
        if lemma == '-PRON-' or lemma == 'be':
            lemma = token.text
        x_list.append(lemma)
    return " ".join(x_list)
Then use:
df['texts'] = df['text'].apply(lambda x: make_to_base(x))
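To see why the return statement matters, here is a minimal sketch that runs without spaCy or pandas. The functions print_base and return_base are hypothetical stand-ins (they only lowercase the text, they do not lemmatize), and a list comprehension stands in for apply, which likewise calls the function once per element and keeps whatever it returns.

```python
def print_base(text):
    # Stand-in for the question's version: prints the result,
    # so the function implicitly returns None.
    print(" ".join(word.lower() for word in text.split()))


def return_base(text):
    # Stand-in for the corrected version: returns the joined string.
    return " ".join(word.lower() for word in text.split())


texts = ["Hope you are having a good week", "K..give back my thanks."]

# Collect the return value of each call, as apply would for each row.
printed = [print_base(t) for t in texts]
returned = [return_base(t) for t in texts]

print(printed)   # [None, None]
print(returned)  # ['hope you are having a good week', 'k..give back my thanks.']
```

The printing version still shows text on the console, which is why it appears to work on a single string, but the collected values are all None.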