Question

How can I replace a string in a sentence with its POS tag, using another column?

I want to replace the string in col2 that matches col1 with its POS tag.

For example:

col1    col2                           output
mtmb2   MTMB2 is a my sentence         NNP is a my sentence
mmm2    Your MmM2 is my sentence       Your NNP is my sentence
bbb2    Your sentence is bbb2          Your sentence is NN

I tried @YOLO's solution:

## import libraries
from nltk import word_tokenize, pos_tag, pos_tag_sents

## tag the sentence
df['col2'] = df['col2'].apply(word_tokenize).apply(pos_tag)

## replace the first token with its POS tag and keep the remaining tokens
def get_vals(lst):
    op = []
    for i, v in enumerate(lst):
        if i == 0:
            op.append(v[1])
        else:
            op.append(v[0])
    return ' '.join(op)

## apply the function
df['col2'] = df['col2'].apply(get_vals)

print(df)

   col1                      col2
0  aaa1     NNP is a great friend
1  abb2  NN is a very good friend

However, this solution only works when the word to replace is the first token of the sentence...
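
One possible way around the first-index limitation, as a minimal sketch rather than a definitive answer: compare every token against the col1 value, ignoring case, and swap in the tag wherever they match. The helper name `replace_with_tag` is hypothetical, and the (token, tag) pairs below are written by hand to stand in for `pos_tag` output, so the actual tags depend on the tagger. Joining with spaces also assumes plain whitespace-separated sentences like the ones in the example.

```python
def replace_with_tag(target, tagged):
    """Swap every token equal to `target` (case-insensitive) for its POS tag.

    `tagged` is a list of (token, tag) pairs, such as the result of
    pos_tag(word_tokenize(sentence)).
    """
    target = target.lower()
    return ' '.join(tag if tok.lower() == target else tok
                    for tok, tag in tagged)


# Hand-written (token, tag) pairs standing in for real pos_tag output
tagged = [('Your', 'PRP$'), ('MmM2', 'NNP'), ('is', 'VBZ'),
          ('my', 'PRP$'), ('sentence', 'NN')]
print(replace_with_tag('mmm2', tagged))  # Your NNP is my sentence
```

With col2 already tagged as in the attempt above, this could be applied row by row with something like `df.apply(lambda r: replace_with_tag(r['col1'], r['col2']), axis=1)`, since the function needs both columns at once.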


0 answers: