Question

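I can't figure out how to make my function replace a string that appears at a position other than 0 in a sentence.

I want to replace the string in col2 that matches the value in col1 (col1 is always lowercase) with its POS tag.

For example, I would like to get: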

input: Hello Aaa1 my very good friend
output: Hello NNP my very good friend

Right now I only get this to work:

input: Aaa1 my very good friend
output: NNP my very good friend

I want to replace the string wherever it occurs in the sentence.

What I tried:

## import libraries
from nltk import word_tokenize, pos_tag, pos_tag_sents

## tag the sentence
df['col2'] = df['col2'].apply(word_tokenize).apply(pos_tag)

## this function does the magic 
def get_vals(lst):
    op = [] 
    for i, v in enumerate(lst):
        if i == 0:
            op.append(v[1])
        else:
            op.append(v[0])
    return ' '.join(op)

## apply the function
df['col2'] = df['col2'].apply(get_vals)

print(df)

   col1                      col2
0  aaa1     NNP is a great friend
1  abb2  NN is a very good friend

Edit:

I have:

col1    col2                output
aaa1    AAA1 Hello hello    NNP Hello hello
aaa2    aaa2 hello hello    NN hello hello
aaa3    Hello AAa3 hello    Hello NNP hello

In each row I want to substitute that row's own POS tag (not just the single string NNP).

Answer 1

Using re

import re
inp = "Hello Aaa1 my very good friend"
output = "Hello NNP my very good friend"

re.sub("Aaa1", "NNP", inp)  # returns "Hello NNP my very good friend"
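
Because col1 is always lowercase while the matching word in col2 can use any casing (Aaa1, AAA1, AAa3), a case-insensitive variant may be closer to what the question asks for. The following is only a sketch: the helper name replace_word is made up for illustration, and it assumes whole-word matching is wanted and that the word starts and ends with a letter or digit.

```python
import re

def replace_word(sentence, word, tag):
    """Replace every whole-word, case-insensitive occurrence of `word` with `tag`."""
    # re.escape protects against regex metacharacters inside `word`;
    # \b keeps "aaa1" from matching inside a longer token such as "aaa12".
    pattern = r"\b" + re.escape(word) + r"\b"
    # A function replacement inserts `tag` literally (no backslash processing).
    return re.sub(pattern, lambda match: tag, sentence, flags=re.IGNORECASE)

print(replace_word("Hello Aaa1 my very good friend", "aaa1", "NNP"))
```

The position of the word in the sentence does not matter here, since re.sub scans the whole string.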

Using pandas

import pandas as pd
df = pd.DataFrame(data={"col": [inp]})
df["col"].str.replace("Aaa1", "NNP")
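
The snippets above hard-code both the word and the tag. For the edited question, where each row has its own word in col1 and should receive whatever tag the tagger assigns, one possible approach is to compare tokens instead of positions. This is a sketch, not a tested solution for the asker's data: replace_with_tag is a hypothetical helper, and the tags below are written by hand to stand in for pos_tag output so the example runs without NLTK data.

```python
def replace_with_tag(tagged, target):
    """Join tokens back into a sentence, swapping in the POS tag for any
    token whose lowercase form equals `target`.

    tagged: list of (token, tag) pairs, the shape nltk.pos_tag returns
    target: the lowercase string from col1
    """
    return " ".join(
        tag if token.lower() == target else token
        for token, tag in tagged
    )

# Hand-written tags (an assumption; real tags come from the tagger).
tagged = [("Hello", "UH"), ("AAa3", "NNP"), ("hello", "NN")]
print(replace_with_tag(tagged, "aaa3"))
```

With NLTK available, this could presumably be applied per row with something like df["output"] = [replace_with_tag(pos_tag(word_tokenize(s)), w) for w, s in zip(df["col1"], df["col2"])], replacing the i == 0 check in get_vals.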
