How to replace a string at different positions in a sentence using another column

Time: 2018-04-29 09:42:49

Tags: python python-3.x pandas

I'm having trouble getting my function to replace a string that appears at a position other than 0 in the sentence.

I want to replace the string in col2 that matches the string in col1 (col1 is always lowercase) with its POS tag.

For example, this is what I want to get:

input: Hello Aaa1 my very good friend
output: Hello NNP my very good friend

Right now it only works when the string is the first word:

input: Aaa1 my very good friend
output: NNP my very good friend

I want to replace the string wherever it appears in the sentence.

What I tried:

## import libraries
from nltk import word_tokenize, pos_tag, pos_tag_sents

## tag the sentence
df['col2'] = df['col2'].apply(word_tokenize).apply(pos_tag)

## keep the POS tag for the first token and the original word for every other token
def get_vals(lst):
    op = [] 
    for i, v in enumerate(lst):
        if i == 0:
            op.append(v[1])
        else:
            op.append(v[0])
    return ' '.join(op)

## apply the function
df['col2'] = df['col2'].apply(get_vals)

print(df)

   col1                      col2
0  aaa1     NNP is a great friend
1  abb2  NN is a very good friend

Edit:

I have:

col1           col2                output
aaa1          AAA1 Hello hello     NNP Hello hello
aaa2          aaa2 hello hello     NN hello hello
aaa3          Hello AAa3 hello     Hello NNP hello

In each row I want to substitute the POS tag specific to that row's string (not just the single string NNP).

1 answer:

Answer 0 (score: 0)

Using re

import re
inp = "Hello Aaa1 my very good friend"
output = "Hello NNP my very good friend"

# apply the substitution to the input sentence; the result equals `output`
re.sub("Aaa1", "NNP", inp)
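
The call above hardcodes "Aaa1", while in the question col1 is always lowercase and col2 may use any casing. A minimal sketch of a case-insensitive variant follows; the helper name `replace_ignore_case` is made up for illustration, and the tag is passed in by the caller rather than computed.

```python
import re

def replace_ignore_case(sentence, target, tag):
    """Replace whole-word occurrences of `target` in `sentence` with `tag`,
    ignoring case. `replace_ignore_case` is a hypothetical helper name."""
    # re.escape guards against regex metacharacters in the target string;
    # \b keeps "aaa1" from matching inside a longer word such as "aaa10".
    pattern = r"\b" + re.escape(target) + r"\b"
    return re.sub(pattern, tag, sentence, flags=re.IGNORECASE)

print(replace_ignore_case("Hello AAa3 hello", "aaa3", "NNP"))  # Hello NNP hello
```

This still needs the tag for each row from somewhere, so it only covers the matching half of the problem.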

Using pandas

import pandas as pd
df = pd.DataFrame(data={"col": [inp]})
df["col"].str.replace("Aaa1", "NNP")
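
To get a different tag per row, one option is to change `get_vals` from the question so that it compares each token with the col1 value instead of checking `i == 0`. The sketch below assumes the tokens are already tagged as `(word, tag)` pairs, the way `pos_tag` returns them; the helper name `replace_with_tag` and the hand-written tags are illustrative only, not real tagger output.

```python
def replace_with_tag(tagged, target):
    """Join tagged tokens into a sentence, swapping every token that equals
    `target` (case-insensitively) for its POS tag.
    `replace_with_tag` is a hypothetical helper name."""
    target = target.lower()
    return ' '.join(tag if word.lower() == target else word
                    for word, tag in tagged)

# Hand-written (word, tag) pairs stand in for pos_tag output (an assumption).
tagged = [("Hello", "NNP"), ("AAa3", "NNP"), ("hello", "NN")]
print(replace_with_tag(tagged, "aaa3"))  # Hello NNP hello
```

With the question's DataFrame, once col2 holds the tagged lists, this could be applied row by row with something like `df.apply(lambda row: replace_with_tag(row['col2'], row['col1']), axis=1)`.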