Background
I have the following sample df:
import pandas as pd
df = pd.DataFrame({'Text' : ['Jon J Smith is Here from **PHI** until **PHI**',
'No P_Name Found here',
'Jane Ann Doe is Also here until **PHI** ',
'**PHI** was **PHI** Tom Tucker is Not here but **PHI** '],
'P_ID': [1,2,3,4],
'P_Name' : ['Smith, Jon J', 'Rider, Mary', 'Doe, Jane Ann', 'Tucker, Tom'],
'N_ID' : ['A1', 'A2', 'A3', 'A4']
})
#rearrange columns
df = df[['Text','N_ID', 'P_ID', 'P_Name']]
df
Text N_ID P_ID P_Name
0 Jon J Smith is Here from **PHI** until **PHI** A1 1 Smith, Jon J
1 No P_Name Found here A2 2 Rider, Mary
2 Jane Ann Doe is Also here until **PHI** A3 3 Doe, Jane Ann
3 **PHI** was **PHI** Tom Tucker is Not here but A4 4 Tucker, Tom
Goal
1) In the Text column, replace the value that corresponds to the name found in P_Name (e.g. Jon J Smith) with **PHI**
Desired Output
Text N_ID P_ID P_Name
0 **PHI** is Here from **PHI** until **PHI** A1 1 Smith, Jon J
1 No P_Name Found here A2 2 Rider, Mary
2 **PHI** is Also here until **PHI** A3 3 Doe, Jane Ann
3 **PHI** was **PHI** **PHI** is Not here but A4 4 Tucker, Tom
The desired output can go in the same Text column, or in a new column (new_col).
Question
How can I achieve the desired output?
Answer 0 (score: 2)
One approach:
>>> df['Text'].replace(df['P_Name'].str.split(', *').apply(lambda l: ' '.join(l[::-1])).tolist(),'**PHI**',regex=True)
0 **PHI** is here from **PHI** until **PHI**
1 No P_Name found here
2 **PHI** is also here until **PHI**
3 **PHI** was **PHI** **PHI** is not here but **...
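You can do this in place with inplace=True, or create a new column by assigning the expression above to df['new_col']. What this does is split each P_Name value on the comma, reverse the parts and join them with a space (so 'Smith, Jon J' becomes 'Jon J Smith'), then replace each resulting name in the Text column with **PHI**.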
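The same split, reverse, and join steps can also be written out row by row. The following is a minimal sketch, not the answerer's method: it assumes every P_Name has the form 'Last, First', it checks each Text only against the P_Name in its own row, and it uses plain string replacement so that no character in a name is treated as regex syntax. The helper name to_first_last is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Text': ['Jon J Smith is Here from **PHI** until **PHI**',
             'No P_Name Found here',
             'Jane Ann Doe is Also here until **PHI** ',
             '**PHI** was **PHI** Tom Tucker is Not here but **PHI** '],
    'P_Name': ['Smith, Jon J', 'Rider, Mary', 'Doe, Jane Ann', 'Tucker, Tom'],
})

def to_first_last(p_name):
    """Hypothetical helper: turn 'Last, First Middle' into 'First Middle Last'."""
    last, _, first = p_name.partition(',')
    return f"{first.strip()} {last.strip()}".strip()

# Replace the row's own name (if present) with the **PHI** marker.
# str.replace is a literal replacement, so no regex escaping is needed.
df['new_col'] = [
    text.replace(to_first_last(name), '**PHI**')
    for text, name in zip(df['Text'], df['P_Name'])
]

print(df['new_col'].tolist())
```

Rows whose name does not occur in Text (such as 'Rider, Mary' here) are left unchanged. To overwrite the original column instead, assign the list to df['Text'].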