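I have a dataframe like the one below.
For each row, I am trying to remove from the Main column the string that appears in the substring column.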
Main                      substring
Sri playnig well cricket  cricket
sri went out              NaN
Ram is in                 NaN
Ram went to UK,US         UK,US
My expected result is:
Main              substring
Sri playnig well  cricket
sri went out      NaN
Ram is in         NaN
Ram went to       UK,US
I tried df["Main"].str.reduce(df["substring"])
but it did not work. Please help.
Answer 0 (score: 2)
This one-liner should do it:
df.loc[df['substring'].notnull(), 'Main'] = df.loc[df['substring'].notnull()].apply(lambda x: x['Main'].replace(x['substring'], ''), axis=1)
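For reference, here is a self-contained sketch of the same idea, using the data from the question. The helper name `remove_substring` and the `mask` variable are made up for illustration. Unlike the one-liner above, this version also calls `.strip()`, because `str.replace` leaves a trailing space behind once the substring is removed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Main': ['Sri playnig well cricket', 'sri went out',
             'Ram is in', 'Ram went to UK,US'],
    'substring': ['cricket', np.nan, np.nan, 'UK,US'],
})

def remove_substring(row):
    # Hypothetical helper: delete the substring, then trim leftover whitespace.
    return row['Main'].replace(row['substring'], '').strip()

# Only touch rows that actually have a substring; NaN rows are left alone.
mask = df['substring'].notna()
df.loc[mask, 'Main'] = df.loc[mask].apply(remove_substring, axis=1)

print(df)
```

Computing the mask once avoids evaluating `notnull()` twice and keeps the assignment and the selection aligned on the same rows.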
Answer 1 (score: 1)
Here is one way using pd.DataFrame.apply. Note that np.nan == np.nan evaluates to False; we can use this trick inside the function to decide when to apply the removal logic.
import numpy as np
import pandas as pd

df = pd.DataFrame({'Main': ['Sri playnig well cricket', 'sri went out',
                            'Ram is in', 'Ram went to UK,US'],
                   'substring': ['cricket', np.nan, np.nan, 'UK,US']})

def remover(row):
    sub = row['substring']
    # NaN is not equal to itself, so this is True only for a missing substring
    if sub != sub:
        return row['Main']
    else:
        # drop every whitespace-separated token equal to the substring
        lst = row['Main'].split()
        return ' '.join([i for i in lst if i != sub])

df['Main'] = df.apply(remover, axis=1)
print(df)
               Main  substring
0  Sri playnig well    cricket
1      sri went out        NaN
2         Ram is in        NaN
3       Ram went to      UK,US
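The `sub != sub` check in `remover` works because NaN compares unequal to itself. A more explicit alternative, offered here only as a sketch, is `pd.isna`. The function name `remove_token` is hypothetical, and the list comprehension over `zip` is just one possible substitute for `apply(axis=1)`.

```python
import numpy as np
import pandas as pd

# NaN never equals itself, which is what the `sub != sub` check relies on.
print(np.nan == np.nan)  # False
print(pd.isna(np.nan))   # True

def remove_token(main, sub):
    # Hypothetical helper: leave the text untouched when there is no substring.
    if pd.isna(sub):
        return main
    # Drop whitespace-separated tokens that equal the substring exactly.
    return ' '.join(word for word in main.split() if word != sub)

df = pd.DataFrame({
    'Main': ['Sri playnig well cricket', 'sri went out',
             'Ram is in', 'Ram went to UK,US'],
    'substring': ['cricket', np.nan, np.nan, 'UK,US'],
})

df['Main'] = [remove_token(m, s) for m, s in zip(df['Main'], df['substring'])]
print(df)
```

Like `remover`, this only removes tokens that match the substring exactly after splitting on whitespace, so a substring containing spaces would not be removed.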