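I have a column containing many different strings, and all I want to do is rename every string I specify to a single string, so that they all share the same value. My data frame looks like this: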
My_strings
1 I bumped my knee because I fell
2 I fell off my bike but I had a helmet
3 I am alright I just need to be alert
4 If I fall I will get back up
In my column My_strings, say I want to find the sentences that contain specific words:
df.loc[df.T_L_DESC.str.contains("fell|fall|fallen", na=False), 'Slippery'] = df.T_L_DESC
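The specific words I am looking for are "fell", "fall", and "fallen". Once these words are found in the sentences in my column, those sentences are split out into another column named "Slip_Fall".
I want to rename all strings that contain these words to one specific string. One thing to note: when I run the code above, every sentence that does not contain the specified words becomes NaN. So my final data frame would look like this: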
My_strings Slippery
1 I bumped my knee because I fell Life_Lessons
2 I fell off my bike but I had a helmet Life_Lessons
3 NaN NaN
4 If I fall I will get back up Life_Lessons
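So obviously I do not want to change the NaN values in my data frame to Life_Lessons; I only want to change the sentences that contain my keywords to Life_Lessons.
Thanks in advance.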
Answer 0 (score: 2)
A simple solution:
In [191]: df.loc[df.T_L_DESC.str.contains("fell|fall|fallen", na=False), 'Slippery'] = 'Life_Lessons'
In [192]: df
Out[192]:
T_L_DESC Slippery
0 I bumped my knee because I fell Life_Lessons
1 I fell off my bike but I had a helmet Life_Lessons
2 I am alright I just need to be alert NaN
3 If I fall I will get back up Life_Lessons
In [193]: df.loc[df.Slippery!='Life_Lessons', 'T_L_DESC'] = np.nan
In [194]: df
Out[194]:
T_L_DESC Slippery
0 I bumped my knee because I fell Life_Lessons
1 I fell off my bike but I had a helmet Life_Lessons
2 NaN NaN
3 If I fall I will get back up Life_Lessons
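For anyone who wants to reproduce the session above outside IPython, here is a minimal self-contained sketch. It assumes the column is named T_L_DESC as in the answer, rebuilds the four sample sentences by hand, and stores the match result in a variable called mask, which is a name introduced here for illustration and does not appear in the original code.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed column name: T_L_DESC).
df = pd.DataFrame({
    "T_L_DESC": [
        "I bumped my knee because I fell",
        "I fell off my bike but I had a helmet",
        "I am alright I just need to be alert",
        "If I fall I will get back up",
    ]
})

# Boolean mask: True where the sentence contains any of the keywords.
# na=False makes missing sentences count as non-matches instead of NaN.
mask = df["T_L_DESC"].str.contains("fell|fall|fallen", na=False)

# Label only the matching rows; non-matching rows stay NaN in the new column.
df.loc[mask, "Slippery"] = "Life_Lessons"

# Optional second step from the answer: blank out the non-matching sentences.
df.loc[~mask, "T_L_DESC"] = np.nan

print(df)
```

Reusing the mask avoids comparing against 'Life_Lessons' a second time, but the result should match the two-step version in the answer. Note that "fallen" is redundant in the pattern, since any string containing "fallen" also contains "fall".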