Python: rename strings in a column to a specified string

Asked: 2017-04-28 14:38:47

Tags: python pandas dataframe col

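I have a column that contains many different strings. All I want to do is rename every string I specify to a single string, so that they all share the same value. My data frame looks like this: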

          My_strings
1   I bumped my knee because I fell
2   I fell off my bike but I had a helmet
3   I am alright I just need to be alert
4   If I fall I will get back up

Say I want to find the sentences in my column My_strings that contain specific words:

df.loc[df.T_L_DESC.str.contains("fell|fall|fallen", na=False), 'Slippery'] = df.T_L_DESC

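The specific words I am looking for are forms of "fall". Once these words are found in a sentence in my column, that sentence is copied into another column called "Slip_Fall".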

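I want to rename all the strings that contain these words to one specific string. One thing to note: when I run the code above, every sentence that does not contain the specified words becomes NaN. So my final data frame should look like this: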

            My_strings                           Slippery
1   I bumped my knee because I fell            Life_Lessons   
2   I fell off my bike but I had a helmet      Life_Lessons
3   NaN                                        NaN 
4   If I fall I will get back up               Life_Lessons

So I do not want to change the NaN values in my data frame to Life_Lessons. I only want to change the sentences that contain my keywords to Life_Lessons.

Thanks in advance.

1 Answer:

Answer 0: (score: 2)

A simple solution:

In [191]: df.loc[df.T_L_DESC.str.contains("fell|fall|fallen", na=False), 'Slippery'] = 'Life_Lessons'

In [192]: df
Out[192]:
                                T_L_DESC      Slippery
0        I bumped my knee because I fell  Life_Lessons
1  I fell off my bike but I had a helmet  Life_Lessons
2   I am alright I just need to be alert           NaN
3           If I fall I will get back up  Life_Lessons

In [193]: df.loc[df.Slippery!='Life_Lessons', 'T_L_DESC'] = np.nan

In [194]: df
Out[194]:
                                T_L_DESC      Slippery
0        I bumped my knee because I fell  Life_Lessons
1  I fell off my bike but I had a helmet  Life_Lessons
2                                    NaN           NaN
3           If I fall I will get back up  Life_Lessons
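
For anyone who wants to run the steps above outside an IPython session, here is a minimal self-contained sketch. It assumes the sample sentences from the question and the column names used in the transcript (T_L_DESC and Slippery). The variable name mask is introduced here only for illustration and does not appear in the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the column name follows the transcript.
df = pd.DataFrame({
    "T_L_DESC": [
        "I bumped my knee because I fell",
        "I fell off my bike but I had a helmet",
        "I am alright I just need to be alert",
        "If I fall I will get back up",
    ]
})

# Boolean mask (hypothetical helper name): True where the sentence contains
# one of the keywords. na=False treats missing values as non-matches.
mask = df["T_L_DESC"].str.contains("fell|fall|fallen", na=False)

# Label only the matching rows; the new column stays NaN everywhere else.
df.loc[mask, "Slippery"] = "Life_Lessons"

# Blank out the sentences that did not match, as in step In [193].
df.loc[~mask, "T_L_DESC"] = np.nan

print(df)
```

Note that str.contains treats the pattern as a regular expression and matches substrings, so "fallen" is already covered by "fall". If only whole words should match, a pattern such as r"\b(?:fell|fall|fallen)\b" is one possible alternative.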