Removing some strings from a DataFrame

Time: 2018-12-06 09:33:51

Tags: python pandas dataframe

I am trying to remove certain strings from a dataframe: everything in a cell starting at System:

My dataframe:

       A          B                                                                C
  French      house   Blablabla System:Microsoft Windows XP; Browser:Chrome 32.0.1700;
 English      house               my address: 101-102 bd Charles de Gaulle 75001 Paris
  French  apartment                                                    my name is Liam
  French      house                                                       Hello George!
 English  apartment              System:Microsoft Windows XP; Browser:Chrome 32.0.1700;

I tried:

def remove_lines():

    df['C'] = df['C'].str.replace(r'(\s+)(System:).+','')

    return df

Nothing happened...

Expected output:

       A          B                                                                C
  French      house                                                          Blablabla 
 English      house               my address: 101-102 bd Charles de Gaulle 75001 Paris
  French  apartment                                                    my name is Liam
  French      house                                                       Hello George!
 English  apartment              

2 answers:

Answer 0 (score: 3)

Use:

None
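One way to get the expected output from the question, offered as a minimal sketch and not necessarily what this answer originally showed, is `Series.str.replace` with a regular expression. Two assumptions are worth stating. First, the question's pattern `(\s+)(System:).+` requires at least one whitespace character before `System:`, so a cell that begins with `System:` (the last row) would not match; `\s*` is used here instead. Second, in pandas 2.0 and later `str.replace` treats the pattern as a literal string unless `regex=True` is passed. The helper name `remove_system_info` is hypothetical, and the frame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["French", "English", "French", "French", "English"],
    "B": ["house", "house", "apartment", "house", "apartment"],
    "C": [
        "Blablabla System:Microsoft Windows XP; Browser:Chrome 32.0.1700;",
        "my address: 101-102 bd Charles de Gaulle 75001 Paris",
        "my name is Liam",
        "Hello George!",
        "System:Microsoft Windows XP; Browser:Chrome 32.0.1700;",
    ],
})


def remove_system_info(frame):
    """Hypothetical helper: drop 'System:' and everything after it in column C."""
    out = frame.copy()
    # \s* also swallows the whitespace in front of 'System:' and still
    # matches when the cell starts with 'System:'.
    out["C"] = out["C"].str.replace(r"\s*System:.*", "", regex=True)
    return out


cleaned = remove_system_info(df)
print(cleaned)
```

The function takes the frame as an argument and returns a modified copy, so it has to be called and its result assigned for anything to change.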

Answer 1 (score: 1)

You can simply use the split function on 'System' and then take the first part, like this:

In [1936]: df.C = pd.DataFrame(df.C.str.split('System').tolist())[0]
In [1937]: df
Out[1937]: 
         A          B                                                  C
0   French      house                                         Blablabla
1  English      house  my address: 101-102 bd Charles de Gaulle 75001...
2   French  apartment                                    my name is Liam
3   French      house                                      Hello George!
4  English  apartment
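The same split idea can also be written with the `.str` accessor alone, without building an intermediate DataFrame. This is a sketch under a few assumptions: it splits on `"System:"` (with the colon) rather than `'System'`, limits the split to the first occurrence with `n=1`, and adds `str.rstrip()` to remove the trailing space left after `Blablabla`. The sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["French", "English", "French", "French", "English"],
    "B": ["house", "house", "apartment", "house", "apartment"],
    "C": [
        "Blablabla System:Microsoft Windows XP; Browser:Chrome 32.0.1700;",
        "my address: 101-102 bd Charles de Gaulle 75001 Paris",
        "my name is Liam",
        "Hello George!",
        "System:Microsoft Windows XP; Browser:Chrome 32.0.1700;",
    ],
})

# Split each string once at 'System:', keep the part before it,
# then strip the whitespace left at the end of that part.
df["C"] = df["C"].str.split("System:", n=1).str[0].str.rstrip()
print(df)
```

Rows that do not contain `System:` are returned unchanged, because the split yields a single-element list whose first item is the whole string.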