Remove words from strings in a DataFrame column

Time: 2020-05-20 20:57:21

Tags: python string pandas dataframe

I have a DataFrame like this:

Num           Text 
1        15 March 2020 - There was...
2        15 March 2020 - There has been...
3        24 April 2018 - Nothing has ...
4        07 November 2014 - The Kooks....
...

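I want to remove the first 4 words from each row of Text (i.e. 15 March 2020 - , 15 March 2020 -, ...). I tried

df['Text']=df['Text'].str.replace(' ', ), but I don't know what I should put inside the parentheses to replace these values with a blank (or with nothing).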

5 answers:

Answer 0: (score: 0)

You can use str.split.

Consider your df as:

In [1193]: df = pd.DataFrame({'Num':[1,2,3,4], 'Text':['15 March 2020 - There was','15 March 2020 - There has been','24 April 2018 - Nothing has','07 November 2014 - The Kooks']})

In [1194]: df
Out[1194]: 
   Num                            Text
0    1       15 March 2020 - There was
1    2  15 March 2020 - There has been
2    3     24 April 2018 - Nothing has
3    4    07 November 2014 - The Kooks

In [1207]: df['Text'].str.split().str[4:].apply(' '.join)
Out[1207]: 
0         There was
1    There has been
2       Nothing has
3         The Kooks
Name: Text, dtype: object

Answer 1: (score: 0)

One approach that may work is to split the string into words with split and then use [4:] to keep everything after the fourth word.
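
The split-then-slice idea can be sketched in plain Python as follows. The helper name drop_first_words and the sample rows are made up for illustration; the sketch assumes words are separated by whitespace and that the words kept can be rejoined with single spaces.

```python
def drop_first_words(text, n=4):
    """Return text without its first n whitespace-separated words."""
    words = text.split()          # split on any run of whitespace
    return " ".join(words[n:])    # keep everything after the first n words


rows = [
    "15 March 2020 - There was",
    "24 April 2018 - Nothing has",
]
cleaned = [drop_first_words(row) for row in rows]
print(cleaned)  # ['There was', 'Nothing has']
```

With pandas, the same helper could be passed to df['Text'].apply(drop_first_words).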

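Answer 2: (score: 0)

Python supports regular expressions. For the first four words, an example could be str.replace(r"^\S+ \S+ \S+ \S+ ", '', regex=True); note that \d only matches digits, so it would not match the month name or the dash. Here is a link to learn more about Python regular expressions and how to detect different patterns in strings.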
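
A minimal sketch of the regex route on a pandas column, assuming each value starts with four whitespace-separated tokens (day, month, year, dash). The variable name first_four_words and the two sample rows are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Num": [1, 2],
    "Text": ["15 March 2020 - There was", "07 November 2014 - The Kooks"],
})

# ^              anchor at the start of the string
# (?:\S+\s+){4}  four runs of non-whitespace, each followed by whitespace
first_four_words = r"^(?:\S+\s+){4}"

# regex=True is passed explicitly so the pattern is not treated as a literal
df["Text"] = df["Text"].str.replace(first_four_words, "", regex=True)
print(df["Text"].tolist())  # ['There was', 'The Kooks']
```

Rows with fewer than four leading words do not match the pattern and are left unchanged.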

Answer 3: (score: 0)

You can use df.str.split together with df.str.slice.

df['Text'].str.split(n=4).str[-1]
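
To see why n=4 works, it may help to look at the intermediate list. The sketch below uses a made-up two-row Series; the second row is deliberately too short, to show a caveat of taking the last piece.

```python
import pandas as pd

text = pd.Series(["15 March 2020 - There has been", "too short"])

# n=4 allows at most 4 splits, so each row yields at most 5 pieces;
# the last piece holds the untouched remainder of the string
parts = text.str.split(n=4)
print(parts.iloc[0])  # ['15', 'March', '2020', '-', 'There has been']

# .str[-1] picks the last piece of every row
rest = parts.str[-1]
print(rest.tolist())  # ['There has been', 'short']
```

Because .str[-1] always returns the last piece, a row with fewer than five words keeps its final word instead of becoming empty.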

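Answer 4: (score: 0)

Even though it is less elegant, I prefer combining ".find()" with ".apply()". Whatever comes before it, the first "- " that ".find" locates is used as the separator.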

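For example:

t = pd.DataFrame({'Num':[1,2,3,4], 'Text':['15 March 2020 - There was','15 March 2020 - There has been','24 April 2018 - Nothing has','07 November 2014 - The Kooks']})

t["text2"] = t.apply(lambda x: x['Text'][str(x['Text']).find("- ")+2:], axis=1)

Given a DataFrame like the one below, this stores everything after the first "- " of each Text value in a new text2 column: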

Num           Text 
1        15 March 2020 - There was...
2        15 March 2020 - There has been...
3        24 April 2018 - Nothing has ...
4        07 November 2014 - The Kooks....
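
One thing to watch with the .find() approach: str.find returns -1 when the separator is missing, so adding 2 would slice from index 1 and silently drop the first character. A hedged variant that guards against this might look like the sketch below; the helper name after_first_dash and the second sample row are assumptions for illustration.

```python
import pandas as pd


def after_first_dash(text):
    """Return the part of text after the first "- ", or text unchanged if absent."""
    pos = text.find("- ")
    if pos == -1:            # separator missing: leave the value alone
        return text
    return text[pos + 2:]    # skip the two characters of "- "


t = pd.DataFrame({"Text": ["15 March 2020 - There was", "no delimiter here"]})
t["text2"] = t["Text"].apply(after_first_dash)
print(t["text2"].tolist())  # ['There was', 'no delimiter here']
```

Applying the function to the single column with Series.apply avoids the row-wise axis=1 call, which is usually slower on large frames.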