How can I use a regular expression to remove the tags that appear on either side of a word in a string?

Asked: 2019-02-10 22:12:53

Tags: python regex

strng = 'I have to get to the <Color:D010644> maroon <Color:D010644> building before noon'

How can I use a regular expression to turn this string into

strngNew = 'I have to get to the maroon building before noon'

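What makes this tricky is that the value inside the tag changes from line to line, and its length changes too. In the example above the value is "D010644", but on another line it might be "JJD93JD93J999333".

So I need a regex operation that is general enough to cover all of these variations.

The tag name, however ("Color" in the example above), stays the same.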

2 Answers:

Answer 0 (score: 2)

You can filter the tags out with a regular expression:

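import re

text = 'I have to get to the <Color:D010644> maroon <Color:D010644> building before noon'

# Remove each <Color:...> tag together with the single space that follows it.
# \w+ matches the variable-length value (letters, digits, underscore).
result = re.sub(r'<Color:\w+> ', '', text)

print(result)  # I have to get to the maroon building before noon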
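Note that the pattern above only matches a tag that is followed by a space, so a tag at the very end of the string would be left in place. A minimal sketch of a more tolerant variant follows, assuming the tag value never contains `>`; the names `TAG_PATTERN` and `strip_color_tags` are hypothetical and not part of the original answer.

```python
import re

# Matches a <Color:...> tag plus any whitespace that follows it.
# [^>]+ accepts any run of characters up to the closing bracket
# (assumption: the tag value itself never contains '>').
TAG_PATTERN = re.compile(r'<Color:[^>]+>\s*')


def strip_color_tags(text):
    """Remove every <Color:...> tag and trim leftover whitespace at the edges."""
    return TAG_PATTERN.sub('', text).strip()


print(strip_color_tags('the <Color:D010644> maroon <Color:D010644> building'))
print(strip_color_tags('paint it <Color:JJD93JD93J999333> red <Color:JJD93JD93J999333>'))
```

Because the trailing whitespace is optional (`\s*`), the same pattern covers tags in the middle of a line and tags at the end of it.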

Answer 1 (score: 1)

You can also do this without regular expressions, using only built-in string methods:

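# Named `text` rather than `str` to avoid shadowing the built-in type.
text = 'I have to get to the <Color:D010644> maroon <Color:D010644> building before noon'
# Split on spaces, drop every token that starts with the tag prefix, and rejoin.
new_string = ' '.join([elem for elem in text.split(' ') if not elem.startswith('<Color')])
print(new_string)  # I have to get to the maroon building before noon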
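The token-filtering approach relies on every tag being separated from its neighbours by a single space. As a rough sketch under that same assumption, the variant below splits on any run of whitespace and only drops a token when the whole token is a tag; the function name `drop_color_tokens` is hypothetical.

```python
def drop_color_tokens(text):
    """Drop whitespace-separated tokens that are entirely a <Color:...> tag."""
    kept = [
        token
        for token in text.split()  # no argument: split on any run of whitespace
        if not (token.startswith('<Color:') and token.endswith('>'))
    ]
    return ' '.join(kept)


print(drop_color_tokens(
    'I have to get to the <Color:D010644> maroon <Color:D010644> building before noon'
))
```

A side effect worth knowing: because the tokens are rejoined with single spaces, any tabs or repeated spaces in the input are collapsed. If the original spacing matters, the regex approach is the safer choice.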