Question

I am trying to filter some text that has extra markup characters mixed into it. Here is a sample of the text I want to filter.

*CHI:\t<that> [/] (.) that (i)s it . [+ bch]\n

My attempt:

import re
s = '*CHI:\t<that> [/] (.) that (i)s it . [+ bch]\n'
s = re.sub('[()]','',s)
print(s)

My output is

*CHI:   <that> [/] . that is it . [+ bch]

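I want to keep (.) as it is, but strip the parentheses everywhere else, i.e. change (i) to i. I also want to keep [/] and remove [+ bch]. How can I filter one and keep the other?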

Answer 1

You can use a character class that excludes . so that (.) is never matched:

s = re.sub(r'\(([^.])\)', r'\1', s)

With this change, s becomes:

*CHI:   <that> [/] (.) that is it . [+ bch]
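One thing worth checking with this pattern: `[^.]` matches exactly one character, so it only unwraps single-character groups such as (i). A minimal sketch that shows this on the sample line and on a made-up multi-character case (the function name and the `(be)cause` input are only illustrative, they are not from the question):

```python
import re

def strip_single_char_parens(text):
    # Unwrap "(x)" where x is any single character except "."
    return re.sub(r'\(([^.])\)', r'\1', text)

sample = '*CHI:\t<that> [/] (.) that (i)s it . [+ bch]\n'

# "(i)" is unwrapped, "(.)" is left alone
print(repr(strip_single_char_parens(sample)))

# Hypothetical input: two characters inside the parentheses, so no match
print(strip_single_char_parens('(be)cause'))
```

If the data can contain longer parenthesized fragments, the lookaround approach in Answer 2 is probably the safer choice.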

Answer 2

An approach that works in both Python 2 and Python 3 is

re.sub(r'\((?!\.\))|(?<!\(\.)\)', '', s)

See the regex demo.

Details

\((?!\.\)) - a ( that is not immediately followed by .)
| - or
(?<!\(\.)\) - a ) that is not immediately preceded by (.
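To see the two lookarounds at work, here is a small sketch that applies the pattern to the sample line and to one extra made-up input. The helper name `strip_parens_keep_pause` and the `(be)cause` string are assumptions for illustration only:

```python
import re

# "(" not followed by ".)"  OR  ")" not preceded by "(."
PAREN_RE = re.compile(r'\((?!\.\))|(?<!\(\.)\)')

def strip_parens_keep_pause(text):
    # Delete every parenthesis except the two that make up "(.)"
    return PAREN_RE.sub('', text)

sample = '*CHI:\t<that> [/] (.) that (i)s it . [+ bch]\n'
print(repr(strip_parens_keep_pause(sample)))

# Hypothetical input with several characters inside the parentheses
print(strip_parens_keep_pause('(be)cause (.) ok'))
```

Because the pattern removes the parentheses themselves rather than a whole "(x)" group, it does not care how many characters sit between them.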

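Alternatively, you can add the exception as an alternative inside a capturing group and replace with a backreference (Python 3.5+) or a lambda expression (earlier versions):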

import re
s = '*CHI:\t<that> [/] (.) that (i)s it . [+ bch]\n'
s = re.sub(r'(\(\.\))|[()]', r'\1', s)
# Python earlier than 3.5
# s = re.sub(r'(\(\.\))|[()]', lambda x: x.group(1) if x.group(1) else '', s)
print(s) # => *CHI: <that> [/] (.) that is it . [+ bch]

See the Python 3.5 demo and this Python 2.x demo.
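The question also asks to drop [+ bch] while keeping [/], which the patterns above leave untouched. A rough sketch of one way to do that, under the assumption that the bracket groups to remove are exactly the ones starting with `[+` (the question only shows one such example, so this rule is a guess). The function name `clean_utterance` is hypothetical:

```python
import re

def clean_utterance(text):
    # Assumption: only bracket groups that start with "[+" should be removed.
    # Any spaces or tabs right before the group are removed with it.
    text = re.sub(r'[ \t]*\[\+[^\]]*\]', '', text)
    # Then strip parentheses, keeping "(.)" (lookaround pattern from this answer)
    text = re.sub(r'\((?!\.\))|(?<!\(\.)\)', '', text)
    return text

sample = '*CHI:\t<that> [/] (.) that (i)s it . [+ bch]\n'
print(repr(clean_utterance(sample)))
```

[/] survives because the first pattern requires a literal + right after the opening bracket.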

Remove text between () and [] in Python, with some exceptions
