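Replacing a variable-length pattern with RegEx in Python

Time: 2015-09-17 05:39:45

Tags: python regex

I am looking for a very specific RegEx in Python (or another solution with comparable performance) to replace a pattern, as in the examples below: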

...-1AG.,., should be transformed as ...G.,.,
..,-1A,.,., should be transformed as ..,,.,.,
...-2GTC,., should be transformed as ...C,.,
..,-2GT.,., should be transformed as ..,.,.,
...+3TAGT,, should be transformed as ...T,,
..,+3TAG.,. should be transformed as ..,.,.

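Basically:

AnySymbol (not only dots and commas), followed by a +/- sign, followed by a digit (1..9), followed by several letters whose number is given by the preceding digit, and finally AnySymbol (not only dots and commas),

should be transformed into:

AnySymbol (not only dots and commas) followed by AnySymbol (not only dots and commas).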

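Obviously the solution String = re.sub(r'[\-\+]\d\w+', "", String) is wrong, because \w+ also swallows letters that must be kept, as in the case (...-1AG.,., should be transformed as ...G.,.,). So far I am looping over r'[\-\+]1\w', r'[\-\+]2\w\w', r'[\-\+]3\w\w\w' ... r'[\-\+]9\w\w\w\w\w\w\w\w\w', but I am hoping for a more elegant solution. Any ideas?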
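The difference between the naive substitution and the per-digit loop described above can be sketched roughly as follows. This is only an illustration: the helper names naive_strip and loop_strip are hypothetical and do not appear in the original post, and \w{n} is used as shorthand for n repeated \w.

```python
import re

def naive_strip(s):
    # Greedy \w+ consumes every word character after the digit,
    # including letters that should have been kept.
    return re.sub(r'[-+]\d\w+', '', s)

def loop_strip(s):
    # One pass per digit 1..9: remove the sign, the digit n,
    # and exactly n word characters after it.
    for n in range(1, 10):
        s = re.sub(r'[-+]%d\w{%d}' % (n, n), '', s)
    return s

print(naive_strip('...-1AG.,.,'))  # the G is lost
print(loop_strip('...-1AG.,.,'))   # the G is kept
```

The loop gives the desired result but scans the string nine times, which is the inelegance the question is trying to avoid.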

1 answer:

Answer 0: (score: 3)

Take a look at this working demo.

x="""...-1AG.,., should be transformed as ...G.,.,
..,-1A,.,., should be transformed as ..,,.,.,
...-2GTC,., should be transformed as ...C,.,
..,-2GT.,., should be transformed as ..,.,.,
...+3TAGT,, should be transformed as ...T,,
..,+3TAG.,. should be transformed as ..,.,."""

import re

def repl(matchobj):
    # group(1) is the count, group(2) the letters; drop the first `count` letters
    return matchobj.group(2)[int(matchobj.group(1)):]

print(re.sub(r"[+-](\d+)([a-zA-Z]+)", repl, x))

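You can pass your own function to re.sub to perform a customized replacement: the function receives each match object and returns the string to substitute for it.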
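As a quick check of the callback approach, the sketch below runs it against the six examples from the question. The names PATTERN, _keep_tail and remove_counted_runs are hypothetical and chosen only for this illustration; the regular expression itself is the one from the answer.

```python
import re

# A sign, a run of digits (the count), then a run of letters.
PATTERN = re.compile(r"[+-](\d+)([a-zA-Z]+)")

def _keep_tail(match):
    count = int(match.group(1))
    # Drop the first `count` letters; any letters after them are kept.
    return match.group(2)[count:]

def remove_counted_runs(s):
    # A single pass over the string, with the callback deciding each replacement.
    return PATTERN.sub(_keep_tail, s)

examples = [
    ("...-1AG.,.,", "...G.,.,"),
    ("..,-1A,.,.,", "..,,.,.,"),
    ("...-2GTC,.,", "...C,.,"),
    ("..,-2GT.,.,", "..,.,.,"),
    ("...+3TAGT,,", "...T,,"),
    ("..,+3TAG.,.", "..,.,."),
]
for source, expected in examples:
    assert remove_counted_runs(source) == expected
```

Note that if fewer letters follow than the count says, the slice simply returns an empty string, so all of the letters are removed without raising an error.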