Python - replacing redundant re.sub code with a function

Time: 2014-08-07 23:11:28

Tags: python regex function

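I have a Python script that needs to replace strings using regexes (re.sub), and this works exactly as expected.

Because the operation is needed in several parts of the script, the same piece of code is repeated many times, and I would like to simplify it with a simple function.

Here are the re.sub operations: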

# Replace blank char followed by %
line=re.sub(r'\s\%','_PCT',line)

# Replace % at beginning of a word
line=re.sub(r'(?<=[a-zA-Z0-9\,])%(?=[a-zA-Z0-9]+|$)','PCT',line)

# Replace any other % 
line=re.sub(r'\%','_PCT',line)

# Replace blank space between 2 groups of chars
line=re.sub(r'(?<=[a-zA-Z0-9]) (?=[a-zA-Z0-9]+|$)','_',line)

# Replace +
line=re.sub(r'\+','',line)

# Replace "(" by "_"
line=re.sub(r'\(','_',line)

# Replace ")" by nothing
line=re.sub(r'\)','',line)

# Replace =0 by nothing
line=re.sub(r'\=0','',line)

The line variable comes from data, which is read from stdin:

# Read nmon data from stdin
data = sys.stdin.readlines()

for line in data:

...

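I tried creating a simple function with no arguments so that I could call it later wherever needed, but that does not work because the variable is only valid within the function's scope.

Is there a simple way to replace the redundant code with a function?

If it helps, the complete code can be seen here: http://pastebin.com/D9gb1B4V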
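One plausible way such a no-argument attempt fails is sketched below. This is an assumption, since the original attempt is not shown; the function name strip_plus_no_args and the sample string are made up for illustration. Assigning to line inside a function makes line local to that function, so reading it on the right-hand side raises UnboundLocalError, and the module-level line is left untouched.

```python
import re

line = "a+b"  # hypothetical sample input, not from the original script

def strip_plus_no_args():
    # The assignment makes `line` a local name for the whole function body,
    # so the read on the right-hand side happens before the local is bound.
    line = re.sub(r'\+', '', line)

try:
    strip_plus_no_args()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)  # UnboundLocalError
print(line)     # a+b (the module-level string is unchanged)
```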

Edit: here is the code I tried:

OK, thanks for the comments. I tried this, with no luck:

def subreplace(line):

    # Replace blank char followed by %
    line=re.sub(r'\s\%','_PCT',line)

    # Replace % at beginning of a word
    line=re.sub(r'(?<=[a-zA-Z0-9\,])%(?=[a-zA-Z0-9]+|$)','PCT',line)

    # Replace any other % 
    line=re.sub(r'\%','_PCT',line)

    # Replace blank space between 2 groups of chars
    line=re.sub(r'(?<=[a-zA-Z0-9]) (?=[a-zA-Z0-9]+|$)','_',line)

    # Replace +
    line=re.sub(r'\+','',line)

    # Replace "(" by "_"
    line=re.sub(r'\(','_',line)

    # Replace ")" by nothing
    line=re.sub(r'\)','',line)

    # Replace =0 by nothing
    line=re.sub(r'\=0','',line)

    return line

Then:

            # Replace trouble strings
            subreplace(line)

at the point where I would normally run that code.

1 Answer:

Answer 0 (score: 0)

Thanks to the comments, and to the help given my limited experience with Python, the answer is:

import re

def subreplace(line):

    # Replace blank char followed by %
    line=re.sub(r'\s\%','_PCT',line)

    # Replace % at beginning of a word
    line=re.sub(r'(?<=[a-zA-Z0-9\,])%(?=[a-zA-Z0-9]+|$)','PCT',line)

    # Replace any other %
    line=re.sub(r'\%','_PCT',line)

    # Replace blank space between 2 groups of chars
    line=re.sub(r'(?<=[a-zA-Z0-9]) (?=[a-zA-Z0-9]+|$)','_',line)

    # Replace +
    line=re.sub(r'\+','',line)

    # Replace "(" by "_"
    line=re.sub(r'\(','_',line)

    # Replace ")" by nothing
    line=re.sub(r'\)','',line)

    # Replace =0 by nothing
    line=re.sub(r'\=0','',line)

    # Strings are immutable, so the caller must use this return value
    return line

And call the function:

                # Replace trouble strings
                line = subreplace(line)
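
The assignment matters because Python strings are immutable: re.sub returns a new string, and rebinding line inside the function does not change the caller's variable. A minimal sketch of this follows. It applies the same eight substitutions in the same order, but keeps them in a list of precompiled patterns; the name RULES, the list layout, and the sample inputs are illustrative assumptions and are not part of the original script.

```python
import re

# The eight substitutions from the answer, in the same order, as
# (compiled pattern, replacement) pairs. RULES is a hypothetical name.
RULES = [
    (re.compile(r'\s%'), '_PCT'),                                   # blank char followed by %
    (re.compile(r'(?<=[a-zA-Z0-9,])%(?=[a-zA-Z0-9]+|$)'), 'PCT'),   # % right after a word char or comma
    (re.compile(r'%'), '_PCT'),                                     # any other %
    (re.compile(r'(?<=[a-zA-Z0-9]) (?=[a-zA-Z0-9]+|$)'), '_'),      # blank between 2 groups of chars
    (re.compile(r'\+'), ''),                                        # drop +
    (re.compile(r'\('), '_'),                                       # "(" becomes "_"
    (re.compile(r'\)'), ''),                                        # drop ")"
    (re.compile(r'=0'), ''),                                        # drop "=0"
]

def subreplace(line):
    """Apply every rule in order and return the resulting new string."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line

line = "Disk %Busy"          # made-up sample input
subreplace(line)             # return value discarded
after_bare_call = line       # still "Disk %Busy"
line = subreplace(line)      # rebinding in the caller keeps the result
print(after_bare_call)       # Disk %Busy
print(line)                  # Disk_PCTBusy
```

Keeping the rules in one list means a new substitution is a one-line change. Precompiling is optional here, since the re module also caches recently used patterns.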