Question

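I'm sure this is easier for an experienced programmer than it is for me, but this problem has been bugging me and I've made several failed attempts, so I'd like to see what others can come up with.

I have about a hundred strings that look like this: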

(argument1 OR argument2) | inputlookup my_lookup.csv | `macro1(tag,bunit)` | `macro2(category)` | `macro_3(tag,\"expected\",category)` | `macro4(tag,\"timesync\")`

The goal is to find the arguments of each macro and replace them with the argument count, so that the final output looks like this:

(argument1 OR argument2) | inputlookup my_lookup.csv | `macro1(2)` | `macro2(1)` | `macro_3(3)` | `macro4(2)`

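Python has ways to get the count I need (I'm just counting the commas in the string and adding 1), and Python has plenty of regex solutions for in-place string replacement, but for the life of me I can't figure out how to combine the two.

It seems that something like re.sub won't let me identify a substring, count the commas in that substring, and then replace the substring with that value (unless I'm missing something in the documentation).

Can anyone think of a way to do this? Am I missing something obvious?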

Answer 1

Solution:

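import re

def count_commas(input_str):
    # Count the commas in the matched argument list.
    return input_str.count(',')

# Matches a parenthesised run of letters, digits, commas and double quotes,
# e.g. (tag,bunit). Spaces are not allowed, so (argument1 OR argument2) is skipped.
pattern = r'\([A-Za-z0-9,""]+\)'
original_str = '(argument1 OR argument2) | inputlookup my_lookup.csv | `macro1(tag,bunit)` | `macro2(category)` | `macro_3(tag,\"expected\",category)` | `macro4(tag,\"timesync\")`'

matches = re.findall(pattern, original_str)

for match in matches:
    # Number of arguments = number of commas + 1.
    comma_count = count_commas(match) + 1
    # Escape the parentheses so the match can be used as a literal regex.
    match = match.replace('(', '\\(').replace(')', '\\)')
    original_str = re.sub(match, '(' + str(comma_count) + ')', original_str)

print(original_str)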

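Explanation:

pattern: "\([A-Za-z0-9,""]+\)" - the backslashes escape the special characters '(' and ')' in the regex. Inside the square brackets I look for alphanumerics, commas and double quotes, and the '+' that follows means the characters in the square brackets repeat one or more times.

matches: the list of all matches found, for example (tag,bunit).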
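To make the pattern's behaviour concrete, here is a small sketch of what re.findall returns for the sample string from the question. The variable names `sample` and `found` are only illustrative.

```python
import re

pattern = r'\([A-Za-z0-9,""]+\)'
sample = '(argument1 OR argument2) | inputlookup my_lookup.csv | `macro1(tag,bunit)` | `macro2(category)` | `macro_3(tag,"expected",category)` | `macro4(tag,"timesync")`'

# The first parenthesised group contains spaces, which the character class
# does not allow, so it is not matched.
found = re.findall(pattern, sample)
for item in found:
    print(item)
# (tag,bunit)
# (category)
# (tag,"expected",category)
# (tag,"timesync")
```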

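Then I loop over all the matches to count the commas in each one, and replace '(' with '\(' and ')' with '\)' so that they are escaped in the regex. Finally, on the last line of the loop, I use re.sub to replace the matched string in the original string with the argument count (the comma count plus one).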
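The same result can also be reached in a single pass, because re.sub accepts a function as its replacement argument: the function is called with each match object and whatever it returns is substituted in. The following is a minimal sketch, not the approach used in the code above. It assumes every macro is wrapped in backticks and that no argument contains a closing parenthesis or a comma inside quotes. The names `count_args` and `macro_re` and the backtick-anchored pattern are hypothetical, introduced only for illustration.

```python
import re

def count_args(m):
    # m.group(1) is the backtick, macro name and "(", e.g. "`macro1(".
    # m.group(2) is the raw argument list, e.g. "tag,bunit".
    args = m.group(2)
    # Assumption: an empty argument list counts as zero arguments.
    n_args = args.count(',') + 1 if args.strip() else 0
    return m.group(1) + str(n_args) + ')'

# Hypothetical pattern: backtick, macro name, "(", anything except ")", then ")".
macro_re = re.compile(r'(`\w+\()([^)]*)\)')

s = '(argument1 OR argument2) | inputlookup my_lookup.csv | `macro1(tag,bunit)` | `macro2(category)` | `macro_3(tag,"expected",category)` | `macro4(tag,"timesync")`'
result = macro_re.sub(count_args, s)
print(result)
# (argument1 OR argument2) | inputlookup my_lookup.csv | `macro1(2)` | `macro2(1)` | `macro_3(3)` | `macro4(2)`
```

Anchoring on the backtick keeps the leading (argument1 OR argument2) group untouched even if it happened to contain no spaces, and the callback avoids re-scanning the string once per match.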
