Question

I have a string:

                  /* @TS 1 : This is the comment
                   * for method1
                   */
                  /* @TS 2 : This is the comment
                   * for method2
                   */

I need to map each index value (1, 2) to its content, e.g. 1 -> This is the comment for method1.

I wrote the following Python program using a regular expression:

import re

# commentLine contains the string of data

regex = r"/\*([^*]|[\r\n]|(\*([^/]|[\r\n])))*\*/"
for match in re.finditer(regex, commentLine):
    if ':' in commentLine:
        commentLine = commentLine[commentLine.index(':') + 1:]
    commentgroup = match.group()
    if '@TS' in commentgroup:
        commentgroup = commentgroup.replace('@TS', '')
        print(commentgroup)

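But here I have to replace @TS and then work out where the number is. Is there a regular expression that captures the number as one group and the content after the colon (:) as another group?

Edit: expected result: {'1': 'This is the comment for method1', '2': 'This is the comment for method2'}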

Answer 1

You can use

import re
s = "/* @TS 1 : This is the comment\n* for method1\n*/\n/* @TS 2 : This is the comment\n* for method2\n*/"
rx = r'/\*+\s*@TS\s*(\d+)\s*:([^*]*\*+(?:[^/*][^*]*\*+)*/)'
d = {}
for match in re.finditer(rx, s):
    d[match.group(1)] = re.sub(r"(?:^|[\r\n]+)\s*\*\s*", "", match.group(2)[:-2].strip())

print(d) # => {'1': 'This is the commentfor method1', '2': 'This is the commentfor method2'}

See the Python demo.

There are a few things to note here.

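Pattern details

/\*+ - a / followed by 1+ literal * characters (the opening /* of the comment)
\s* - 0+ whitespace characters
@TS - a literal substring
\s* - 0+ whitespace characters
(\d+) - Group 1: one or more digits
\s*: - 0+ whitespace characters and a :
([^*]*\*+ - (start of Group 2): 0+ characters other than *, followed by 1+ literal *
(?:[^/*][^*]*\*+)* - 0+ sequences of:
- [^/*][^*]*\*+ - a character other than / or * (matched with [^/*]), followed by 0+ non-asterisk characters ([^*]*), followed by 1+ asterisks (\*+)
/) - the closing / (end of Group 2)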

See the regex demo.
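To make the breakdown above easier to check, here is a minimal sketch of the same pattern written with re.VERBOSE, so that each part carries its own comment. The names RX and sample are only illustrative and are not part of the original answer.

```python
import re

# Same pattern as in the answer, spread over several lines with re.VERBOSE.
RX = re.compile(r"""
    /\*+                    # slash followed by one or more asterisks
    \s*@TS\s*               # the literal marker with optional whitespace around it
    (\d+)                   # group 1: the index number
    \s*:                    # optional whitespace, then the colon
    (                       # group 2: everything up to and including the closing */
        [^*]*\*+            #   non-asterisks, then one or more asterisks
        (?:[^/*][^*]*\*+)*  #   repeat while the asterisk run is not followed by a slash
        /                   #   the closing slash
    )
""", re.VERBOSE)

# Hypothetical sample input, shaped like the comments in the question.
sample = "/* @TS 1 : This is the comment\n * for method1\n */"
m = RX.search(sample)
print(repr(m.group(1)))  # the index, as a string
print(repr(m.group(2)))  # the raw comment body, still ending in */
```

Group 2 still contains the leading asterisks of the continuation lines and the trailing */, which is why the value needs the clean-up step described under the code details.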

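Code details

An empty dictionary is defined with d = {}. All matches are then found with re.finditer: match.group(1) is the key, and match.group(2) holds the value, which needs a little trimming. The last 2 characters are removed with [:-2] (they are */), the surrounding whitespace is stripped with .strip(), and finally the (?:^|[\r\n]+)\s*\*\s* pattern removes every * at the start of the string or of a line, together with the line break and whitespace around it.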
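Because the clean-up replaces each line break with an empty string, the printed values read 'This is the commentfor method1', while the expected result in the question has a space there. A possible variant, sketched below under the assumption that every continuation line starts with an asterisk, joins the lines with a single space instead. The helper name parse_ts_comments is hypothetical and is not part of the original answer.

```python
import re

# Pattern taken unchanged from the answer.
RX = r'/\*+\s*@TS\s*(\d+)\s*:([^*]*\*+(?:[^/*][^*]*\*+)*/)'

def parse_ts_comments(text):
    """Map each @TS index to its comment text, with lines joined by one space."""
    result = {}
    for match in re.finditer(RX, text):
        # Drop the trailing */ and the surrounding whitespace.
        body = match.group(2)[:-2].strip()
        # Replace each line break plus the leading asterisk of the next line
        # with a single space (assumes continuation lines begin with *).
        body = re.sub(r"\s*[\r\n]+\s*\*\s*", " ", body)
        result[match.group(1)] = body
    return result

s = ("/* @TS 1 : This is the comment\n * for method1\n */\n"
     "/* @TS 2 : This is the comment\n * for method2\n */")
print(parse_ts_comments(s))
# => {'1': 'This is the comment for method1', '2': 'This is the comment for method2'}
```

Continuation lines that do not begin with an asterisk are left as they are in this sketch, so the replacement pattern would need adjusting for other comment styles.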
