Question

I suspect this simply doesn't exist. But I know some features exist in other regex engines, and I'm hoping one of them might be close to this.

pattern = r"""
    ([a-zA-Z])    # Match a single letter and capture it as group1
    .*?           # random matches in between
    \1            # Match whatever capture group1 matched
"""
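For reference, a minimal sketch of how I run the pattern above. The `re.VERBOSE` flag is what lets the whitespace and `#` comments sit inside the pattern; the sample strings are only illustrations.

```python
import re

pattern = r"""
    ([a-zA-Z])    # Match a single letter and capture it as group1
    .*?           # anything in between, as few characters as possible
    \1            # Match whatever capture group1 matched
"""

# re.VERBOSE makes the engine ignore the layout whitespace and the comments.
regex = re.compile(pattern, re.VERBOSE)

for text in ["AA", "bb", "b--b", "ab"]:
    print(text, bool(regex.search(text)))
```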

This matches AA, bb, and so on. So far so good in Python. Now, some languages (idk whether the Python regex engine supports it) allow

pattern = r"""
    ([a-zA-Z])    # Match a single letter and capture it as group1
    .*?           # random matches in between
    \U1           # Match group1 in upper case
"""
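As a quick check of the Python side, the sketch below just asks `re` whether each pattern compiles. In current Python versions an unknown escape made of a backslash and an ASCII letter is rejected in patterns, so the `\U1` form is expected to fail there. The helper name `compiles` is made up for this example.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern, re.VERBOSE)
        return True
    except re.error:
        return False

plain_backref = r"([a-zA-Z]) .*? \1"
upper_backref = r"([a-zA-Z]) .*? \U1"

print(compiles(plain_backref))
print(compiles(upper_backref))
```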

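There are a few similar "features" that let you manipulate the previous capture group a little, but from what I read on some regex website they are very limited.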

Now my question is: is it possible to write our own "functions" for the regex to use, something like

@re.register_function('X')
def between_x(group):
    return f'X{group}X'

and then

pattern = r"""
    ([a-zA-Z]{2})    # Match two letters and capture them as group1
    .*?              # random matches in between
    (\X1)            # Match if the previous group appears between Xs.
"""
# For example, AArandomletterXAAX would match and group1 would be AA
# and second group would be XAAX

It doesn't have to be the re module; I'm open to any other regex engine.

As an example, the pattern should match

string: "hello...HELLO"

and not match

string: "hello...hello"

assuming our function is

def f(group):
    return group.upper()

Answer 1

This is an interesting question and, if I understand it correctly, I believe it has a good solution.

We can start with an expression that has three subexpressions:

([a-z]+)(.+?)((?=.+[A-Z].+)(?i:\1))

Here we have a word made of lowercase letters:

([a-z]+)

followed by anything in between:

 (.+?)

and this is the group we should look into if we really want to solve the problem:

((?=.+[A-Z].+)(?i:\1))

We are back-referencing the first group under a scoped i flag, which works fine.
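A small sketch to try the expression on the two strings from the question, assuming Python 3.6 or later (where the scoped `(?i:...)` syntax is available in `re`):

```python
import re

# Group 1: lowercase word; group 2: anything in between;
# group 3: lookahead for an uppercase letter, then group 1 again, ignoring case.
regex = re.compile(r"([a-z]+)(.+?)((?=.+[A-Z].+)(?i:\1))")

upper_case = regex.search("hello...HELLO")
lower_case = regex.search("hello...hello")

print(upper_case.groups() if upper_case else None)
print(lower_case.groups() if lower_case else None)
```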

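As written, it will likely pass a third group that repeats the first capture group in any letter case, and fail input in which the third group is entirely lowercase, which I hope is what is wanted here.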

If not, then we may want to focus on the lookahead (?=.+[A-Z].+) and adjust it so that it passes the third groups we want and fails the ones we don't.
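If the lookahead cannot be tightened enough, one fallback is to leave the pattern language and apply the function in ordinary Python. The sketch below is not a feature of `re`; `search_transformed` is a hypothetical helper that tries each candidate for group 1, applies the function, and looks for the result later in the text. It is a brute-force illustration and, unlike `.*?`, it does not stop at newlines.

```python
import re

def search_transformed(first, text, transform):
    # Emulates "(first) anything (transform(group 1))" by hand.
    # Returns (group1, transformed, (start, end)) or None.
    first_re = re.compile(first)
    for start in range(len(text)):
        # Longest candidate first, then shorter ones, like a greedy quantifier backtracking.
        for end in range(len(text), start, -1):
            m = first_re.fullmatch(text, start, end)
            if m is None:
                continue
            target = transform(m.group())
            # Plain substring search for the transformed text after group 1.
            pos = text.find(target, end)
            if pos != -1:
                return m.group(), target, (start, pos + len(target))
    return None

print(search_transformed(r"[a-z]+", "hello...HELLO", str.upper))
print(search_transformed(r"[a-z]+", "hello...hello", str.upper))
print(search_transformed(r"[a-zA-Z]{2}", "AArandomletterXAAX", lambda g: f"X{g}X"))
```

The first argument plays the role of group 1 in the question's pattern, and the callable plays the role of the registered function.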
