Question

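I have a regex capture, and I want to exclude one character (a space, in this particular case) from the middle of the captured string. Can this be done in one step by modifying the regex?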

(Quick and dirty) example:

Text: Key name = value
My regex: (.*) = (.*)
Output: \1 = "Key name" and \2 = "value"
Desired output: \1 = "Keyname" and \2 = "value"

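Update: I'm not sure which regex engine will run this regex, since it is part of a larger software product. If you have a solution, please specify which engines it will and won't run on.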

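Update 2: The product mentioned above takes the regex as input and then works further with the matched values, which is why a one-step solution is required. There is no opportunity to insert an intermediate processing step into the pipeline.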

Answer 1

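Here is a possible, theoretical pure-regex implementation using the \G anchor, which matches at the position where the previous match ended: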

/(?:\G(\w+)\h*(?:(?:=\h*)(\w+))?)+/g

Online demo

Legend

(?:           # Non capturing group 1
  \G          # Matches where the regex engine stops in the previous step
  (\w+)       # capture group 1: a regex word of 1+ chars
  \h*         # zero or more horizontal spaces (space, tabs)
  (?:         # Non capturing group 2
    =\h*      # literal '=' followed by zero or more hspaces
    (\w+)     # capture group 2: a regex word of 1+ chars
  )?          # make the non capturing group 2 optional
)+            # repeat the non capturing group 1, one or more

In the substitution section of the demo:

\1 effectively contains Keyname (the 2 words separated by a fake space)
\2 is value
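
To make the chaining behaviour concrete, here is a minimal Python sketch that emulates it. It is an illustration under assumptions, not the answer's actual regex: Python's built-in re module supports neither \G nor \h, so the sketch anchors each attempt with Pattern.match(text, pos) and uses [ \t] in place of \h. The names STEP and chained_matches are made up for this example.

```python
import re

# One link of the chain: a word, optional blanks, optionally "= word".
# [ \t] stands in for \h, which Python's re module does not support.
STEP = re.compile(r"(\w+)[ \t]*(?:=[ \t]*(\w+))?")

def chained_matches(text):
    # Yield (group 1, group 2) for each contiguous match.
    pos = 0
    while pos < len(text):
        m = STEP.match(text, pos)  # anchored at pos, similar to \G
        if m is None:
            break  # the chain is broken; \G would stop matching here
        yield m.group(1), m.group(2)
        pos = m.end()

pieces = list(chained_matches("Key name = value"))
key = "".join(g1 for g1, _ in pieces)
value = "".join(g2 for _, g2 in pieces if g2 is not None)

print(pieces)      # [('Key', None), ('name', 'value')]
print(key, value)  # Keyname value
```

The sketch shows that no single match ever holds "Keyname" in group 1: there are two consecutive matches, and the key only appears once their group-1 pieces are concatenated, which is what the substitution does.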

Note: I don't recommend using this unless it is really needed (why?).

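There are multiple possible approaches that take two steps: as already stated, simply strip the spaces from the first capture group of the OP's regex.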
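
For comparison, a minimal sketch of that two-step approach in Python, assuming post-processing is available (which the OP's product apparently does not allow). The function name parse_pair is hypothetical.

```python
import re

def parse_pair(line):
    # Step 1: the OP's original regex, applied to the whole line.
    m = re.fullmatch(r"(.*) = (.*)", line)
    if m is None:
        return None
    # Step 2: strip the spaces from the first capture group.
    key = m.group(1).replace(" ", "")
    return key, m.group(2)

print(parse_pair("Key name = value"))  # ('Keyname', 'value')
```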

Answer 2

I'd come up with something like:

(?<key>[\w]+)\s*=\s*(?<value>.+)
# look for one or more word characters and capture them in a group called "key"
# followed by zero or more whitespace characters (\s)
# followed by an equals sign
# followed by zero or more whitespace characters (\s)
# capture the rest in a group called "value"

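...and then process the captured output. But the \w character class will not match any spaces (do you have keys with spaces?).
See a working demo here. But as mentioned in the comments, it depends on your programming language.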
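
As an illustration of the language dependence, here is a minimal Python sketch of the same pattern. Python's re module writes named groups as (?P<name>...) rather than (?<name>...), so the pattern is adapted accordingly; the names PAIR and parse_named are made up for this example.

```python
import re

# Same pattern as above, using Python's (?P<name>...) named-group syntax.
PAIR = re.compile(r"(?P<key>\w+)\s*=\s*(?P<value>.+)")

def parse_named(line):
    m = PAIR.fullmatch(line)
    if m is None:
        return None
    return m.group("key"), m.group("value")

print(parse_named("Keyname = value"))   # ('Keyname', 'value')
print(parse_named("Key name = value"))  # None: \w+ cannot span the space
```

The second call shows the limitation noted above: a key containing a space does not match at all, so this pattern only helps if keys never contain spaces.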

How to exclude a character from a regex capture group?
