Question

I can find the string successfully, but I cannot split the match object into the correct groups.

The full string is as follows:

 Technology libraries: Techlibhellohellohello

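(All on one line.) What I want to do is find this line in a file (that part works), but when I add it to a dictionary I only want to add the "Technology libraries" part and nothing else. I want to use .group() and specify which group, but only Techlibhellohellohello seems to come up as group(1), and nothing else appears. Also, there is whitespace before "Technology libraries".

The pattern to match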

is_startline_1 = re.compile(r" Technology libraries: (.*)$")

The line that matches

startline1_match = is_startline_1.match(line)

Adding to the dictionary

bookmark_dict['context']        = startline1_match.group(1)

The desired output is for .group(1) or .group(2) to contain "Technology libraries".

Answer 1

You probably just want to wrap the first part in a capture group:

import re

# Group 1 captures the label, group 2 captures everything after it.
regex = r"(Technology libraries: )(.*)$"

test_str = "Technology libraries: Techlibhellohellohello"

# Put each captured group on its own line.
subst = "\\1\\n\\2"

# count=0 replaces every match; pass a positive count to limit the replacements.
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

This JavaScript demo shows how the capture groups work:

const regex = /(Technology libraries: )(.*)$/gm;
const str = `Technology libraries: Techlibhellohellohello`;
const subst = `\n$1\n$2`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
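
To tie the demos above back to the question, the same two-group expression can be used with re.compile and .group() instead of re.sub. This is only a minimal sketch, assuming the line looks like the sample in the question. The names is_startline_1, startline1_match and bookmark_dict come from the question; the 'library' key and the switch from match() to search() are assumptions made here for illustration.

```python
import re

# Same expression as above: group 1 is the label (including ": "),
# group 2 is everything after it.
is_startline_1 = re.compile(r"(Technology libraries: )(.*)$")

# Sample line from the question, with its leading space.
line = " Technology libraries: Techlibhellohellohello"

bookmark_dict = {}

# search() scans the whole line, so the leading whitespace does not
# have to be part of the pattern (match() only matches at position 0).
startline1_match = is_startline_1.search(line)
if startline1_match:
    bookmark_dict['context'] = startline1_match.group(1)
    bookmark_dict['library'] = startline1_match.group(2)  # hypothetical key

print(bookmark_dict)
```

With this expression group(1) still carries the trailing colon and space, which is what the three-group variant further down removes.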

RegEx

If this is not the expression you want, you can modify or change it at regex101.com.

 (Technology libraries: )(.*)

RegEx Circuit

You can also visualize your expression at jex.im.

If you want to drop the : and the whitespace, just add a capture group in the middle:

(Technology libraries)(:\s+)(.*)

Python code

import re

# Group 1 is the label, group 2 is the colon plus whitespace, group 3 is the rest.
regex = r"(Technology libraries)(:\s+)(.*)"

test_str = ("Technology libraries: Techlibhellohellohello\n"
    "Technology libraries:     Techlibhellohellohello")

# Keep groups 1 and 3, dropping the colon and whitespace held in group 2.
subst = "\\1\\n\\3"

# count=0 replaces every match; pass a positive count to limit the replacements.
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

JavaScript demo

const regex = /(Technology libraries)(:\s+)(.*)/gm;
const str = `Technology libraries: Techlibhellohellohello
Technology libraries:     Techlibhellohellohello`;
const subst = `\n$1\n$3`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
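
For the dictionary use case in the question, the three-group expression gives a clean label in group(1). The sketch below is a hedged illustration rather than a definitive implementation: bookmark_dict and its 'context' key are taken from the question, while the variable names pattern, label, separator and value, and the 'library' key, are hypothetical.

```python
import re

# Group 1: label, group 2: colon plus whitespace, group 3: the rest.
pattern = re.compile(r"(Technology libraries)(:\s+)(.*)")

line = "Technology libraries:     Techlibhellohellohello"

bookmark_dict = {}

match = pattern.search(line)
if match:
    # groups() returns all captured groups as a tuple, in order.
    label, separator, value = match.groups()
    bookmark_dict['context'] = label   # no colon, no trailing whitespace
    bookmark_dict['library'] = value   # hypothetical key

print(bookmark_dict)
```

Because the colon and the variable amount of whitespace sit in their own group, they can simply be ignored when filling the dictionary.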

If you want to capture the whitespace before "Technology libraries", just add it to a capture group:

^(\s+)(Technology libraries)(:\s+)(.*)$

Python test

import re

# Group 1 is the leading whitespace, group 2 the label,
# group 3 the colon plus whitespace, group 4 the rest of the line.
regex = r"^(\s+)(Technology libraries)(:\s+)(.*)$"

test_str = ("    Technology libraries: Techlibhellohellohello\n"
    "       Technology libraries:     Techlibhellohellohello")

# Keep only the label (group 2) and the value (group 4).
subst = "\\2\\n\\4"

# re.MULTILINE lets ^ and $ match at the start and end of every line.
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

JavaScript demo

const regex = /^(\s+)(Technology libraries)(:\s+)(.*)$/gm;
const str = `    Technology libraries: Techlibhellohellohello
       Technology libraries:     Techlibhellohellohello`;
const subst = `$2\n$4`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
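
The anchored four-group expression also works with match(), as in the question, because the leading whitespace is now part of the pattern. The following is a rough sketch under that assumption. The lines list stands in for the file being read, and the 'indent' and 'library' keys are hypothetical additions; only bookmark_dict['context'] appears in the question.

```python
import re

# Group 1: leading whitespace, group 2: label,
# group 3: colon plus whitespace, group 4: the rest of the line.
is_startline_1 = re.compile(r"^(\s+)(Technology libraries)(:\s+)(.*)$")

# Stand-in for the lines of the file being scanned.
lines = [
    "Some unrelated line",
    "    Technology libraries: Techlibhellohellohello",
]

bookmark_dict = {}

for line in lines:
    startline1_match = is_startline_1.match(line)
    if startline1_match is None:
        continue  # not the line we are looking for
    bookmark_dict['context'] = startline1_match.group(2)
    bookmark_dict['library'] = startline1_match.group(4)     # hypothetical key
    bookmark_dict['indent'] = len(startline1_match.group(1))  # hypothetical key

print(bookmark_dict)
```

Note that \s+ requires at least one whitespace character before the label; if some lines may start without any indentation, \s* would accept those as well.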

How do I create capture groups with a regular expression compiled by re.compile?
