Question

I have a bunch of strings of the following form, where X stands for an arbitrary word:

This is a string ((X.address)) test
This is a string ((X address)) test
This is a string (X address) test
This is a string (X.address) test

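As soon as X.address or X address is found (including the preceding parentheses), I want to delete everything from that point to the end of the string, producing: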

This is a string
This is a string
This is a string
This is a string

This is my starting point:

import re
regex = r"\(X.address"
s = "This is a string ((X.address)) test"
re.split(regex, s)[0]

>> 'This is a string ('

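It works, but I need to generalize it so that it searches for an arbitrary word instead of X and accounts for one or more parentheses in front of the word.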

Answer 1

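You can use .+(?=\s\(+X(?:\.|\s)address)

Explanation:

.+ - matches one or more of any character (except a newline by default)

(?=...) - positive lookahead

\s - a whitespace character

\(+ - matches one or more ( characters

X - matches X literally

(?:...) - non-capturing group

\.|\s - matches either a literal dot . or a whitespace character

address - matches address literally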
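
As a minimal sketch of how the lookahead pattern above could be applied in Python: the match itself is the part to keep, so re.search followed by group(0) returns the prefix. The pattern as written matches the letter X literally, so the version below swaps X for \w+ to cover an arbitrary word. That substitution, the helper name keep_prefix, and the choice to return the input unchanged when nothing matches are assumptions made for illustration, not part of the answer.

```python
import re

# Assumption: \w+ stands in for the literal X so that any single word matches.
PATTERN = re.compile(r'.+(?=\s\(+\w+(?:\.|\s)address)')

def keep_prefix(s):
    """Return the text before the parenthesised '<word>.address' / '<word> address' part."""
    m = PATTERN.search(s)
    # Hypothetical choice: if the pattern is absent, leave the string untouched.
    return m.group(0) if m else s

for s in ['This is a string ((X.address)) test',
          'This is a string (foo address) test']:
    print(repr(keep_prefix(s)))
```

Because the lookahead does not consume any characters, nothing after the kept prefix ends up in the match.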


Answer 2

You can use

re.sub(r'\s*\(+[^()]*\baddress.*', '', s, flags=re.S)

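Details

\s* - zero or more whitespace characters
\(+ - one or more ( characters
[^()]* - zero or more characters other than ( and )
\b - word boundary (address must not be preceded by a letter, digit, or underscore)
address - the word address
.* - any zero or more characters up to the end of the string (including newlines, because of flags=re.S).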

See the Python demo:

import re
strs = [ 'This is a string ((X.address)) test', 'This is a string ((X address)) test', 'This is a string (X address) test', 'This is a string (X.address) test', 'This is a string ((X and Y and Z address)) test' ]
for s in strs:
    print(s, '=>', re.sub(r'\s*\(+[^()]*\baddress.*', '', s, flags=re.S))

Output:

This is a string ((X.address)) test => This is a string
This is a string ((X address)) test => This is a string
This is a string (X address) test => This is a string
This is a string (X.address) test => This is a string
This is a string ((X and Y and Z address)) test => This is a string
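
The roles of \b and flags=re.S described in the details can be checked with two extra inputs. This is only a small sketch; the sample strings and the variable names are made up for illustration.

```python
import re

PATTERN = r'\s*\(+[^()]*\baddress.*'

# \b: "myaddress" does not contain "address" as a separate word,
# so the pattern does not match and nothing is removed.
unchanged = re.sub(PATTERN, '', 'keep (X.myaddress) here', flags=re.S)

# re.S lets .* run across the newline, so the second line is removed too.
multiline = 'line one (X address)\nline two'
with_dotall = re.sub(PATTERN, '', multiline, flags=re.S)
# Without re.S, .* stops at the newline and the second line survives.
without_dotall = re.sub(PATTERN, '', multiline)

print(repr(unchanged))
print(repr(with_dotall))
print(repr(without_dotall))
```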

Answer 3

Use

import re

# The capturing group is hard-coded to the known prefix "This is a string".
regex = r"(This is a string)\s+\(+.+\)"
s = "This is a string ((X.address)) test"
re.split(regex, s)[1]
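
This works because re.split includes the text of a capturing group in the list it returns, so index 1 holds the captured prefix. The sketch below prints the full list to make that visible. The second pattern, which replaces the hard-coded prefix with a lazy (.*?), is an assumption added here for illustration and is not part of the answer.

```python
import re

s = "This is a string ((X.address)) test"

# The capturing group's text is kept in the list returned by re.split.
parts = re.split(r"(This is a string)\s+\(+.+\)", s)
print(parts)  # ['', 'This is a string', ' test']

# Hypothetical generalisation: capture whatever precedes the first
# whitespace-plus-parenthesis run instead of a fixed prefix.
generic = re.split(r"(.*?)\s+\(+.+\)", s)[1]
print(repr(generic))
```

Note that neither pattern mentions address, so any parenthesised text would trigger the split.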

Regex: match any number of parentheses preceding an arbitrary word
