I have a bunch of strings of the following form, where X stands for an arbitrary word:
This is a string ((X.address)) test
This is a string ((X address)) test
This is a string (X address) test
This is a string (X.address) test
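As soon as X.address or X address is found (including the preceding parentheses), I want to delete everything from that point to the end of the string, producing: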
This is a string
This is a string
This is a string
This is a string
This is my starting point:
import re
regex = r"\(X.address"
s = "This is a string ((X.address)) test"
re.split(regex, s)[0]
>> 'This is a string ('
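It works, but I need to generalize it so that it searches for an arbitrary word instead of X, and so that it accounts for one or more parentheses in front of the word.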
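Answer 0 (score: 2)
You can use .+(?=\s\(+X(?:\.|\s)address)
Explanation:
.+ - matches one or more of any character
(?=...) - positive lookahead
\s - a whitespace character
\(+ - matches one or more (
X - matches X literally
(?:...) - non-capturing group
\.|\s - matches a dot . or a whitespace character
address - matches address literally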
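The pattern above still matches X literally. As a minimal sketch of how it could be generalized, the snippet below swaps X for \w+ (an assumption on my part about what counts as "an arbitrary word"); the names PREFIX_RE and text_before_address are hypothetical helpers, not part of the answer.

```python
import re

# Assumption: \w+ stands in for the literal X, so any single word is accepted.
PREFIX_RE = re.compile(r".+(?=\s\(+\w+(?:\.|\s)address)")

def text_before_address(s):
    """Return the text preceding ' (word.address' or ' (word address'."""
    m = PREFIX_RE.match(s)
    # Fall back to the unchanged string when the marker is absent.
    return m.group(0) if m else s

for s in [
    "This is a string ((X.address)) test",
    "This is a string ((X address)) test",
    "This is a string (X address) test",
    "This is a string (X.address) test",
]:
    print(repr(text_before_address(s)))
```

Because the lookahead requires exactly one word directly before address, an input such as ((X and Y and Z address)) would not match with this sketch.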
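Answer 1 (score: 2)
You can use
re.sub(r'\s*\(+[^()]*\baddress.*', '', s, flags=re.S)
Details
\s* - 0 or more whitespace characters
\(+ - 1 or more ( characters
[^()]* - any 0 or more characters other than ( and )
\b - a word boundary (address cannot be preceded by another letter, digit, or underscore)
address - the literal word
.* - any 0 or more characters up to the end of the string.
See the Python demo: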
import re
strs = [ 'This is a string ((X.address)) test', 'This is a string ((X address)) test', 'This is a string (X address) test', 'This is a string (X.address) test', 'This is a string ((X and Y and Z address)) test' ]
for s in strs:
    print(s, '=>', re.sub(r'\s*\(+[^()]*\baddress.*', '', s, flags=re.S))
Output:
This is a string ((X.address)) test => This is a string
This is a string ((X address)) test => This is a string
This is a string (X address) test => This is a string
This is a string (X.address) test => This is a string
This is a string ((X and Y and Z address)) test => This is a string
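Two details of this pattern are easy to miss: the \b word boundary and the re.S flag. The sketch below illustrates both with inputs I made up for the purpose (the emailaddress and two-line strings are not from the question).

```python
import re

PATTERN = r'\s*\(+[^()]*\baddress.*'

# \b: "address" embedded in a longer word is not matched, so nothing is removed.
embedded = 'This is a string (X.emailaddress) test'
unchanged = re.sub(PATTERN, '', embedded, flags=re.S)
print(repr(unchanged))

# re.S lets . match newlines, so the rest of a multi-line string is removed too.
multiline = 'This is a string (X address) test\nsecond line'
with_dotall = re.sub(PATTERN, '', multiline, flags=re.S)
without_dotall = re.sub(PATTERN, '', multiline)
print(repr(with_dotall))
print(repr(without_dotall))
```

Without re.S, the .* stops at the first newline and any following lines are kept.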
Answer 2 (score: 0)
Use
regex = r"(This is a string)\s+\(+.+\)"
s = "This is a string ((X.address)) test"
re.split(regex, s)[1]
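Note that this pattern hard-codes the leading text. A possible variation that keeps the re.split approach from the question but does not depend on the prefix is sketched below; \w+ in place of X and the names SPLIT_RE and head_before_address are my own assumptions, not part of the answer.

```python
import re

# Assumption: \w+ replaces the hard-coded word; [.\s] covers both separators.
SPLIT_RE = re.compile(r"\s*\(+\w+[.\s]address")

def head_before_address(s):
    # maxsplit=1 stops at the first marker; [0] is the text before it.
    return SPLIT_RE.split(s, maxsplit=1)[0]

print(repr(head_before_address("This is a string ((X.address)) test")))
print(repr(head_before_address("Another sentence (home address) here")))
```

If the marker is absent, split returns the whole string as its only element, so the input comes back unchanged.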