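I am trying to figure out how to modify my existing regex pattern, or create a new one, to fetch every line that contains dear. When a line matches, the script should print everything from after the : to the end of that line. String manipulation is not an option for getting the result here.
I tried: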
import re
instr = """
Expression: It's been a while man.
Expression: How have you been moron?
Expression: Good to see you dear.
Greeting: How is everything dear?
Greeting: Hi dear, how are you?
"""
pattern = r'.*(?<=dear)'
for item in instr.splitlines():
    if re.search(pattern, item):
        print(item)
The output I get:
Expression: Good to see you dear.
Greeting: How is everything dear?
Greeting: Hi dear, how are you?
The output I want:
Good to see you dear.
How is everything dear?
Hi dear, how are you?
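How can I get this customized result using regex?
Answer 0 (score: 2)
You can use a positive lookbehind to capture only what comes after the colon. Something like this should work: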
(?<=:).*\bdear\b.*
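I used the word-boundary assertion \b to avoid matching longer words that merely contain "dear", such as "dearest". If that is not the behavior you want, feel free to remove the boundaries.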
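As a minimal sketch of how the lookbehind behaves on the sample input from the question: the match starts right after the colon, so it keeps the space that follows it. The second pattern, `(?<=: )`, is a hypothetical variant (an assumption, not part of the answer) that also requires one space in the lookbehind, so the space is left out of the match without any string manipulation.

```python
import re

instr = """
Expression: It's been a while man.
Expression: How have you been moron?
Expression: Good to see you dear.
Greeting: How is everything dear?
Greeting: Hi dear, how are you?
"""

# Pattern from the answer: the lookbehind asserts a preceding ':' without consuming it.
lookbehind = re.compile(r'(?<=:).*\bdear\b.*')

# Hypothetical variant (an assumption): require ': ' before the match,
# so the space after the colon is not part of the result.
lookbehind_space = re.compile(r'(?<=: ).*\bdear\b.*')

lines = instr.splitlines()
with_space = [m.group() for m in map(lookbehind.search, lines) if m]
without_space = [m.group() for m in map(lookbehind_space.search, lines) if m]

print(with_space)     # each result still starts with the space after ':'
print(without_space)  # the space is excluded by the wider lookbehind
```

The variant only works when exactly one space follows the colon, because Python's `re` module requires lookbehinds to have a fixed width.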
Answer 1 (score: 2)
Another option could be to use the anchor ^ and a capturing group:
^[^:]*:\s*(.*\bdear\b.*)
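Explanation
^ matches the start of the string
[^:]*: matches 0+ characters other than :, then matches the : itself
\s* matches 0+ whitespace characters
( opens the capturing group
.*\bdear\b.* matches dear between word boundaries, with any characters on either side of it
) closes the capturing group
For example: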
import re
instr = """
Expression: It's been a while man.
Expression: How have you been moron?
Expression: Good to see you dear.
Greeting: How is everything dear?
Greeting: Hi dear, how are you?
"""
pattern = r'^[^:]*:\s*(.*\bdear\b.*)'
for item in instr.splitlines():
    res = re.search(pattern, item)
    if res:
        print(res.group(1))
Result
Good to see you dear.
How is everything dear?
Hi dear, how are you?
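The breakdown above can also be kept next to the pattern itself. The following is only a sketch: it rewrites the same pattern with `re.VERBOSE` so that each part carries its own comment. The three test lines are hypothetical examples chosen for illustration; the last one checks that the word boundary rejects "dearth".

```python
import re

# Same pattern as in the answer, spelled out with re.VERBOSE so that
# whitespace is ignored and each part can carry a comment.
verbose_pattern = re.compile(r'''
    ^               # start of the string
    [^:]*:          # any characters other than a colon, then the colon itself
    \s*             # optional whitespace after the colon
    (               # open capturing group 1
      .*\bdear\b.*  # the whole word "dear" with anything on either side
    )               # close capturing group 1
''', re.VERBOSE)

# Hypothetical sample lines (assumptions for this sketch).
lines = [
    "Expression: Good to see you dear.",
    "Expression: How have you been moron?",
    "Note: a dearth of examples",
]

results = [m.group(1) for m in map(verbose_pattern.search, lines) if m]
print(results)
```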
Answer 2 (score: 1)
>>> for m in re.finditer(r'^[^:]+:\s*(.*dear.*)', instr, flags=re.M):
...     print(m[1])
...
Good to see you dear.
How is everything dear?
Hi dear, how are you?
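re.finditer iterates over all matches
flags=re.M makes the ^ and $ anchors match at every line instead of only at the start and end of the whole string
^[^:]+:\s* covers the string from the start of the line up to the :, plus any optional whitespace
(.*dear.*) matches the rest of the line if it contains dear (note that . does not match a newline by default)
m[1] gives only that part rather than the whole line; it is equivalent to m.group(1)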
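To illustrate the point about flags=re.M, here is a small sketch that runs the same pattern with and without the flag on the sample input from the question. Without the flag, ^ can only match at the very start of the string, and since the first line there does not contain dear, nothing is found.

```python
import re

instr = """
Expression: It's been a while man.
Expression: How have you been moron?
Expression: Good to see you dear.
Greeting: How is everything dear?
Greeting: Hi dear, how are you?
"""

pattern = r'^[^:]+:\s*(.*dear.*)'

# With re.M, ^ can match at the start of every line.
multiline_hits = [m[1] for m in re.finditer(pattern, instr, flags=re.M)]

# Without re.M, ^ matches only at the very start of the string.
default_hits = [m[1] for m in re.finditer(pattern, instr)]

print(multiline_hits)
print(default_hits)
```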
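Answer 3 (score: 1)
This expression
(?=:.*\bdear\b):\s*(.*)
might work here.
The expression is explained in the top right panel of this demo, if you wish to explore or modify it further, and in this link you can watch step by step how it matches against some sample inputs, if you like.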
re.findall
import re
regex = r"(?=:.*\bdear\b):\s*(.*)"
test_str = ("Expression: It's been a while man.\n"
"Expression: How have you been moron?\n"
"Expression: Good to see you dear.\n"
"Greeting: How is everything dear?\n"
"Greeting: Hi dear, how are you?\n"
"Greeting: Hi dear, how are you?\n"
"dear: Hi there, how are you?")
print(re.findall(regex, test_str))
re.finditer
import re
regex = r"(?=:.*\bdear\b):\s*(.*)"
test_str = ("Expression: It's been a while man.\n"
"Expression: How have you been moron?\n"
"Expression: Good to see you dear.\n"
"Greeting: How is everything dear?\n"
"Greeting: Hi dear, how are you?\n"
"Greeting: Hi dear, how are you?\n"
"dear: Hi there, how are you?")
matches = re.finditer(regex, test_str, re.MULTILINE)
for matchNum, match in enumerate(matches, start=1):
    print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))
    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1
        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))