Question

This is the content of the input file.

 sb.txt
 JOHN:ENGINEER:35

These are the patterns I use to parse the file.

import re

with open(r'C:\Users\dhiwakarr\PycharmProjects\BasicConcepts\sb.txt','r') as finp:
    for line in finp:
        # biodata1: the +? quantifier is outside each capturing group
        biodata1 = re.search(r'([\w\W])+?:([\w\W])+?:([\w\W])+?',line)
        # biodata2: the +? quantifier is inside each capturing group
        biodata2 = re.search(r'([\w\W]+?):([\w\W]+?):([\w\W]+?)',line)
        print('line is '+line)
        # Raw strings keep \w and \W in the labels from being read as escape sequences
        print(r're.search(r([\w\W])+?:([\w\W])+?:([\w\W])+? '+biodata1.group(1)+' '+biodata1.group(2)+' '+biodata1.group(3))
        print(r're.search(r([\w\W]+?):([\w\W]+?):([\w\W]+?) '+biodata2.group(1)+' '+biodata2.group(2)+' '+biodata2.group(3))

This is the output I get:

line is JOHN:ENGINEER:35
re.search(r([\w\W])+?:([\w\W])+?:([\w\W])+? N R 3
re.search(r([\w\W]+?):([\w\W]+?):([\w\W]+?) JOHN ENGINEER 3

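I have a couple of questions about the output.

Why does the first search pattern match the last character of JOHN and of ENGINEER, but the first character of 35? I expected the non-greedy "?" to stop as soon as it found the first character of JOHN and ENGINEER.
Can someone help me understand how the position of "+?" (inside or outside the parentheses) affects the output?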

Answer 1

The difference between biodata1 and biodata2 is the position of the parentheses.

biodata1:

([\w\W])+?:([\w\W])+?:([\w\W])+?

Explanation

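The +? quantifier sits outside the parentheses, so each group captures a single character and the group itself is repeated. A repeated capturing group keeps only its last repetition, so group(1) holds the last character before the first : (N).
Likewise, group(2) holds the last character before the second : (R).
Nothing follows group(3) in the pattern, so the lazy +? stops after one repetition and the group holds the first character after the second : (3).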
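The behaviour described above can be reproduced with a minimal sketch. It assumes Python's standard re module and uses the sample line from the question as a string literal instead of reading sb.txt; the names groups and whole_match are illustrative only.

```python
import re

line = "JOHN:ENGINEER:35"

# The +? quantifier is outside each group, so every group captures one
# character at a time and is overwritten on each repetition.
m = re.search(r'([\w\W])+?:([\w\W])+?:([\w\W])+?', line)

groups = m.groups()       # last repetition of each group
whole_match = m.group(0)  # everything the pattern consumed

print(groups)
print(whole_match)
```

The whole match still covers JOHN and ENGINEER in full; only the captured groups are reduced to their final repetition.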

biodata2:

([\w\W]+?):([\w\W]+?):([\w\W]+?)

Explanation

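Here the quantifier is inside the parentheses, so group(1) captures every character (word or non-word) before the first :, and there must be at least one.
Likewise for group(2).
group(3) matches word and non-word characters after the second :, but nothing follows it in the pattern, so the lazy +? is satisfied by a single character (3).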
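As a sketch of how the last field could be captured in full, the example below compares the original pattern with two variants. Both variants are suggestions and not part of the original answer: one anchors the pattern with $, the other uses a negated character class. The variable names are illustrative.

```python
import re

line = "JOHN:ENGINEER:35"

# Original pattern: the final lazy group is satisfied by one character.
lazy = re.search(r'([\w\W]+?):([\w\W]+?):([\w\W]+?)', line)

# Variant 1 (assumption): $ gives the last group an end condition,
# so it has to expand until the end of the line.
anchored = re.search(r'([\w\W]+?):([\w\W]+?):([\w\W]+?)$', line)

# Variant 2 (assumption): match "anything except a colon" greedily.
# When reading from a file, strip the trailing newline first,
# because [^:] also matches "\n".
fields = re.search(r'([^:]+):([^:]+):([^:]+)', line)

print(lazy.groups())
print(anchored.groups())
print(fields.groups())
```

For a fixed delimiter like this, line.strip().split(':') is often simpler than a regular expression.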

+?:

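The + matches one or more occurrences of the preceding expression. The ? after it makes the quantifier lazy (non-greedy): it matches as few characters as the rest of the pattern allows, instead of as many as possible.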
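A short sketch of the difference between the greedy + and the lazy +?, assuming Python's re module and the sample line from the question. The simplified patterns here are illustrative and are not the ones used in the question.

```python
import re

text = "JOHN:ENGINEER:35"

# Greedy: .+ grabs as much as possible, then backtracks to the last colon.
greedy = re.match(r'(.+):', text).group(1)

# Lazy: .+? grabs as little as possible, stopping at the first colon.
lazy = re.match(r'(.+?):', text).group(1)

print(greedy)
print(lazy)
```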

Python regular expressions with greedy groups
