Question

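I want to match all words like 'ofekkk', 'goooo', 'carrrrr' that end in a repeated character, and then work with the sub-word that comes before the repetition, meaning I need 'ofe', 'g', 'car'. I tried: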

import re

reg3 = re.compile(r'([a-z]+)([a-z])\2+')
reg3.findall('ofekkkk')

Out[90]: [('ofekk', 'k')]

But the first group, 'ofekk', is not what I want.

Thanks.

Answer 1

Use the word boundary \b:

reg3 = re.compile(r'\b([a-z]*?)([a-z])\2+\b', re.I)

Then drop the second group from each result, keeping only the first:

res = [m[0] for m in reg3.findall(yourstring)]
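
Put together, the two snippets above might be run like this. It is only a minimal sketch: the sample text is an assumption made for illustration, and `prefixes` plays the role of `res`.

```python
import re

# Pattern from this answer: a lazy prefix, one letter, then that same letter
# repeated at least once, all between word boundaries.
reg3 = re.compile(r'\b([a-z]*?)([a-z])\2+\b', re.I)

# Sample text is an assumption for this sketch; 'dog' has no repeated ending.
text = 'dog ofekkkk goooo carrrr'

# findall returns (prefix, repeated_letter) tuples; keep only the prefix.
prefixes = [m[0] for m in reg3.findall(text)]
print(prefixes)  # ['ofe', 'g', 'ca']
```

Because of the trailing \b, a word such as 'dog' that does not end in a repeated letter produces no match at all.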

Answer 2

You only need to replace the greedy quantifier with its lazy version:

reg3 = re.compile(r'([a-z]+?)([a-z])\2+')
                          ^^

If you need to allow the first group to be empty, use *? instead of +?.

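Because the +? / *? quantifiers are lazy, the first capturing group will contain as few characters as possible: the regex engine gives it the minimum it is allowed, then tries to match ([a-z]) followed by \2+, and only when that fails is ([a-z]+?) "expanded" by one more character.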
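
To see the difference, here is a small comparison of the greedy pattern from the question with the two lazy variants. The variable names and the 'oo' test word are assumptions chosen for this sketch.

```python
import re

greedy = re.compile(r'([a-z]+)([a-z])\2+')          # pattern from the question
lazy = re.compile(r'([a-z]+?)([a-z])\2+')           # lazy, prefix of 1+ letters
lazy_or_empty = re.compile(r'([a-z]*?)([a-z])\2+')  # lazy, prefix may be empty

word = 'ofekkkk'
greedy_groups = greedy.search(word).groups()
lazy_groups = lazy.search(word).groups()
print(greedy_groups)  # ('ofekk', 'k')
print(lazy_groups)    # ('ofe', 'k')

# 'oo' has no prefix before the repeated letter, so +? cannot match it,
# while *? matches with an empty first group.
plus_result = lazy.search('oo')
star_groups = lazy_or_empty.search('oo').groups()
print(plus_result)    # None
print(star_groups)    # ('', 'o')
```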

Answer 3

I think your problem stems from using the maximally greedy + operator. Before it lets the rest of the pattern match, + consumes as much text as possible.

Instead, try the reluctant version, +?, which consumes only as much text as is needed to make the match:

import re

text = 'dog cow ofekkkk goooo carrrr'

# \2* also matches words with no repeated ending (dog, cow);
# the commented-out pattern uses \2+ to require at least one repeat.
#pat = r'\b([a-z]+?)([a-z])\2+'
pat = r'\b([a-z]+?)([a-z])\2*\b'
reg3 = re.compile(pat)

for m in reg3.finditer(text):
    print(m.group(0), m.group(1), m.group(2))

The output is:

$ python test.py
dog do g
cow co w
ofekkkk ofe k
goooo g o
carrrr ca r

Answer 4

If you prefer, you can use a greedy quantifier instead.

\b(?:([a-z])(?!\1))+(?=([a-z])\2)

Expanded:

 \b    
 (?:
      ( [a-z] )                     # (1)
      (?! \1 )
 )+
 (?=
      ( [a-z] )                     # (2)
      \2 
 )
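
In Python, the expanded form can be used more or less as written by compiling it with re.VERBOSE. This is a sketch under a few assumptions: the sample text and variable names are made up for illustration. Note that group 1 sits inside a repeated group, so it only holds the last letter of the prefix; the prefix itself is the whole match, group(0).

```python
import re

pat = re.compile(r'''
    \b
    (?:
        ([a-z])     # (1) one letter ...
        (?!\1)      # ... that is not followed by itself
    )+
    (?=
        ([a-z])     # (2) the letter that starts the repeated run
        \2
    )
''', re.VERBOSE)

# Sample text is an assumption for this sketch.
text = 'ofekkkk goooo carrrr'

# group(0) is the prefix, group(2) the repeated letter seen by the lookahead.
results = [(m.group(0), m.group(2)) for m in pat.finditer(text)]
print(results)  # [('ofe', 'k'), ('g', 'o'), ('ca', 'r')]

# Group 1 keeps only the last iteration of the repeated group.
last_prefix_letter = pat.search('ofekkkk').group(1)
print(last_prefix_letter)  # e
```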

In the benchmark below, the greedy approach is about 30% faster:

Regex1:   \b(?:([a-z])(?!\1))+(?=([a-z])\2)
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   3
Elapsed Time:    2.74 s,   2740.46 ms,   2740459 µs


Regex2:   ([a-z]+?)([a-z])\2+
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   3
Elapsed Time:    3.64 s,   3639.44 ms,   3639438 µs
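
The figures above appear to come from a separate regex benchmarking tool, so they may not carry over to Python's re engine. If you want to check on your own machine, a rough timeit sketch could look like the following; the sample text, function names, and iteration count are assumptions, and no particular timing result is implied.

```python
import re
import timeit

greedy = re.compile(r'\b(?:([a-z])(?!\1))+(?=([a-z])\2)')
lazy = re.compile(r'([a-z]+?)([a-z])\2+')

# Sample text is an assumption; both patterns find three matches in it.
text = 'ofekkkk goooo carrrr'

def greedy_prefixes():
    # With the greedy pattern the prefix is the whole match.
    return [m.group(0) for m in greedy.finditer(text)]

def lazy_prefixes():
    # With the lazy pattern the prefix is the first capturing group.
    return [m.group(1) for m in lazy.finditer(text)]

for fn in (greedy_prefixes, lazy_prefixes):
    seconds = timeit.timeit(fn, number=50_000)
    print(f'{fn.__name__}: {seconds:.3f} s for 50,000 runs')
```

Both functions return the same prefixes for this text, so the comparison measures only the matching strategy.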

Find a pattern with a repeated last character

4 answers: