Looking for a regex (re.findall) to find words with two vowels [aeiou] that are the fourth word on every other line of a (MULTILINE) string?

Asked: 2018-01-09 19:50:34

Tags: python regex string regex-group

textDoc = """Here is my Text which
I now have *starred* the words which i would like accounted for
I would like the end result to be Lines 3 Words 6.
Python Regular expression **rules** have me trying things that
I have listed below. All of them are usable but I would like
To understand how to customize it for **production** to use."""

# desired output: Lines 3, Words 3
# This is because the words starred, rules, and production are on every
# other line (lines 2, 4, and 6) and contain two or more vowels, all while
# being the fourth word on the line.

I can't seem to put it all together, but the pieces of regex and code I have been considering so far are:

enumerate, .split, re.findall

[aeiou], [aeiou]{2}



# textDoc is the multi-line string shown above
numOfLines = len(textDoc.splitlines())
print(numOfLines)

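split turns the string into a list of words. My guess is that I need a new string holding the fourth word of every other line, and then count from that to get the Lines 3, Words 3 I want.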
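The plan described in the question can be sketched without a single large regex. This is only a minimal illustration, not an accepted solution: it assumes "every other line" means lines 2, 4, 6, that the "Line N" labels are not part of the text, and that a "word" is a run of ASCII letters. The names `text_doc` and `fourth_words_with_vowels` are made up for this example.

```python
import re

# Sample text from the question, assuming the "Line N" labels are not part of it.
text_doc = """Here is my Text which
I now have *starred* the words which i would like accounted for
I would like the end result to be Lines 3 Words 6.
Python Regular expression **rules** have me trying things that
I have listed below. All of them are usable but I would like
To understand how to customize it for **production** to use."""


def fourth_words_with_vowels(text, min_vowels=2):
    """Collect the fourth word of lines 2, 4, 6, ... if it has enough vowels."""
    hits = []
    # [1::2] keeps the 2nd, 4th, 6th, ... lines (assumed meaning of "every other line").
    for line in text.splitlines()[1::2]:
        # Treat runs of letters as words, so markers such as ** are ignored.
        words = re.findall(r"[A-Za-z]+", line)
        if len(words) < 4:
            continue
        word = words[3]
        vowel_count = len(re.findall(r"[aeiou]", word, re.IGNORECASE))
        if vowel_count >= min_vowels:
            hits.append(word)
    return hits


print(fourth_words_with_vowels(text_doc))  # ['starred', 'rules']
```

On the sample text as written this finds starred and rules only: production is the eighth word on its line, not the fourth, so the stated goal of three words would need either a different sample line or a different rule.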

1 Answer:

Answer 0: (score: 0)

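I don't think this can be done with a single regex in Python, because of backtracking. The following solution works with Perl-style regex because it uses the backtracking control verb (*SKIP):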

(?:[a-z]+(?:(?!\n)[^a-z])+){3}((?=(?:(?![aeiou])[a-z])*[aeiou](?:(?![aeiou])[a-z])*[^a-z](*SKIP)(?!)|)[a-z]+)

regex101 link

The closest I can get in Python, which does not work because of backtracking:

(?:[a-z]+(?:(?!\n)[^a-z])+){3}((?=(?:(?:(?![aeiou])[a-z])*[aeiou]){2})[a-z]+)

regex101 link
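To make the backtracking problem concrete, the sketch below runs the Python pattern above on a made-up line whose fourth word has no vowels. When the fourth word fails the lookahead, the engine retries from a later starting position and ends up reporting the fifth word. The `ANCHORED` pattern and the helper `fourth_word_if_two_vowels` are hypothetical, not part of the answer: they show one possible workaround, assuming it is acceptable to apply the regex to one line at a time instead of to the whole string in a single pass.

```python
import re

# The answer's Python pattern, unchanged (case-insensitive flag assumed).
UNANCHORED = re.compile(
    r"(?:[a-z]+(?:(?!\n)[^a-z])+){3}"
    r"((?=(?:(?:(?![aeiou])[a-z])*[aeiou]){2})[a-z]+)",
    re.IGNORECASE,
)

# Hypothetical variant: the same lookahead, but the three leading words
# must begin at the start of the line, so a later start cannot be retried.
ANCHORED = re.compile(
    r"[^a-z\n]*(?:[a-z]+[^a-z\n]+){3}"
    r"((?=(?:(?:(?![aeiou])[a-z])*[aeiou]){2})[a-z]+)",
    re.IGNORECASE,
)


def fourth_word_if_two_vowels(line):
    """Return the fourth word of a line if it has at least two vowels, else None."""
    # match() only tries position 0, which is what pins the word count.
    m = ANCHORED.match(line)
    return m.group(1) if m else None


line = "one two three by seven"  # made-up line; the fourth word "by" has no vowels
print(UNANCHORED.search(line).group(1))  # seven (the fifth word, not the fourth)
print(fourth_word_if_two_vowels(line))   # None
print(fourth_word_if_two_vowels("I now have *starred* the words"))  # starred
```

Selecting every other line would still have to happen outside the pattern, for example by iterating over `text.splitlines()[1::2]` and calling the helper on each line.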