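I'm trying to use Python's re.sub to remove a colon that comes before the third-from-last vowel [aeiou] of a word, but only if that vowel (counting from the end) is itself preceded by another vowel.
So the colon must sit between the 3rd and 4th vowels, counting from the end of the word.
The first example below therefore breaks down as w4:32ny1h, with each vowel replaced by its position counted from the end.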
we:aanyoh > weaanyoh # w4:32ny1h
hiru:atghigu > hiruatghigu
yo:ubeki > youbeki
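The w4:32ny1h notation can be reproduced with a few lines of Python. This is only an illustrative sketch for reading the examples; the helper name number_vowels_from_end is made up here and is not part of the question.

```python
VOWELS = set("aeiou")


def number_vowels_from_end(word):
    """Replace each vowel with its 1-based position counted from the end."""
    labelled = []
    count = 0
    # Walk the word backwards so the last vowel gets number 1.
    for ch in reversed(word):
        if ch in VOWELS:
            count += 1
            labelled.append(str(count))
        else:
            labelled.append(ch)
    # The pieces were collected back to front, so flip them again.
    return "".join(reversed(labelled))


print(number_vowels_from_end("we:aanyoh"))  # w4:32ny1h
```

The colon should be removed exactly when it appears between the digits 4 and 3 in this notation.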
Here is the regex statement I'm trying to use, but I can't get it to work:
word = re.sub(ur"([aeiou]):([aeiou])(([^aeiou])*([aeiou])*([aeiou])([^aeiou])*([aeiou]))$", ur'\1\2\3\4', word)
Answer 0 (score: 1)
Aren't there simply too many parentheses (and other extra stuff)? Try this:
word = re.sub(ur"([aeiou]):(([aeiou][^aeiou]*){3})$", ur'\1\2', word)
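As a quick check, the same pattern can be run against the three examples from the question. This sketch assumes Python 3, where the ur"" prefix is a syntax error, so a plain raw string is used instead; the names PATTERN and remove_colon are invented for the illustration.

```python
import re

# Vowel, colon, then exactly three vowels (each optionally followed by
# non-vowels) up to the end of the string.
PATTERN = re.compile(r"([aeiou]):(([aeiou][^aeiou]*){3})$")


def remove_colon(word):
    """Drop the colon if it sits between the 4th and 3rd vowel from the end."""
    return PATTERN.sub(r"\1\2", word)


for w in ["we:aanyoh", "hiru:atghigu", "yo:ubeki", "yo:ubek"]:
    print(w, ">", remove_colon(w))
```

The last word has only two vowels after the colon, so it is left unchanged.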
Answer 1 (score: 1)
I'm not sure whether you want to ignore consonants completely; this regex does. Otherwise it's similar to Jeff's.
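import re

tests = [
    'we:aanyoh',
    'hiru:atghigu',
    'yo:ubeki',
    'yo:ubekiki',
    'yo:ubek'
]

# Group 1: the vowel before the colon, with any consonants around it.
# Group 2: exactly three vowels (consonants allowed anywhere) up to the end.
for word in tests:
    s = re.sub(r'([^aeiou]*[aeiou][^aeiou]*):((?:[^aeiou]*[aeiou]){3}[^aeiou]*)$', r'\1\2', word)
    print('{} > {}'.format(word, s))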
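To make "ignoring consonants" concrete, here is a small comparison on a made-up word, yox:ubeki, where a consonant sits between the vowel and the colon. The function names are hypothetical, and the stricter pattern is the one from the previous answer; this is a sketch assuming Python 3.

```python
import re

# Allows consonants between the preceding vowel and the colon.
LENIENT = re.compile(r'([^aeiou]*[aeiou][^aeiou]*):((?:[^aeiou]*[aeiou]){3}[^aeiou]*)$')
# Requires a vowel immediately before and after the colon.
STRICT = re.compile(r'([aeiou]):(([aeiou][^aeiou]*){3})$')


def lenient_sub(word):
    return LENIENT.sub(r'\1\2', word)


def strict_sub(word):
    return STRICT.sub(r'\1\2', word)


word = 'yox:ubeki'  # made-up test word, not from the question
print(lenient_sub(word))  # yoxubeki
print(strict_sub(word))   # yox:ubeki
```

Which behaviour is wanted depends on how strictly "preceded by another vowel" should be read.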
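Answer 2 (score: 1)
You state that your target is a word rather than a line, so first set up anchors so that only words are processed:
\b[regex will go here]\b
^                     ^   assert a word boundary
Next, the colon must be preceded and followed by [aeiou], with two more [aeiou] in the part after the colon. I assume this should be case-insensitive?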
(?i)(\b\w+[aeiou]):((?:[aeiou][^aeiou\s\W]*){3}\b)
                              ^ matches a character that is NOT a vowel, whitespace, or a non-word character
                                (\W = [^a-zA-Z0-9_])
(Note that [^aeiou\W] matches consonant letters, digits, and _, but no other characters.)
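The effect of the negated class can be checked one character at a time. This is a small sketch assuming Python 3.4 or later (for fullmatch); the names CONSONANT_LIKE and is_allowed are invented for the example. Note that in Python 3, \W on str patterns is Unicode-aware, so the ASCII definition above is an approximation there.

```python
import re

# Matches one character that is neither a vowel nor a non-word character.
CONSONANT_LIKE = re.compile(r'(?i)[^aeiou\W]')


def is_allowed(ch):
    """Return True if the single character ch is matched by the class."""
    return CONSONANT_LIKE.fullmatch(ch) is not None


for ch in ['b', '7', '_', 'a', ':', ' ']:
    print(repr(ch), is_allowed(ch))
```

Consonants, digits, and the underscore are accepted; vowels, the colon, and whitespace are rejected.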
Python demo:
import re

tests = {
    'matches': [
        'we:aanyoh',
        'hiru:atghigu',
        'yo:ubeki'
    ],
    'no match': [
        'wz:ubeki',
        'we:a anyoh',
        'yo:ubek',
        'hiru:atghiguu'
    ]
}

# Strip the colon from every word that fits the pattern; leave the rest untouched.
for k, v in tests.items():
    print(k)
    for e in v:
        s = re.sub(r'(?i)(\b\w+[aeiou]):((?:[aeiou][^aeiou\s\W]*){3}\b)', r'\1\2', e)
        print('\t{} > {}'.format(e, s))
Prints:
matches
	we:aanyoh > weaanyoh
	hiru:atghigu > hiruatghigu
	yo:ubeki > youbeki
no match
	wz:ubeki > wz:ubeki
	we:a anyoh > we:a anyoh
	yo:ubek > yo:ubek
	hiru:atghiguu > hiru:atghiguu
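Because this pattern is anchored with \b rather than $, it can also be applied to a whole line containing several words. The sketch below compares it with a $-anchored variant (the one from an earlier answer) on a made-up sentence; the constant names and the sentence are assumptions for illustration, and Python 3 is assumed.

```python
import re

# Word-boundary version from this answer.
WORD_PATTERN = re.compile(r'(?i)(\b\w+[aeiou]):((?:[aeiou][^aeiou\s\W]*){3}\b)')
# End-of-string version, shown only for contrast.
END_PATTERN = re.compile(r'([aeiou]):(([aeiou][^aeiou]*){3})$')

line = 'say we:aanyoh and yo:ubeki now'  # made-up input

by_word = WORD_PATTERN.sub(r'\1\2', line)
by_end = END_PATTERN.sub(r'\1\2', line)

print(by_word)  # say weaanyoh and youbeki now
print(by_end)   # say we:aanyoh and yo:ubeki now
```

The $-anchored pattern counts vowels to the end of the whole string, so it does not treat the individual words separately.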
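This only handles words with a single colon. To match words that contain several colons but otherwise follow the same pattern, change the left-hand pattern to a character class that includes the colon, and use an anchor other than \b.
Example: (?i)(^[\w:]+[aeiou]):((?:[aeiou][^aeiou\s\W]*){3}\b)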
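Here is one made-up input, x:a:aaa, on which the two variants differ. The names SINGLE and MULTI are hypothetical, and Python 3 is assumed. Note that ^ anchors at the start of the string, so the second variant is meant for one word per string.

```python
import re

# Original left-hand side: a run of word characters after a word boundary.
SINGLE = re.compile(r'(?i)(\b\w+[aeiou]):((?:[aeiou][^aeiou\s\W]*){3}\b)')
# Modified left-hand side: word characters or colons, anchored at the start.
MULTI = re.compile(r'(?i)(^[\w:]+[aeiou]):((?:[aeiou][^aeiou\s\W]*){3}\b)')

word = 'x:a:aaa'  # made-up word with two colons

single_result = SINGLE.sub(r'\1\2', word)
multi_result = MULTI.sub(r'\1\2', word)

print(single_result)  # x:a:aaa
print(multi_result)   # x:aaaa
```

In the first pattern, \w+ cannot cross the earlier colon, so the lone "a" before the second colon is too short to satisfy \w+[aeiou].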
Answer 3 (score: 0)
This should work:
word = re.sub(ur"(?<=[aeiou]):(?=[aeiou]([^aeiou]*[aeiou]){2}[^aeiou]*$)", ur'', word)
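See the example here: https://regex101.com/r/kA8xH3/2
Note that I match only the colon and replace it with an empty string, rather than capturing groups and concatenating them.
It checks for the vowel-colon-vowel combination, then uses a lookahead to verify that 2 more vowels follow (possibly with consonants in between). It also allows extra consonants at the end, but makes sure the end of the string ($) follows.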
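A short sketch can confirm that the lookaround version consumes only the colon itself. It assumes Python 3 (so r"" replaces ur""); the names LOOKAROUND, match, and cleaned are invented for the example.

```python
import re

# Lookbehind for a vowel, then the colon, then a lookahead for three vowels
# (consonants allowed between and after them) up to the end of the string.
LOOKAROUND = re.compile(r"(?<=[aeiou]):(?=[aeiou]([^aeiou]*[aeiou]){2}[^aeiou]*$)")

match = LOOKAROUND.search("we:aanyoh")
print(match.group(0), match.span())  # : (2, 3)

cleaned = LOOKAROUND.sub("", "we:aanyoh")
print(cleaned)  # weaanyoh
```

Since the surrounding vowels are only asserted and never consumed, the replacement string can simply be empty.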
Answer 4 (score: 0)
This does it:
word = re.sub(ur"([aeiou]):([aeiou])([^\Waeiou]*[aeiou][^\Waeiou]*[aeiou][^\Waeiou]*)$", ur'\1\2\3', word)
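The class [^\Waeiou] differs from the plain [^aeiou] in that it rejects spaces and punctuation. The sketch below, assuming Python 3, contrasts the two on we:a anyoh, a test string that also appears in another answer; the names WORD_ONLY and ANY_NON_VOWEL are made up for the illustration.

```python
import re

# Only word characters that are not vowels may sit between the vowels.
WORD_ONLY = re.compile(
    r"([aeiou]):([aeiou])([^\Waeiou]*[aeiou][^\Waeiou]*[aeiou][^\Waeiou]*)$")
# Any non-vowel, including a space, may sit between the vowels.
ANY_NON_VOWEL = re.compile(r"([aeiou]):(([aeiou][^aeiou]*){3})$")

text = "we:a anyoh"

word_only_result = WORD_ONLY.sub(r"\1\2\3", text)
any_non_vowel_result = ANY_NON_VOWEL.sub(r"\1\2", text)

print(word_only_result)      # we:a anyoh
print(any_non_vowel_result)  # wea anyoh
```

So this pattern stays within a single word, while the looser class can count vowels across a space.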
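Answer 5 (score: -1)
A roundup (I use a capital vowel to mark where in the word the substitution should occur). Let me know if you'd like me to add other test strings.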
import re

strings = [
    'wE:aanyoh',
    'hirU:atghigu',
    'yO:ubeki',
    'xE:aaa',
    'xx:aaa',
    'xa:aaaxA:aaa',
    'xa:aaaxA:aaaxx',
    'xa:aaaxA:aaxax',
    'a:aaaxA:aaxax',
    'e:aeixA:aexix',
]

# Verbose pattern (used with re.X, so whitespace inside it is ignored).
pattern = r"""
    (
        .*
        [aeiou]
    )
    :
    (
        [aeiou]
        .*?
        [aeiou]
        .*?
        [aeiou]
    )
"""

template = "{:>15}: {}"

# Run every answer's regex on every test string and print the results side by side.
for string in strings:
    print(
        template.format('original', string)
    )
    print(template.format('Alexander:',
        re.sub(r"(?<=[aeiou]):(?=[aeiou]([^aeiou]*[aeiou]){2}[^aeiou]*$)", r'', string, flags=re.I)
    ))
    print(template.format('lonut:',
        re.sub(r"([aeiou]):([aeiou])([^\Waeiou]*[aeiou][^\Waeiou]*[aeiou][^\Waeiou]*)$", r'\1\2\3', string, flags=re.I)
    ))
    print(template.format('Tom Zych:',
        re.sub(r'([^aeiou]*[aeiou][^aeiou]*):((?:[^aeiou]*[aeiou]){3}[^aeiou]*)$', r'\1\2', string, flags=re.I)
    ))
    print(template.format('Jeff Y:',
        re.sub(r"([aeiou]):(([aeiou][^aeiou]*){3})$", r'\1\2', string, flags=re.I)
    ))
    print(template.format('7stud:',
        re.sub(pattern, r'\1\2', string, count=1, flags=re.X|re.I)
    ))
    print("\n")
Prints:
       original: wE:aanyoh
     Alexander:: wEaanyoh
         lonut:: wEaanyoh
      Tom Zych:: wEaanyoh
        Jeff Y:: wEaanyoh
         7stud:: wEaanyoh


       original: hirU:atghigu
     Alexander:: hirUatghigu
         lonut:: hirUatghigu
      Tom Zych:: hirUatghigu
        Jeff Y:: hirUatghigu
         7stud:: hirUatghigu


       original: yO:ubeki
     Alexander:: yOubeki
         lonut:: yOubeki
      Tom Zych:: yOubeki
        Jeff Y:: yOubeki
         7stud:: yOubeki


       original: xE:aaa
     Alexander:: xEaaa
         lonut:: xEaaa
      Tom Zych:: xEaaa
        Jeff Y:: xEaaa
         7stud:: xEaaa


       original: xx:aaa
     Alexander:: xx:aaa
         lonut:: xx:aaa
      Tom Zych:: xx:aaa
        Jeff Y:: xx:aaa
         7stud:: xx:aaa


       original: xa:aaaxA:aaa
     Alexander:: xa:aaaxAaaa
         lonut:: xa:aaaxAaaa
      Tom Zych:: xa:aaaxAaaa
        Jeff Y:: xa:aaaxAaaa
         7stud:: xa:aaaxAaaa


       original: xa:aaaxA:aaaxx
     Alexander:: xa:aaaxAaaaxx
         lonut:: xa:aaaxAaaaxx
      Tom Zych:: xa:aaaxAaaaxx
        Jeff Y:: xa:aaaxAaaaxx
         7stud:: xa:aaaxAaaaxx


       original: xa:aaaxA:aaxax
     Alexander:: xa:aaaxAaaxax
         lonut:: xa:aaaxAaaxax
      Tom Zych:: xa:aaaxAaaxax
        Jeff Y:: xa:aaaxAaaxax
         7stud:: xa:aaaxAaaxax


       original: a:aaaxA:aaxax
     Alexander:: a:aaaxAaaxax
         lonut:: a:aaaxAaaxax
      Tom Zych:: a:aaaxAaaxax
        Jeff Y:: a:aaaxAaaxax
         7stud:: a:aaaxAaaxax


       original: e:aeixA:aexix
     Alexander:: e:aeixAaexix
         lonut:: e:aeixAaexix
      Tom Zych:: e:aeixAaexix
        Jeff Y:: e:aeixAaexix
         7stud:: e:aeixAaexix