Creating a Python regular expression that finds all the consonants in each word of a string that are not repeated one after another

Time: 2015-02-26 21:54:14

Tags: python regex

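For example, given the word 'Happy', I only want 'H' and 'y'.

Given 'accomplished', I only want 'm', 'p', 'l', 's', 'h', 'd'.

I know that (\w)\2 will find repeated characters and that (?i)[b-df-hj-np-tv-z] will find all consonants, but how do I combine them?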

4 answers:

Answer 0: (score: 2)

You can use

(?=[b-df-hj-np-tv-xz])(.)(?!\1)(?<!\1\1)

which expands to

(?=[b-df-hj-np-tv-xz]) # Match only if the next character is a consonant
(.)                    # Match the consonant and capture it for subsequent usage
(?!\1)                 # Don't match if the next character is the same as the one we captured (avoids matching all but the last character of a cluster)
(?<!\1\1)              # Don't match if the character before the captured one was the same character (avoids matching the last character of a cluster)

Unfortunately, the last line is not allowed in re, because a lookbehind must have a fixed length. The regex module¹ does support it:

In [1]: import regex
In [2]: s=r'(?=[b-df-hj-np-tv-xz])(.)(?!\1)(?<!\1\1)'

In [3]: regex.findall(s, 'happy')
Out[3]: ['h']

In [4]: regex.findall(s, 'accomplished')
Out[4]: ['m', 'p', 'l', 's', 'h', 'd']

¹ According to its description on the Cheese Shop (PyPI), it "will eventually replace Python's current re module implementation."
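
If installing the third-party regex module is not an option, one possible workaround with the standard re module is to match each run of a repeated consonant as a whole and keep only the runs of length one. This is a minimal sketch and not part of the answer above: the helper name single_consonants is made up for illustration, and the character class is copied from the answer, so it is lowercase-only and leaves out 'y'.

```python
import re

# Consonant class copied from the answer above (lowercase only, no 'y').
# \1* extends the match over any immediate repetitions of the same letter.
RUN = re.compile(r'([b-df-hj-np-tv-xz])\1*')

def single_consonants(text):
    """Return the consonants that are not part of a repeated run."""
    # Each match is a maximal run of one consonant; keep runs of length 1.
    return [m.group(1) for m in RUN.finditer(text) if len(m.group(0)) == 1]

print(single_consonants('happy'))         # ['h']
print(single_consonants('accomplished'))  # ['m', 'p', 'l', 's', 'h', 'd']
```

Because the whole run is consumed in one match, no lookbehind is needed at all.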

Answer 1: (score: 0)

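from re import findall

string = "Happy you!"
res    = []
# '[^aeiou]' matches every character that is not a lowercase vowel
for c in findall('[^aeiou]', string):
    # keep only the first occurrence of each character
    if c not in res:
        res.append(c)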

This filters out the duplicates and uses the 're' module, as you requested.

Answer 2: (score: 0)

Here is a regex you can use:

([^aeiou])\1+|([^aeiou\s])

Then you can grab captured group #2.


Explanation:

[^aeiou]      # matches a consonant
([^aeiou])    # puts a consonant in captured group #1
([^aeiou])\1+ # matches repetitions of group #1
|             # regex alternation (OR)
([^aeiou\s])  # matches a consonant and grabs it in captured group #2

Code:

>>> import re
>>> for m in re.finditer(r'([^aeiou])\1+|([^aeiou\s])', "accomplished"):
...     print(m.group(2))
...
None
m
p
l
s
h
d
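
The None in the output above comes from the 'cc' run, which is matched by the first alternative and so leaves group #2 empty. A small sketch of how the results could be collected into a list while skipping those matches follows; the function name unrepeated_consonants is hypothetical and not part of the answer.

```python
import re

# Same pattern as in the answer: group 1 swallows repeated runs,
# group 2 captures a single non-vowel, non-whitespace character.
PATTERN = re.compile(r'([^aeiou])\1+|([^aeiou\s])')

def unrepeated_consonants(text):
    """Collect group 2 of every match, skipping matches of repeated runs."""
    return [m.group(2) for m in PATTERN.finditer(text)
            if m.group(2) is not None]

print(unrepeated_consonants('accomplished'))  # ['m', 'p', 'l', 's', 'h', 'd']
print(unrepeated_consonants('Happy'))         # ['H', 'y']
```

Note that [^aeiou] also matches uppercase vowels, digits, and punctuation, so real input may need a stricter character class.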

Answer 3: (score: 0)

A brute-force (super slow) solution:

import re

expr = '(?<!b)b(?!b)|(?<!c)c(?!c)|(?<!d)d(?!d)|(?<!f)f(?!f)|(?<!g)g(?!g)|(?<!h)h(?!h)|(?<!j)j(?!j)|(?<!k)k(?!k)|(?<!l)l(?!l)|(?<!m)m(?!m)|(?<!n)n(?!n)|(?<!p)p(?!p)|(?<!q)q(?!q)|(?<!r)r(?!r)|(?<!s)s(?!s)|(?<!t)t(?!t)|(?<!v)v(?!v)|(?<!w)w(?!w)|(?<!x)x(?!x)|(?<!y)y(?!y)|(?<!z)z(?!z)'

print(re.findall(expr, 'happy'))
print(re.findall(expr, 'accomplished'))
print(re.findall(expr, 'happy accomplished'))
print(re.findall(expr, 'happy accccccompliiiiiiishedd'))

# Readable form of expr
# (?<!b)b(?!b)|
# (?<!c)c(?!c)|
# (?<!d)d(?!d)|
# (?<!f)f(?!f)|
# (?<!g)g(?!g)|
# (?<!h)h(?!h)|
# (?<!j)j(?!j)|
# (?<!k)k(?!k)|
# (?<!l)l(?!l)|
# (?<!m)m(?!m)|
# (?<!n)n(?!n)|
# (?<!p)p(?!p)|
# (?<!q)q(?!q)|
# (?<!r)r(?!r)|
# (?<!s)s(?!s)|
# (?<!t)t(?!t)|
# (?<!v)v(?!v)|
# (?<!w)w(?!w)|
# (?<!x)x(?!x)|
# (?<!y)y(?!y)|
# (?<!z)z(?!z)

Output:

['h', 'y']
['m', 'p', 'l', 's', 'h', 'd']
['h', 'y', 'm', 'p', 'l', 's', 'h', 'd']
['h', 'y', 'm', 'p', 'l', 's', 'h']
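
The long alternation in this answer does not have to be typed by hand. As a rough sketch, assuming the same 21 lowercase consonants the answer lists, it could be generated with a join; the name CONSONANTS is introduced here for illustration only.

```python
import re

# The 21 lowercase consonants used in the hand-written expression.
CONSONANTS = 'bcdfghjklmnpqrstvwxyz'

# Build '(?<!b)b(?!b)|(?<!c)c(?!c)|...': each letter must be neither
# preceded nor followed by itself.
expr = '|'.join(f'(?<!{c}){c}(?!{c})' for c in CONSONANTS)

print(re.findall(expr, 'happy'))                          # ['h', 'y']
print(re.findall(expr, 'happy accccccompliiiiiiishedd'))  # ['h', 'y', 'm', 'p', 'l', 's', 'h']
```

Each lookbehind here has a fixed width of one character, which is why this form is accepted by the standard re module.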