Question

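I am trying to find words that both start and end with a consonant. Below is my attempt, but it does not give what I want. I'm really stuck and would appreciate your help or advice.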

import re

a = "Still, the conflicting reports only further served to worsen tensions in the Ukraine crisis, which has grown drastically \
in the past few weeks to a new confrontation between Russia and the West reminiscent of low points in the Cold War." 

b = re.findall(" ([b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w, x, y, z, ',', '.'].+?[b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w, x, y, z, ',', '.']) ", a.lower())
print(b)

The output is:

['the conflicting', 'further', 'to worsen', 'the ukraine crisis,', 'has', 'drastically', 'the past', 'weeks', 'new', 'between', 'the west', 'low', 'the cold']

But this output is not correct. I have to use a regular expression; without one, I think it would be too hard.

Thanks a lot!

Answer 1

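Here is a very explicit solution using startswith() and endswith(). To reach your goal, you have to strip the special characters yourself and convert the string into a list of words (named s in the code):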

vowels = ('a', 'e', 'i', 'o', 'u')
[w for w in s if not w.lower().startswith(vowels) and not w.lower().endswith(vowels)]
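
The preprocessing step is left to the reader above. As a minimal sketch, assuming it is acceptable to treat every run of ASCII letters as a word, s could be built with re.findall before applying the same filter. The names text and result and the shortened sample sentence are illustrative choices, not part of the original answer:

```python
import re

vowels = ('a', 'e', 'i', 'o', 'u')

# Shortened sample sentence, used here only for illustration.
text = "Still, the conflicting reports only further served to worsen tensions in the Ukraine crisis."

# Build the word list s by keeping runs of ASCII letters, which drops punctuation.
s = re.findall(r"[A-Za-z]+", text)

# str.startswith / str.endswith accept a tuple of candidate prefixes / suffixes.
result = [w for w in s if not w.lower().startswith(vowels) and not w.lower().endswith(vowels)]
print(result)
```

Note that this letter-run approach would split a contraction such as "don't" into two tokens, so it only suits text where that is acceptable.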

Answer 2

Try this:

vowels = ['a', 'e', 'i', 'o', 'u']
words = [w for w in a.split() if w[0].lower() not in vowels and w[-1].lower() not in vowels]

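However, this does not handle words that end with . or , correctly: the trailing punctuation mark is treated as a non-vowel, so a token such as "Russia," would pass the check.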
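
One possible way around the punctuation problem, sketched here under the assumption that only leading and trailing punctuation needs to be ignored, is to strip it from each token before testing the first and last letters. The helper name consonant_words and the sample sentence are hypothetical:

```python
import string

vowels = set('aeiou')

def consonant_words(text):
    """Return words that neither start nor end with a vowel, ignoring punctuation."""
    result = []
    for raw in text.split():
        word = raw.strip(string.punctuation)  # drop leading/trailing punctuation
        if not word or not word.isalpha():
            continue  # skip empty tokens and tokens containing digits or inner symbols
        lowered = word.lower()
        if lowered[0] not in vowels and lowered[-1] not in vowels:
            result.append(word)
    return result

print(consonant_words("Still, the crisis, which has grown, reached Russia."))
```

The isalpha() check is a deliberate simplification: it also discards contractions and hyphenated words, which may or may not be what you want.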

Edit: if you must use a regular expression to find the pattern:

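ending_in_vowel = r'(\b\w*[AaEeIiOoUu]\b)?' # matches all words ending with a vowel (\w* so one-letter words such as "a" are included)
begin_in_vowel = r'(\b[AaEeIiOoUu]\w*\b)?' # matches all words beginning with a vowel (\w* so one-letter words such as "a" are included)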

Then we need to find all the words that neither begin nor end with a vowel:

ignore = [b for b in re.findall(begin_in_vowel, a) if b]
ignore.extend([b for b in re.findall(ending_in_vowel, a) if b])

Your result is then:

result = [word for word in a.split() if word not in ignore]

Answer 3

First, split() a so that you get each word. Then check whether the first and last letters of each word are in the list consonants. If they are, append the word to result, and finally print the contents of result.

consonants = ['b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'w', 'x', 'y', 'z']

a = "Still, the conflicting reports only further served to worsen tensions in the Ukraine crisis, which has grown drastically \
in the past few weeks to a new confrontation between Russia and the West reminiscent of low points in the Cold War."

result = []

for word in a.split():
    # Compare case-insensitively and ignore surrounding commas/periods.
    w = word.strip(',.').lower()
    if w and w[0] in consonants and w[-1] in consonants:
        result.append(word)  # keep the word as it appears in the text

print(result)

Answer 4

If you want to drop the punctuation, this regular expression will work:

>>> re.findall(r'\b[bcdfghj-np-tv-z][a-z]*[bcdfghj-np-tv-z]\b', a.lower())
['still', 'conflicting', 'reports', 'further', 'served', 'worsen', 'tensions', 'crisis', 'which', 'has', 'grown', 'drastically', 'past', 'few', 'weeks', 'new', 'confrontation', 'between', 'west', 'reminiscent', 'low', 'points', 'cold', 'war']
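
If the original capitalization should be kept, one option is to skip a.lower() and pass re.IGNORECASE instead, so the same character classes also match uppercase letters. A small sketch, where the shortened sample sentence and the names pattern and text are chosen only for illustration:

```python
import re

# Same consonant-start / consonant-end pattern as above.
pattern = r'\b[bcdfghj-np-tv-z][a-z]*[bcdfghj-np-tv-z]\b'

text = "Still, the West and Russia met in the Cold War."

# re.IGNORECASE makes [a-z] and the consonant class match uppercase letters too,
# so the matches keep their original case.
matches = re.findall(pattern, text, flags=re.IGNORECASE)
print(matches)
```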

However, your original attempt looks like it was trying to keep the commas and periods, so if that is your goal, you can use this:

>>> re.findall(r'\b[bcdfghj-np-tv-z][a-z]*[bcdfghj-np-tv-z][,.]?(?![a-z])', a.lower())
['still,', 'conflicting', 'reports', 'further', 'served', 'worsen', 'tensions', 'crisis,', 'which', 'has', 'grown', 'drastically', 'past', 'few', 'weeks', 'new', 'confrontation', 'between', 'west', 'reminiscent', 'low', 'points', 'cold', 'war.']

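I'm not sure why the \b in my first example doesn't pick up the trailing punctuation (the documentation says it matches there), but either way both of these work.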
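
A likely explanation is that \b is a zero-width assertion: it matches the position between a word character and a non-word character but consumes nothing, so the punctuation never becomes part of the match. The short check below illustrates this; the sample string is made up for the demonstration:

```python
import re

text = "crisis, which"

# \b matches the empty position between "s" and ",", so the comma is not consumed.
m = re.search(r'\bcrisis\b', text)
print(m.group(), m.span())  # prints: crisis (0, 6)

# To include the punctuation, the pattern has to match it explicitly.
m2 = re.search(r'\bcrisis[,.]?', text)
print(m2.group())  # prints: crisis,
```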

If you want to take contractions (such as "don't") into account, the expression becomes:

r"\b[bcdfghj-np-tv-z][a-z']*[bcdfghj-np-tv-z][,.]?(?![a-z])"
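
To see what allowing the apostrophe changes, the sketch below runs both variants on a made-up sentence. The variable names and the sample text are illustrative only:

```python
import re

without_apostrophe = r"\b[bcdfghj-np-tv-z][a-z]*[bcdfghj-np-tv-z][,.]?(?![a-z])"
with_apostrophe = r"\b[bcdfghj-np-tv-z][a-z']*[bcdfghj-np-tv-z][,.]?(?![a-z])"

text = "We don't think it can't work.".lower()

# Without the apostrophe in the middle class, contractions are cut at the quote.
plain = re.findall(without_apostrophe, text)
print(plain)

# With it, "don't" and "can't" are matched as whole words.
contracted = re.findall(with_apostrophe, text)
print(contracted)
```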
