Question

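I am creating a regex that matches a sentence if at least 5 of its first 10 words start with a capital letter (one preceded by a space). My regex is as follows: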

(^(?:\w+\s(?= [A-Z]{5})){10}.*(?:\n|$))

My thinking is:

^ Match start of string  
?: look for a word followed by a boundary, i.e. a space  
?= Match if Capital letters preceded by a space  
.* - match everything till line end / end string.

I think I need to restructure this, but I don't know how. The {10} is meant to cover the first 10 words, but it looks wrong.

Example strings:
Match - Lets Search For Water somewhere Because I am thirsty and i really am , wishing for a desert rain

No match - fully lowercase or maybe One UPPERCASE but there are actually two uppercase letters that are preceded by a space.
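
A quick way to see how the pattern in the question behaves is to run it against the sample strings. The sketch below assumes Python's `re` module; the names `PATTERN` and `samples` are only illustrative. The lookahead `(?= [A-Z]{5})` requires a literal space followed by five consecutive capitals right after each word, and the next repetition of the group then has to start with `\w` at that same space, so the group cannot repeat even twice, let alone ten times.

```python
import re

# The pattern exactly as posted in the question.
PATTERN = re.compile(r"(^(?:\w+\s(?= [A-Z]{5})){10}.*(?:\n|$))")

samples = [
    "Lets Search For Water somewhere Because I am thirsty and i really am , wishing for a desert rain",
    "fully lowercase or maybe One UPPERCASE but there are actually two uppercase letters that are preceded by a space.",
    # Crafted so the lookahead succeeds once: word, space, space, five capitals.
    "a  ABCDE",
]

# None means "no match"; the posted pattern fails on all three samples.
results = [PATTERN.search(s) for s in samples]
```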

Answer 1

Are you locked into using a regex? If not:

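# Python 2.7
def checkCaps(text):
    # Split on whitespace and look only at the first 10 words.
    words = text.split()
    caps = 0
    for word in words[:10]:
        # Count the word if it starts with an uppercase letter.
        if word[0].isupper():
            caps += 1
    return caps >= 5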

Edited to reflect good feedback from @Kevin and @KarlKnechtel (and to remove the cruft).

Tried it out in the interpreter:

>>> checkCaps('Lets Search For Water somewhere Because I am thirsty and i really am , wishing for a desert rain')
True
>>> checkCaps('fully lowercase or maybe One UPPERCASE but there are actually two uppercase letters that are preceded by a space.')
False

Answer 2

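I agree, regular expressions aren't really built for this task. You can look for a certain number of consecutive matches, but it gets awkward once you also have to account for "other stuff" in between.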
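
If a regex has to be involved, one workable compromise is to let it do only what it is good at: one pattern grabs the first ten words, a second finds capitals at the start of a word, and ordinary code does the counting. The sketch below is illustrative only. The name `regex_check_caps` and its parameters are made up here, and `[A-Z]` covers ASCII capitals only, unlike `str.isupper()`.

```python
import re

def regex_check_caps(text, required=5, limit=10):
    # Hypothetical helper, not taken from any answer in this thread.
    # Step 1: grab at most `limit` whitespace-separated words from the start.
    # The pattern can match the empty string, so re.match never returns None.
    head = re.match(r"\s*(?:\S+(?:\s+|$)){0,%d}" % limit, text).group(0)
    # Step 2: find capitals that begin a word, i.e. that are not
    # preceded by a non-whitespace character.
    capitals = re.findall(r"(?<!\S)[A-Z]", head)
    # Step 3: the counting happens in Python, not inside the pattern.
    return len(capitals) >= required
```

Unlike the question's wording, this also counts a capital at the very start of the string, which matches what the split-based answers do.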

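Your task is conceptually about words, so an approach that treats the string as words (by first cutting it up into words) makes more sense, as @rchang shows. To make it more robust, add documentation, and do the counting more elegantly (the simple way is fine too, but I really dislike explicit loops for "counting" things, building up lists, and so on):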

def enough_capitalized_words(text, required, limit):
    """Determine if the first `limit` words of the `text`
    contain the `required` number of capitalized words."""
    return sum(
        word[0].isupper()
        for word in text.split()[:limit]
    ) >= required
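
As a usage sketch, here is the function applied to the two sample strings from the question, with `required=5` and `limit=10`. The definition is repeated only so the snippet runs on its own; the variable names `match` and `no_match` are illustrative.

```python
def enough_capitalized_words(text, required, limit):
    """Determine if the first `limit` words of the `text`
    contain the `required` number of capitalized words."""
    return sum(
        word[0].isupper()
        for word in text.split()[:limit]
    ) >= required

match = "Lets Search For Water somewhere Because I am thirsty and i really am , wishing for a desert rain"
no_match = "fully lowercase or maybe One UPPERCASE but there are actually two uppercase letters that are preceded by a space."

print(enough_capitalized_words(match, 5, 10))     # True
print(enough_capitalized_words(no_match, 5, 10))  # False
# An empty string has no words, so the count is 0.
print(enough_capitalized_words("", 5, 10))        # False
```

Because `str.split()` with no arguments never yields empty strings, `word[0]` is safe even with leading, trailing, or repeated whitespace.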

Answer 3

from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2
reduce(lambda count, word: count + word[0].isupper(), text.split()[:10], 0) >= 5
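
The one-liner assumes a variable `text` already exists. A minimal sketch that wraps it in a function so it can be run as is; the name `count_leading_caps` and the `limit` parameter are hypothetical, added here for illustration:

```python
from functools import reduce

def count_leading_caps(text, limit=10):
    # Hypothetical wrapper around the reduce expression above.
    # Each bool from isupper() is added to the running count as 0 or 1.
    return reduce(
        lambda count, word: count + word[0].isupper(),
        text.split()[:limit],
        0,
    )

match = "Lets Search For Water somewhere Because I am thirsty and i really am , wishing for a desert rain"
no_match = "fully lowercase or maybe One UPPERCASE but there are actually two uppercase letters that are preceded by a space."

print(count_leading_caps(match) >= 5)     # True  (6 capitalized words)
print(count_leading_caps(no_match) >= 5)  # False (2 capitalized words)
```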

Regex to match 5 or more capital letters in the first 10 words, Python
