
Question

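I'm working on Python Challenge #3. I have a large block of text that I have to sort through. I'm trying to find a sequence in which the first three and the last three letters are uppercase and the middle one is lowercase.

My function loops through the text. The variable block stores the seven letters for the current iteration. A variable toPrint is switched on or off depending on whether the letters in block match my pattern (AAAaAAA). Judging by the last block my function prints, the loop stops early in my text. I don't know why this happens, and it would be great if you could help me figure it out.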

text = """kAewtloYgcFQaJNhHVGxXDiQmzjfcpYbzxlWrVcqsmUbCunkfxZWDZjUZMiGqhRRiUvGmYmvnJ"""
words = []
for i in text:
    toPrint = True
    block = text[text.index(i):text.index(i)+7]
    for b in block[:3]:
        if b.isupper() == False:
            toPrint = False
    for b in block[3]:
        if b.islower() == False:
            toPrint = False
    for b in block[4:]:
        if b.isupper() == False:
            toPrint = False
    if toPrint == True and block not in words:
        words.append(block)
print (block)
print (words)

Answer 1

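Using regular expressions:

This is a good time to use a regular expression: it is very fast, clearer, and doesn't require a bunch of nested if statements.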

import re
text = """kAewtloYgcFQaJNhHVGxXDiQmzjfcpYbzxlWrVcqsmUbCunkfxZWDZjUZMiGqhRRiUvGmYmvnJ"""
print(re.search(r"[A-Z]{3}[a-z][A-Z]{3}", text).group(0))

Explanation of the regular expression:
[A-Z]{3} ---> matches any 3 uppercase letters
[a-z] -------> matches a single lowercase letter
[A-Z]{3} ---> matches 3 uppercase letters
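
One detail worth noting about the snippet above: re.search returns None when nothing matches, so calling .group(0) directly raises AttributeError on a text without a match. A minimal sketch of a more defensive version follows; the helper name find_first is hypothetical and not part of the original answer.

```python
import re

# Three uppercase letters, one lowercase letter, three uppercase letters.
PATTERN = re.compile(r"[A-Z]{3}[a-z][A-Z]{3}")

def find_first(text):
    """Return the first matching 7-character sequence, or None.

    find_first is a hypothetical helper name used only for illustration.
    """
    match = PATTERN.search(text)
    return match.group(0) if match else None

text = """kAewtloYgcFQaJNhHVGxXDiQmzjfcpYbzxlWrVcqsmUbCunkfxZWDZjUZMiGqhRRiUvGmYmvnJ"""
print(find_first(text))             # WDZjUZM
print(find_first("no match here"))  # None
```

Compiling the pattern once is optional here; it mainly keeps the pattern in one named place.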

Without regular expressions:

If you really don't want to use a regular expression, this is how you could do it:

text = """kAewtloYgcFQaJNhHVGxXDiQmzjfcpYbzxlWrVcqsmUbCunkfxZWDZjUZMiGqhRRiUvGmYmvnJ"""

for i, _ in enumerate(text[:-6]):  # loop over each start index (the last 6 cannot start a 7-char chunk)
    sevenCharacters = text[i:i+7]  # chunk of seven characters
    shouldBeCapital = sevenCharacters[0:3] + sevenCharacters[4:7]  # the six characters that should be uppercase, joined into one string

    if all(char.isupper() for char in shouldBeCapital):  # make sure all of those characters are indeed uppercase
        if sevenCharacters[3].islower():  # make sure the middle character is lowercase
            print(sevenCharacters)

Answer 2

If I understand your question correctly, then in my opinion no loop is needed. This simple code finds the required sequences.

# Use this code

text = """kAewtloYgcFQaJNhHVGxXDiQmzjfcpYbzxlWrVcqsmUbCunkfxZWDZjUZMiGqhRRiUvGmYmvnJ"""

import re

print(re.findall("[A-Z]{3}[a-z][A-Z]{3}", text))

Answer 3

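I think your first problem is that you are using str.index(). Like find(), a string's .index() method returns the index of the first match found.

So, in your example, whenever you search for 'x' you get the index of the first 'x' found, and so on. You cannot correctly handle any character that is not unique in the string: every occurrence of a repeated character after the first is mapped back to the first one.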
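
A small illustration of the behaviour described above, using a made-up word rather than the challenge text. It compares the positions reported by str.index() with the real positions produced by enumerate():

```python
word = "banana"

# str.index(ch) always reports the FIRST position of ch, so repeated
# characters are mapped back to an earlier index.
via_index = [word.index(ch) for ch in word]

# enumerate() yields the real position of every character.
via_enumerate = [i for i, _ in enumerate(word)]

print(via_index)      # [0, 1, 2, 1, 2, 1]
print(via_enumerate)  # [0, 1, 2, 3, 4, 5]
```

In the question's loop this means the same early blocks are examined again and again, and later parts of the text are never reached.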

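To keep the same structure (which isn't necessary; I prefer the posted answer that uses enumerate), I implemented a queueing approach using your block variable. On each iteration, one character is dropped from the front of block and the new character is appended to the end.

I also cleaned up some unnecessary comparisons with False. You'll find that this habit is not only inefficient but frequently wrong, because many of the "boolean" tests you perform will not be on actual boolean values. Get out of the habit of spelling out True/False. Just use if c or if not c.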
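
To illustrate why comparing against True can go wrong, here is a short sketch with a value that is truthy but is not a bool. The match object from re.search is used only as a convenient example:

```python
import re

match = re.search(r"[a-z]", "abc")  # a match object, not a bool

# A match object is truthy, yet it does not compare equal to True.
equals_true = (match == True)
is_truthy = bool(match)

print(equals_true)  # False
print(is_truthy)    # True

# The plain truth test behaves as intended.
if match:
    print("found:", match.group(0))  # found: a
```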

Here is the result:

text = """kAewtloYgcFQaJNhHVGxXDiQmzjfcpYbzxlWrVcqsmUbCunkfxZWDZjUZMiGqhRRiUvGmYmvnJ"""
words = []
block = '.' + text[0:6]
for i in text[6:]:
    block = block[1:] + i  # Drop 1st char, append 'i'
    toPrint = True
    for b in block[:3]:
        if not b.isupper():
            toPrint = False
    if not block[3].islower():
        toPrint = False
    for b in block[4:]:
        if not b.isupper():
            toPrint = False
    if toPrint and block not in words:
        words.append(block)
print (words)
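
The same queueing idea could also be sketched with collections.deque, which drops the oldest character automatically once maxlen is reached. This is an alternative technique, not the code from the answer above, and the function name matching_blocks is hypothetical:

```python
from collections import deque

def matching_blocks(text):
    """Collect the unique 7-character blocks shaped like AAAaAAA.

    matching_blocks is a hypothetical name used only for this sketch.
    """
    window = deque(maxlen=7)  # the oldest character falls off the front
    found = []
    for ch in text:
        window.append(ch)
        if len(window) < 7:
            continue  # not enough characters collected yet
        block = "".join(window)
        outer = block[:3] + block[4:]  # the six characters that must be uppercase
        if (all(c.isupper() for c in outer)
                and block[3].islower()
                and block not in found):
            found.append(block)
    return found

text = """kAewtloYgcFQaJNhHVGxXDiQmzjfcpYbzxlWrVcqsmUbCunkfxZWDZjUZMiGqhRRiUvGmYmvnJ"""
print(matching_blocks(text))  # ['WDZjUZM']
```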

Python Challenge #3: loop stops too early
