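Finding a phrase in a string

Time: 2019-01-09 16:46:58

Tags: python string

I am trying to check whether the phrase "purple cow" occurs in a string. There must be at least one space or punctuation mark between "purple" and "cow"; "purplecow" is not acceptable. I tried the following program, but I get an error message.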

import string

def findPC(string):

    strLower = string.lower()

    # remove 'purplecow' in strLower
    strLowerB = strLower.replace('purplecow', '')
    print(strLowerB)

    strList = list(strLowerB)
    print(strList)

    # remove punctuation in strLowerB
    punct = string.punctuation()
    for char in strList:
        if char in punct:
            strList.replace(char, '')

    # remove spaces in strLowerB
    strLower.replace(' ', '')
    print(strLower)

    # look for 'purplecow' in strLowerB
    return 'purplecow' in string


print(findPC('The purple cow is soft and cuddly. purplecow. Purple^&*(^&$cow.'))

Error message:

Traceback (most recent call last):
  File "C:/Python36/findPC.py", line 28, in <module>
    print(findPC('The purple cow is soft and cuddly. purplecow. Purple^&*(^&$cow.'))
  File "C:/Python36/findPC.py", line 15, in findPC
    punct = string.punctuation()
AttributeError: 'str' object has no attribute 'punctuation'

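4 answers:

Answer 0 (score: 2)

The error comes from using the name string with two different meanings: it is both the imported module and the function parameter, so inside the function string refers to the str argument and has no punctuation attribute. Note also that string.punctuation is a constant, not a function, so it is used without parentheses. I have edited your code a little so that it works as intended.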

import string

def findPC(input_string):

    strLower = input_string.lower()

    # remove 'purplecow' in strLower
    strLowerB = strLower.replace('purplecow', '')
    print(strLowerB)

    # remove punctuation in strLowerB
    punct = string.punctuation
    for char in punct:
        strLowerB = strLowerB.replace(char, '')

    # remove spaces in strLowerB (str.replace returns a new string, so assign it back)
    strLowerB = strLowerB.replace(' ', '')
    print(strLowerB)

    # look for 'purplecow' in strLowerB
    return 'purplecow' in strLowerB


print(findPC('The purple cow is soft and cuddly. purplecow. Purple^&*(^&$cow.'))
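
To make the name clash described in this answer concrete, here is a minimal sketch. The function names shadowed and not_shadowed are hypothetical and exist only for illustration; they are not part of the original code.

```python
import string


def shadowed(string):
    # Inside this function, `string` is the str argument, not the module.
    return hasattr(string, "punctuation")


def not_shadowed(text):
    # The parameter has a different name, so `string` still means the module.
    return hasattr(string, "punctuation")


print(shadowed("purple cow"))      # False
print(not_shadowed("purple cow"))  # True
```

Renaming the parameter (input_string in the code above) is enough to make the module reachable again.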

Answer 1 (score: 1)

Use a regular expression:

import re

# "At least one space or punctuation mark" depends on what counts as punctuation.
# I've included comma and hyphen; you can extend the list.
r = r'purple[\s\,\-]+cow'
s = 'The purple cow is soft and cuddly. purplecow.Purple^&*(^&$cow.'

print('Found' if re.search(r, s) else 'Not found')
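
One possible way to extend the list, sketched under the assumption that "punctuation" means every character in string.punctuation and that matching should ignore case. The names SEPARATORS, PURPLE_COW and contains_purple_cow are made up for this example.

```python
import re
import string

# Assumption: any whitespace or any character in string.punctuation is a valid separator.
# re.escape keeps characters such as ] ^ \ - literal inside the character class.
SEPARATORS = r"\s" + re.escape(string.punctuation)
PURPLE_COW = re.compile(r"purple[" + SEPARATORS + r"]+cow", re.IGNORECASE)


def contains_purple_cow(text):
    """Return True if 'purple' and 'cow' are separated by one or more separators."""
    return PURPLE_COW.search(text) is not None


print(contains_purple_cow("Purple^&*(^&$cow."))  # True
print(contains_purple_cow("purplecow"))          # False
```

Building the class from string.punctuation avoids maintaining the list by hand, at the cost of accepting separators you may not want.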

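Answer 2 (score: 1)

If you can use regular expressions, you can do this with a regex of the form purple[ .,\/#!$%\^&\*;:{}=\-_`~()]+cow, which matches what you want.

Note: the characters inside the square brackets are the ones you are treating as "punctuation". The + means you match one or more consecutive characters from that bracketed set.

In Python this is implemented like so: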

import re
re.search(r"purple[ .,\/#!$%\^&\*;:{}=\-_`~()]+cow", string)

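re.search(pattern, string) gives you a re.Match object with more information about the match (or None if there is no match), but if you only want a true/false value indicating whether the regex matched, you can do it like this:

matched = re.search(pattern, string) is not None

This means you can implement your code like this: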

import re
def findPC(s):
    return re.search(r"purple[ .,\/#!$%\^&\*;:{}=\-_`~()]+cow", s) is not None

You can test regular expressions such as this one on sites like https://regexr.com/463uk.

Edit: improved the regular expression
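
The same kind of check can also be run locally. This is a rough sketch that uses the pattern from this answer unchanged; the helper name matches and the sample strings are assumptions chosen for illustration. Note that the pattern itself is case-sensitive, so the sketch lowercases the input first.

```python
import re

# Pattern copied from the answer; only the listed characters count as separators.
PATTERN = re.compile(r"purple[ .,\/#!$%\^&\*;:{}=\-_`~()]+cow")


def matches(text):
    # Lowercase first because the pattern only contains lowercase letters.
    return PATTERN.search(text.lower()) is not None


samples = [
    "The purple cow is soft and cuddly.",
    "purplecow.",
    "purple-cow",
    "Purple^&*(^&$cow.",
]

for text in samples:
    print(f"{text!r}: {matches(text)}")
```

Only "purplecow." should be rejected here; a separator that is not in the bracketed set, such as a tab, would also be rejected.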

Answer 3 (score: 1)

How about using one regular expression to replace punctuation with spaces, and then another to collapse the extra whitespace:

import re
string = re.sub(r"[.!?\-,]", " ", string)
string = re.sub(r"\s+", " ", string)

Then you can simply use "in":

"purple cow" in string

So the final function becomes:

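import re

def has_purple_cow(string):
    # replace the listed punctuation marks with spaces
    string = re.sub(r"[.!?\-,]", " ", string)
    # collapse runs of whitespace into a single space
    string = re.sub(r"\s+", " ", string)
    return "purple cow" in string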
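
A quick usage sketch of this approach, using a copy of the function with the parameter renamed to text (a choice made here so the name does not collide with the string module). The sample inputs are illustrative, not from the answer.

```python
import re


def has_purple_cow(text):
    # Copy of the answer's approach; `text` replaces the parameter name `string`.
    text = re.sub(r"[.!?\-,]", " ", text)  # listed punctuation -> space
    text = re.sub(r"\s+", " ", text)       # collapse whitespace runs
    return "purple cow" in text


print(has_purple_cow("The purple cow is soft and cuddly."))  # True
print(has_purple_cow("purple - cow"))                        # True
print(has_purple_cow("purplecow."))                          # False
print(has_purple_cow("Purple^&*(^&$cow."))                   # False
```

The last case is False because the check is case-sensitive and the characters ^&*($ are not in the replaced set; lowercasing the input and widening the character class would likely be needed to cover the sample from the question.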