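Chapter 7, Automate the Boring Stuff with Python, practice project: Regex version of strip()

Asked: 2016-01-22 19:02:24

Tags: python regex

I am reading the book "Automate the Boring Stuff with Python". For the Chapter 7 practice project, "Regex Version of strip()", here is my code (I am using Python 3.x):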

def stripRegex(x,string):
    import re
    if x == '':
        spaceLeft = re.compile(r'^\s+')
        stringLeft = spaceLeft.sub('',string)
        spaceRight = re.compile(r'\s+$')
        stringRight = spaceRight.sub('',string)
        stringBoth = spaceRight.sub('',stringLeft)
        print(stringLeft)
        print(stringRight)

    else:
        charLeft = re.compile(r'^(%s)+'%x)
        stringLeft = charLeft.sub('',string)
        charRight = re.compile(r'(%s)+$'%x)
        stringBoth = charRight.sub('',stringLeft)
    print(stringBoth)

x1 = ''
x2 = 'Spam'
x3 = 'pSam'
string1 = '      Hello world!!!   '
string2 = 'SpamSpamBaconSpamEggsSpamSpam'
stripRegex(x1,string1)
stripRegex(x2,string2)
stripRegex(x3,string2)

Here is the output:

Hello world!!!   
      Hello world!!!
Hello world!!!
BaconSpamEggs
SpamSpamBaconSpamEggsSpamSpam

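So my regex version of strip() works almost like the original. With the original strip(), the output is always "BaconSpamEggs" no matter whether you pass in 'Spam', 'pSam', 'mapS', 'Smpa', and so on. How can I fix this in the regex version?

13 Answers:

Answer 0 (score: 1)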

import re

def regexStrip(x,y=''):
    if y!='':
        # Capture everything between the leading and trailing runs of the characters in y
        yJoin=r'['+y+']*([^'+y+'].*[^'+y+'])['+y+']*'
        cRegex=re.compile(yJoin,re.DOTALL)
        return cRegex.sub(r'\1',x)
    else:
        sRegex=re.compile(r'\s*([^\s].*[^\s])\s*',re.DOTALL)
        return sRegex.sub(r'\1',x)

text='  spmaHellow worldspam'
print(regexStrip(text,'spma'))

Answer 1 (score: 0)

You can match any of several characters in a regex like this:

import re

charLeft = re.compile(r'^([%s]+)' % 'abc')
print(charLeft.sub('',"aaabcfdsfsabca"))
# fdsfsabca

Or, even better, do it in a single regex:

def strip_custom(x, text):
    return re.search(' *[{s}]*(.*?)[{s}]* *$'.format(s=x), text).group(1)

print(strip_custom('abc', ' aaabtestbcaa '))
# test
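
To see why the character class matters, the following minimal sketch contrasts the grouped pattern used in the question with a character class. The sample string comes from the question; the variable names `grouped` and `char_class` are only illustrative.

```python
import re

s = 'SpamSpamBaconSpamEggsSpamSpam'

# (pSam)+ matches the literal sequence "pSam" repeated; the string does not start with it
grouped = re.compile(r'^(pSam)+')
# [pSam]+ matches any run of the characters p, S, a, m, in any order
char_class = re.compile(r'^[pSam]+')

print(grouped.sub('', s))     # SpamSpamBaconSpamEggsSpamSpam
print(char_class.sub('', s))  # BaconSpamEggsSpamSpam
```

The grouped pattern only strips when the exact sequence appears, which is the behavior the question ran into with 'pSam'.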

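Answer 2 (score: 0)

I changed the order of the arguments, but from my quick testing this seems to work. I gave it an optional argument that defaults to None.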

def stripRegex(s,toStrip=None):
    import re
    if toStrip is None:
        toStrip = r'\s'
    return re.sub(r'^[{0}]+|[{0}]+$'.format(toStrip), '', s)
x1 = ''
x2 = 'Spam'
x3 = 'pSam'
string1 = '      Hello world!!!   '
string2 = 'SpamSpamBaconSpamEggsSpamSpam'

print(stripRegex(string1)) # 'Hello world!!!'
print(stripRegex(string1, x1)) # '      Hello world!!!   '
print(stripRegex(string2, x2)) # 'BaconSpamEggs'
print(stripRegex(string2, x3)) # 'BaconSpamEggs'
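
One caveat when a character class is built by string formatting: characters such as `]`, `^`, `-` and `\` have a special meaning inside `[...]`. A possible way to handle that, sketched here under the assumption that the caller passes plain characters rather than a regex, is to run them through `re.escape`. The name `strip_escaped` is made up for this illustration, and the results are compared against the built-in `str.strip`.

```python
import re

def strip_escaped(s, chars=None):
    """Hypothetical helper: strip `chars` (or whitespace) from both ends of `s`."""
    if chars is None:
        char_class = r'\s'
    elif chars == '':
        return s  # str.strip('') removes nothing, and '[]' is not a valid empty class
    else:
        char_class = re.escape(chars)  # neutralise ], ^, - and \ inside the class
    # \A and \Z anchor to the very start and end of the string
    pattern = r'\A[{0}]+|[{0}]+\Z'.format(char_class)
    return re.sub(pattern, '', s)

cases = [
    ('      Hello world!!!   ', None),
    ('SpamSpamBaconSpamEggsSpamSpam', 'pSam'),
    ('[-^x^-]', '[]^-'),
]
for s, chars in cases:
    result = strip_escaped(s, chars)
    assert result == s.strip(chars)
    print(repr(result))
```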

Answer 3 (score: 0)

I wrote two different versions. First approach:

import re
def stripfn(string, c):
    if c != '':
        Regex = re.compile(r'^['+ c +']*|['+ c +']*$')
        strippedString = Regex.sub('', string)
        print(strippedString)
    else:
        blankRegex = re.compile(r'^(\s)*|(\s)*$')
        strippedString = blankRegex.sub('', string)
        print(strippedString)

Second approach:

import re
def stripfn(string, c):
    if c != '':
        startRegex = re.compile(r'^['+c+']*')
        endRegex = re.compile(r'['+c+']*$')
        startstrippedString = startRegex.sub('', string)
        endstrippedString = endRegex.sub('', startstrippedString)
        print(endstrippedString)
    else:
        blankRegex = re.compile(r'^(\s)*|(\s)*$')
        strippedString = blankRegex.sub('', string)
        print(strippedString)

Answer 4 (score: 0)

This seems to work:

def stripp(text, leftright=None):
    import re
    if leftright is None:
        stripRegex = re.compile(r'^\s*|\s*$')
        text = stripRegex.sub('', text)
        print(text)
    else:
        # Matches the first and the last character of the text
        stripRegex = re.compile(r'^.|.$')
        margins = stripRegex.findall(text)
        # The "margins and" guard stops the loop once the text is empty
        while margins and margins[0] in leftright:
            text = text[1:]
            margins = stripRegex.findall(text)
        while margins and margins[-1] in leftright:
            text = text[:-1]  # drop exactly one trailing character
            margins = stripRegex.findall(text)
        print(text)

mo = '    @@@@@@     '
mow = '@&&@#$texttexttext&&^&&&&%%'
bla = '@&#$^%+'

stripp(mo)
stripp(mow, bla)

Answer 5 (score: 0)

Here is my version:

#!/usr/bin/env python3

import re

def strippp(txt,arg=''): # assigning a default value to arg prevents the error if no argument is passed when calling strippp()
    if arg =='':
        regex1 = re.compile(r'^(\s+)')
        mo = regex1.sub('', txt)
        regex2 = re.compile(r'(\s+)$')
        mo = regex2.sub('', mo)
        print(mo)
    else:
        regex1 = re.compile(arg)
        mo = regex1.sub('', txt)
        print(mo)

text = '        So, you can create the illusion of smooth motion        '
strippp(text, 'e')
strippp(text)

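Answer 6 (score: 0)

@rtemperv's solution misses the case where the string starts or ends with whitespace, but whitespace is not among the characters to remove.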

>>> var="     foobar"
>>> var.strip('raf')
'     foob'

So the regex should be slightly different:

def strip_custom(x, text):
    return re.search('^[{s}]*(.*?)[{s}]*$'.format(s=x), text).group(1)
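
A search-based pattern like this relies on `.`, which by default does not match a newline, so a multi-line string may produce no match at all. The sketch below is one possible variant, not part of the answer: it assumes a non-empty `chars` argument, uses `re.DOTALL` so that `.` crosses newlines, and anchors with `\A` and `\Z` so that a trailing newline is kept. The name `strip_custom_multiline` is hypothetical.

```python
import re

def strip_custom_multiline(chars, text):
    # Hypothetical variant; assumes `chars` is a non-empty string of plain characters
    pattern = r'\A[{s}]*(.*?)[{s}]*\Z'.format(s=re.escape(chars))
    # re.DOTALL lets the lazy group span line breaks
    return re.search(pattern, text, re.DOTALL).group(1)

print(repr(strip_custom_multiline('x', 'xxline1\nline2xx')))  # 'line1\nline2'
print(repr(strip_custom_multiline('x', 'xabc\n')))            # 'abc\n'
```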

Answer 7 (score: 0)

See the code below:

from re import *
check = '1'
while(check == '1'):
    string = input('Enter the string: ')
    strToStrip = input('Enter the string to strip: ')
    if strToStrip == '':                              #If the string to strip is empty
        exp = compile(r'^[\s]*')                      #Looks for all kinds of spaces in beginning until anything other than that is found
        string = exp.sub('',string)                   #Replaces that with empty string
        exp = compile(r'[\s]*$')                      #Looks for all kinds of spaces in the end until anything other than that is found
        string = exp.sub('',string)                   #Replaces that with empty string
        print('Your Stripped string is \'', end = '')
        print(string, end = '')
        print('\'')
    else:
        exp = compile(r'^[%s]*'%strToStrip)           #Finds all instances of the characters in strToStrip in the beginning until anything other than that is found
        string = exp.sub('',string)                   #Replaces it with empty string
        exp = compile(r'[%s]*$'%strToStrip)           #Finds all instances of the characters in strToStrip in the end until anything other than that is found
        string = exp.sub('',string)                   #Replaces it with empty string
        print('Your Stripped string is \'', end = '')
        print(string, end = '')
        print('\'')
    print('Do you want to continue (1\\0): ', end = '')
    check = input()

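Explanation:

  • The character class [] is used to match individual occurrences of the characters in the string to strip.

  • ^ checks whether the characters to strip occur at the beginning.

  • $ checks whether the characters to strip occur at the end.

  • If they are found, sub() replaces them with an empty string.

  • * matches as many of the characters to strip as possible, until some other character is found.

  • * matches zero occurrences if no instance is found, or as many instances as there are.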
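
The points above can be tried out on a small sample. This is only an illustrative sketch: the sample text and the characters to strip are made up, and the variable names are not taken from the answer.

```python
import re

chars = 'ab'
text = 'abbaHello, abba!baab'

leading = re.compile(r'^[%s]*' % chars)   # run of strip characters at the beginning
trailing = re.compile(r'[%s]*$' % chars)  # run of strip characters at the end

print(leading.match(text).group())            # abba
print(trailing.search(text).group())          # baab
print(repr(leading.match('Hello').group()))   # '' because * also accepts zero characters
print(trailing.sub('', leading.sub('', text)))  # Hello, abba!
```

Note that the "abba" in the middle of the text is left alone, because neither anchored pattern can reach it.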

Answer 8 (score: 0)

#! python
# Regex Version of Strip()
import re
def RegexStrip(mainString,charsToBeRemoved=None):
    if(charsToBeRemoved!=None):
        regex=re.compile(r'[%s]'%charsToBeRemoved)#Interesting TO NOTE
        return regex.sub('',mainString)
    else:
        regex=re.compile(r'^\s+')
        regex1=re.compile(r'\s+$')
        newString=regex1.sub('',mainString)
        newString=regex.sub('',newString)
        return newString

Str='   hello3123my43name is antony    '
print(RegexStrip(Str))

I think this is fairly comfortable code, and I found that the caret (^) and the dollar sign ($) really do work.

Answer 9 (score: 0)

import re
def strips(arg, string):
    beginning = re.compile(r"^[{}]+".format(arg))        
    strip_beginning = beginning.sub("", string)
    ending = re.compile(r"[{}]+$".format(arg))
    strip_ending = ending.sub("", strip_beginning)
    return strip_ending

The strips function removes whatever "arg" refers to, regardless of whether it actually occurs in the string.

Answer 10 (score: 0)

I believe this regex may be easier to understand:

import re

strip_reg = re.compile(r"\s*(.*?)\s*$")
strip_reg.search(mystring).group(1)  # mystring is the string to strip

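How does it work? Let's go backwards. With "\s*$" we look for zero or more whitespace characters at the end of the string.

".*?" is a special case where you ask the regex to look for the minimum number of characters to match. (Most of the time, a regex will try to get the most.) We capture this.

Before the group we capture, we try to match zero or more whitespace characters.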
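
The difference between the lazy ".*?" and the usual greedy ".*" can be seen in a short comparison. This is a minimal sketch with a made-up sample string; only the lazy pattern is the one from the answer.

```python
import re

s = '   Hello world   '

# Lazy group: stops as soon as the rest of the pattern (\s*$) can match
lazy = re.search(r'\s*(.*?)\s*$', s).group(1)
# Greedy group: swallows the trailing spaces before \s* gets a chance
greedy = re.search(r'\s*(.*)\s*$', s).group(1)

print(repr(lazy))    # 'Hello world'
print(repr(greedy))  # 'Hello world   '
```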

Answer 11 (score: 0)

My solution:

import re

text = """
 Write a function that takes a string and does the same thing as the strip() 
string method. If no other arguments are passed other than the string to 
strip, then whitespace characters will be removed from the beginning and 
end of the string. Otherwise, the characters specified in the second argu -
ment to the function will be removed from the string. 
"""

def regexStrip(text, charsToStrip=''):
    if not charsToStrip:
        strip = re.sub(r'^\s+|\s+$', '', text)
    else:
        strip = re.sub(charsToStrip, '', text)
    return strip

while True:
    arg2 = input('Characters to strip: ')
    print(regexStrip(text, arg2))

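Answer 12 (score: -1)

Here is my attempt to apply the lessons I learned from "Clean Code" by R.C. Martin and "Automate the Boring Stuff" by Al Sweigart. One of the rules of clean code is to write small functions that do one thing.

import re

def removeSpacesAndSecondString(text):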
    print(text)
    stripSecondStringRegex = re.compile(r'((\w+)\s(\w+)?)')
    for groups in stripSecondStringRegex.findall(text):
        newText = groups[1]
    print(newText)

def removeSpaces(text):
    print(text)
    stripSpaceRegex = re.compile(r'\s')
    mo = stripSpaceRegex.sub('', text)
    print(mo)

text = '"  hjjkhk  "'

if len(text.split()) > 1:
    removeSpacesAndSecondString(text)
else:
    removeSpaces(text)