Question

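I'm working on the "Regex Version of strip()" practice problem from Chapter 7 of Automate the Boring Stuff. I've seen '+char+' used to pull a function argument directly into the regex compile call, but I don't understand how this format works.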

import re

def pseudoStrip(inputString, char=r'\s'):
    # '+' follows char on both sides so that runs at either end are removed
    stripRegex = re.compile(r'^'+char+'+|'+char+'+$')
    print(stripRegex.sub('', inputString))

Is '+char+' the same as ['+char+']?

Is there a more readable or Pythonic way to do this?

Answer 1

Paste your regular expression here and it will tell you what the expression does: https://regex101.com/#python
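
To check the same thing locally, here is a minimal sketch. The helper name build_patterns and the sample text are made up for illustration. It prints the strings that the concatenation produces, which suggests the two forms are not equivalent once the argument is longer than one character: without brackets the argument is spliced in as a literal sequence, while with brackets it becomes a character class.

```python
import re

def build_patterns(chars):
    """Hypothetical helper: build both pattern strings by plain concatenation."""
    plain = '^' + chars + '+|' + chars + '+$'              # argument spliced in as-is
    char_class = '^[' + chars + ']+|[' + chars + ']+$'     # argument wrapped in [...]
    return plain, char_class

plain, char_class = build_patterns('ab')
print(plain)       # ^ab+|ab+$       -> 'a' followed by one or more 'b'
print(char_class)  # ^[ab]+|[ab]+$   -> one or more of 'a' or 'b'

text = 'baabSPAMabba'
print(re.sub(plain, '', text))       # baabSPAMabba (no match at either end)
print(re.sub(char_class, '', text))  # SPAM
```

So the concatenation is ordinary string building that happens before re.compile() ever sees the pattern; the brackets are what change the meaning.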

Answer 2

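This is probably the least Pythonic way to solve the problem, but it only uses what the book has taught up to this point. It works for every character except \ (backslash).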

import re

# Global regex that groups the initial string into left white space, content, and right white space
spaceStripRegex = re.compile(r'''
    ^(\s)*  # Left white space, if any
    (.*?)   # String content
    (\s)*$  # Right white space, if any
    ''', re.VERBOSE)


def regexStrip(string, characters):
    # Use the global regex to separate the content from the bounding white space
    mo = spaceStripRegex.search(string)
    string = mo.group(2)

    # Wrap the other characters to be stripped in a character class
    characters = '[' + characters + ']'

    # The pattern is built by string concatenation rather than as a single raw string
    characterStripRegex = re.compile(
        '^' + characters + '*' +  # Zero or more of the characters to strip on the left
        r'(.*?)' +                # Unstripped content, the only group; nongreedy so it excludes right-side characters
        characters + '*$')        # Zero or more of the characters to strip on the right
    mo = characterStripRegex.search(string)
    string = mo.group(1)
    print(string)


string = '  **SpamSpamBaconSpamEggsSpamSpam   '
characters = 'ampS*'

regexStrip(string, characters)
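
On the backslash limitation: one possible way around it, assuming re.escape() is acceptable even though it may go beyond what the book has covered by Chapter 7, is to escape the argument before putting it inside the brackets. That should also cover other characters that are special inside a character class, such as ^, ] and -. The name regex_strip and the sample inputs are made up for illustration; treat this as a sketch rather than a drop-in replacement for str.strip(), since edge cases such as trailing newlines may behave differently.

```python
import re

def regex_strip(text, chars=None):
    """Hypothetical strip() look-alike built on re.sub()."""
    if chars is None:
        char_class = r'\s'                         # default: strip white space
    elif chars == '':
        return text                                # nothing to strip; '[]' would not compile
    else:
        char_class = '[' + re.escape(chars) + ']'  # escape \, ^, ], - and friends
    pattern = '^' + char_class + '+|' + char_class + '+$'
    return re.sub(pattern, '', text)

print(regex_strip('  **SpamSpamBaconSpamEggsSpamSpam   '))       # **SpamSpamBaconSpamEggsSpamSpam
print(regex_strip('**SpamSpamBaconSpamEggsSpamSpam', 'ampS*'))   # BaconSpamEggs
print(regex_strip('\\^-]path]-^\\', '\\^-]'))                    # path
```

Returning the result instead of printing it keeps the function usable inside other expressions, which is a design choice of this sketch and not something the answer above requires.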

Regex version of strip(): using a function argument directly in the regex compile call