Question

Is it possible to remove special characters using a regular expression?

I'm trying to trim:

\n\t\t\t\t\t\t\t\t\t\tButte County High School\t\t\t\t\t\t\t\t\t

down to:

Butte County High School

using

regexform = re.sub("[A-Z]+[a-z]+\s*",'', schoolstring)
print regexform

Answer 1

You don't need a regex for this simple task. Use str.strip() instead. For example:

>>> my_string = '\t\t\t\t\t\t\t\t\t\tButte County High School\t\t\t\t\t\t\t\t\t'
>>> my_string.strip()
'Butte County High School'

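If you must use a regex, the expression would be:

>>> re.sub(r'[^A-Za-z0-9]\s+', '', my_string)
'Butte County High School'

This matches any character that is not a letter or digit, followed by one or more whitespace characters. It works here because the tabs come in runs of two or more, while each single space between the words is followed by a letter and so is left alone.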
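The pattern above depends on the shape of the input, so it may be worth checking it against other strings before relying on it. The following is a minimal sketch with made-up inputs (the names PATTERN, single_padded and double_spaced are only for illustration):

```python
import re

# Pattern from the answer above: one non-alphanumeric character
# followed by one or more whitespace characters.
PATTERN = r'[^A-Za-z0-9]\s+'

# A single leading or trailing space is not followed by more whitespace,
# so the pattern does not match and the padding stays.
single_padded = re.sub(PATTERN, '', ' Butte County ')

# Two spaces between words do match, so the words get joined together.
double_spaced = re.sub(PATTERN, '', 'Butte  County')

print(repr(single_padded))
print(repr(double_spaced))
```

In both cases str.strip() would give the expected result, which is one more reason to prefer it here.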

Answer 2

If you're really set on using a regex:

re.sub(r'^\s+|\s+$', '', schoolstring)

This works for:

'   this is a test   '   # multiple leading and trailing spaces
' this is a test '       # one leading and trailing space
'this is a test'         # no leading or trailing spaces
'\t\tthis is a test\t\t' # leading or trailing whitespace characters

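The expression matches one or more whitespace characters at the start of the string (^\s+), or (|) one or more whitespace characters at the end of the string (\s+$).

However, str.strip() is the simpler way to remove leading and trailing whitespace.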
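As a quick check, the regex can be run against the four sample strings listed above and compared with str.strip(). This is only a sketch; trim_with_regex is a hypothetical helper name, not something from the original answer:

```python
import re

def trim_with_regex(text):
    # Hypothetical helper: remove whitespace anchored at the start (^\s+)
    # or at the end (\s+$) of the string.
    return re.sub(r'^\s+|\s+$', '', text)

samples = [
    '   this is a test   ',    # multiple leading and trailing spaces
    ' this is a test ',        # one leading and trailing space
    'this is a test',          # no leading or trailing spaces
    '\t\tthis is a test\t\t',  # leading and trailing tabs
]

results = [trim_with_regex(s) for s in samples]
for original, trimmed in zip(samples, results):
    # Show the input, the regex result, and whether strip() agrees.
    print(repr(original), '->', repr(trimmed), trimmed == original.strip())
```

For these inputs both approaches produce the same string; the comparison is limited to the samples shown and is not a claim about every possible input.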

Answer 3

Unless you have a reason to use a regex, you can use Python's .strip() method to remove all whitespace at both ends of the string.
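Applied to the string from the question, that might look like the sketch below. The variable names are illustrative, and lstrip() and rstrip() are shown only as the one-sided variants of the same method:

```python
# The input from the question, including the leading newline.
school = '\n\t\t\t\t\t\t\t\t\t\tButte County High School\t\t\t\t\t\t\t\t\t'

both = school.strip()         # removes whitespace at both ends
left_only = school.lstrip()   # removes leading whitespace only
right_only = school.rstrip()  # removes trailing whitespace only

print(repr(both))
print(repr(left_only))
print(repr(right_only))
```

The single spaces between the words are untouched in all three cases, since these methods only look at the ends of the string.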

Python regex: trimming special characters