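I'd like to know whether there is a simpler alternative (for example, a single function call) for the match-and-replace in the following example: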
>>> import re
>>>
>>> line = 'file:///windows-d/academic%20discipline/study%20objects/areas/formal%20systems/math'
>>>
>>> match = re.match(r'^file://(.*)$', line)
>>> if match and match.group(1):
...     substitution = re.sub(r'%20', r' ', match.group(1))
...
>>> substitution
'/windows-d/academic discipline/study objects/areas/formal systems/math'
Thanks.
Answer 0 (score: 5)
I'm going to sidestep your regex question and suggest you use something else:
>>> line = 'file:///windows-d/academic%20discipline/study%20objects/areas/formal%20systems/math'
>>> import urllib  # Python 2; in Python 3, unquote lives in urllib.parse
>>> urllib.unquote(line)
'file:///windows-d/academic discipline/study objects/areas/formal systems/math'
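Then strip the leading file:// with a slice, or with str.replace if necessary.
%20 (a space) is not the only escape sequence that can appear here, so it is better to use the right tool for the job than to have your regex solution break later when another character needs unescaping.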
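For readers on Python 3, the same approach can be sketched as follows. This assumes Python 3, where unquote is imported from urllib.parse; the helper name file_url_to_path is hypothetical and not part of the original answer.

```python
from urllib.parse import unquote


def file_url_to_path(url, prefix='file://'):
    """Hypothetical helper: drop a leading file:// and decode percent-escapes."""
    # Slice the prefix off only when it is actually at the start of the string.
    if url.startswith(prefix):
        url = url[len(prefix):]
    # unquote decodes every %XX escape, not just %20.
    return unquote(url)


line = 'file:///windows-d/academic%20discipline/study%20objects/areas/formal%20systems/math'
path = file_url_to_path(line)
print(path)
```

Because unquote handles any percent-escape, an input such as 'a%2Bb' would come back as 'a+b' without any change to the code.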
Answer 1 (score: 2)
You could try the following simple Python code:
>>> import re
>>> line = 'file:///windows-d/academic%20discipline/study%20objects/areas/formal%20systems/math'
>>> m = re.sub(r'%20|file://', r' ', line).strip()
>>> m
'/windows-d/academic discipline/study objects/areas/formal systems/math'
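The call re.sub(r'%20|file://', r' ', line).strip() replaces every occurrence of %20 or file:// with a space. strip() then removes all leading and trailing whitespace, including the space left behind where file:// used to be.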
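To make the two steps visible, here is a small sketch that keeps the intermediate result. The variable names replaced and stripped are only illustrative.

```python
import re

line = 'file:///windows-d/academic%20discipline/study%20objects/areas/formal%20systems/math'

# Step 1: both alternatives are replaced by a single space,
# so the file:// prefix turns into a leading space.
replaced = re.sub(r'%20|file://', ' ', line)

# Step 2: strip() removes that leading space (and any trailing whitespace).
stripped = replaced.strip()

print(repr(replaced))
print(repr(stripped))
```

One thing to keep in mind with this approach: strip() would also remove a space that came from a %20 at the very end of the path, and the pattern matches file:// anywhere in the string, not only at the start.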
Answer 2 (score: 2)
>>> import re
>>> s = 'file:///windows-d/academic%20discipline/study%20objects/areas/formal%20systems/math'
>>> re.sub(r'^file://(.*)$', lambda m: m.group(1).replace('%20', ' '), s)
'/windows-d/academic discipline/study objects/areas/formal systems/math'
>>> s = 'file:///windows-d/academic%20discipline/study%20objects/areas/formal%20systems/math'
>>> s.replace('file://', '').replace('%20', ' ')
'/windows-d/academic discipline/study objects/areas/formal systems/math'