I had the idea of using a regular expression pattern as a template, and I wonder whether there is a convenient way to do this in Python (3 or newer).
import re
pattern = re.compile("/something/(?P<id>.*)")
pattern.populate(id=1) # that is what I'm looking for
should result in
/something/1
Answer 0 (score: 4)
That is not what regular expressions are for; you can just use ordinary string formatting.
>>> '/something/{id}'.format(id=1)
'/something/1'
Answer 1 (score: 1)
Hold off on compiling until after the substitution:
pattern = re.compile("/something/%s" % 1)  # fill in the value first, then compile
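If the substituted value can contain regex metacharacters, it probably needs escaping before the pattern is compiled. Below is a minimal sketch of that idea; `build_pattern` is a hypothetical helper name (not part of `re`), and it assumes the template text itself is already a valid regex.

```python
import re


def build_pattern(template, **values):
    """Fill a str.format-style template with escaped values, then compile it.

    Hypothetical helper: the template is treated as regex source,
    while the substituted values are matched literally.
    """
    # re.escape keeps characters such as '.' or '+' from acting as regex syntax.
    escaped = {name: re.escape(str(value)) for name, value in values.items()}
    return re.compile(template.format(**escaped))


pattern = build_pattern("/something/{id}", id="1.5")
print(bool(pattern.fullmatch("/something/1.5")))  # the literal value matches
print(bool(pattern.fullmatch("/something/1x5")))  # '.' is no longer a wildcard
```

Without the `re.escape` call, the `.` in `1.5` would match any character.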
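Answer 2 (score: 1)
Below is a lightweight class I wrote that does what you are looking for. You write a single regular expression and use it both for matching strings and for generating strings.
There is a small example at the bottom of the code showing how to use it.
In general, you construct the regular expression as usual and use the match and search functions normally. The format function works much like string.format and is used to generate a new string.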
import re

regex_type = type(re.compile(""))

# This is not perfect. It breaks if there is a parenthesis in the regex.
re_term = re.compile(r"(?<!\\)\(\?P\<(?P<name>[\w_\d]+)\>(?P<regex>[^\)]*)\)")


class BadFormatException(Exception):
    pass


class RegexTemplate(object):
    def __init__(self, r, *args, **kwargs):
        self.r = re.compile(r, *args, **kwargs)

    def __repr__(self):
        return "<RegexTemplate '%s'>" % self.r.pattern

    def match(self, *args, **kwargs):
        '''The regex match function'''
        return self.r.match(*args, **kwargs)

    def search(self, *args, **kwargs):
        '''The regex search function'''
        return self.r.search(*args, **kwargs)

    def format(self, **kwargs):
        '''Format this regular expression in a similar way as string.format.
        Only supports true keyword replacement, not group replacement.'''
        pattern = self.r.pattern

        def replace(m):
            name = m.group('name')
            reg = m.group('regex')
            val = kwargs[name]
            if not re.match(reg, val):
                raise BadFormatException("Template variable '%s' has a value "
                    "of %s, does not match regex %s." % (name, val, reg))
            return val

        # The regex sub function does most of the work
        value = re_term.sub(replace, pattern)

        # Now we have to un-escape the special characters.
        return re.sub(r"\\([.\(\)\[\]])", r"\1", value)


def compile(*args, **kwargs):
    return RegexTemplate(*args, **kwargs)


if __name__ == '__main__':
    # Construct a typical URL routing regular expression
    r = RegexTemplate(r"http://example\.com/(?P<year>\d\d\d\d)/(?P<title>\w+)")
    print(r)

    # This should match
    print(r.match("http://example.com/2015/article"))
    # Generate the same URL using url formatting.
    print(r.format(year="2015", title="article"))

    # This should not match
    print(r.match("http://example.com/abcd/article"))
    # This will raise an exception because year is not formatted properly
    try:
        print(r.format(year="15", title="article"))
    except BadFormatException as e:
        print(e)
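The comment on `re_term` says it breaks when the regex contains a parenthesis. As a rough illustration, the sketch below copies that regex and prints what it captures for a named group that contains a nested group; the sample pattern is only an example.

```python
import re

# Same group-finding regex as in the class above.
re_term = re.compile(r"(?<!\\)\(\?P\<(?P<name>[\w_\d]+)\>(?P<regex>[^\)]*)\)")

# A named group that itself contains a nested, optional group.
m = re_term.search(r"(?P<foo>biz(baz)?)")

# The match stops at the first ')', so the group is cut short.
print(m.group(0))        # text that would be replaced by the value
print(m.group("regex"))  # sub-pattern that would be used for validation
```

The captured sub-pattern ends up with an unbalanced parenthesis, so it cannot be compiled when `format` tries to validate a value against it.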
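There are a few limitations:
- The format function only works with keyword arguments; you cannot use \1 style formatting as in string.format.
- Named groups that contain nested groups are not handled correctly, e.g. RegexTemplate(r'(?P<foo>biz(baz)?)'). This could be corrected with a bit of work.
- If the regular expression contains character classes (e.g. [a-z123]), we will not know how to format those character classes.
Answer 3 (score: 1)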
For very simple cases, probably the easiest approach is to replace the named capture groups with format fields.
Here is a basic validator/formatter:
import re
from functools import partial

unescape = partial(re.compile(r'\\(.)').sub, r'\1')
namedgroup = partial(re.compile(r'\(\?P<(\w+)>.*?\)').sub, r'{\1}')


class Mould:
    def __init__(self, pattern):
        self.pattern = re.compile(pattern)
        self.template = unescape(namedgroup(pattern))

    def format(self, **values):
        try:
            return self.template.format(**values)
        except KeyError as e:
            raise TypeError(f'Missing argument: {e}') from None

    def search(self, string):
        try:
            return self.pattern.search(string).groupdict()
        except AttributeError:
            raise ValueError(string) from None
So, for example, to instantiate a validator/formatter for phone numbers of the form (XXX) YYY-ZZZZ:
template = r'\((?P<area>\d{3})\)\ (?P<prefix>\d{3})\-(?P<line>\d{4})'
phonenum = Mould(template)
Then:
>>> phonenum.search('(333) 444-5678')
{'area': '333', 'prefix': '444', 'line': '5678'}
>>> phonenum.format(area=111, prefix=555, line=444)
'(111) 555-444'
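But this is a very basic skeleton that ignores many regex features (such as lookarounds or non-capturing groups). If you need them, things get messy very quickly. In that case, go the other way: generate the pattern from the template. It is more verbose, but probably more flexible and less error-prone.
Here is the basic validator/formatter (.search() and .format() are the same as above):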
import string
import re

FMT = string.Formatter()


class Mould:
    def __init__(self, template, **kwargs):
        self.template = template
        self.pattern = self.make_pattern(template, **kwargs)

    @staticmethod
    def make_pattern(template, **kwargs):
        pattern = ''
        # for each field in the template, add to the pattern
        for text, field, *_ in FMT.parse(template):
            # the escaped preceding text
            pattern += re.escape(text)
            if field:
                # a named regex capture group
                pattern += f'(?P<{field}>{kwargs[field]})'
            # XXX: if there's text after the last field,
            # the parser will iterate one more time,
            # hence the 'if field'
        return re.compile(pattern)
Instantiation:
template = '({area}) {prefix}-{line}'
content = dict(area=r'\d{3}', prefix=r'\d{3}', line=r'\d{4}')
phonenum = Mould(template, **content)
Usage:
>>> phonenum.search('(333) 444-5678')
{'area': '333', 'prefix': '444', 'line': '5678'}
>>> phonenum.format(area=111, prefix=555, line=444)
'(111) 555-444'
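For reference, here is one way the template-driven version could be written out in full, with `search` and `format` included. This is a sketch rather than the answer's exact code: storing the sub-patterns in a `fields` attribute and validating values with `re.fullmatch` inside `format` are assumptions added for illustration.

```python
import re
import string


class Mould:
    """Build a regex from a str.format-style template plus per-field sub-patterns."""

    def __init__(self, template, **fields):
        self.template = template
        self.fields = fields  # assumption: keep sub-patterns for validation
        parts = []
        for text, field, _spec, _conv in string.Formatter().parse(template):
            parts.append(re.escape(text))  # literal text, escaped
            if field:  # field is None for trailing literal text
                parts.append(f"(?P<{field}>{fields[field]})")
        self.pattern = re.compile("".join(parts))

    def search(self, text):
        match = self.pattern.search(text)
        if match is None:
            raise ValueError(text)
        return match.groupdict()

    def format(self, **values):
        # Assumption: every value must fully match its field's sub-pattern.
        for name, value in values.items():
            if not re.fullmatch(self.fields[name], str(value)):
                raise ValueError(f"{name}={value!r} does not match {self.fields[name]}")
        return self.template.format(**values)


phonenum = Mould("({area}) {prefix}-{line}",
                 area=r"\d{3}", prefix=r"\d{3}", line=r"\d{4}")
print(phonenum.search("(333) 444-5678"))
print(phonenum.format(area=111, prefix=555, line=4444))
```

With this extra check, a call such as `phonenum.format(area=111, prefix=555, line=444)` raises `ValueError`, because `444` does not match `\d{4}`.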