Python re.sub with a raw-string replacement turns 10 into H and 20 into P

Time: 2018-01-01 01:22:14

Tags: python regex

I'm building some templating into my app. It takes a string like this:

templateString = r'{title} {- author} ({timestamp})'

and replaces the {} placeholders in the template with the corresponding strings (if they exist):

# 10 things on my todo list - Brandon (2018-01-01 00:00:01)

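I want to allow custom characters inside the placeholders so that users can add hyphens, parentheses, or anything else, but only show them when the attribute exists. For example, if the author is empty, you don't want to see: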

# 10 things on my todo list -  (2018-01-01 00:00:01)

You want to see:

# 10 things on my todo list (2018-01-01 00:00:01)

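To do this, I tried using capture groups to grab any literal text between the { and the keyword, and between the keyword and the }: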

titleExp = re.compile(r'\{([^\{]*)title([^\}]*)\}', re.I)
authorExp = re.compile(r'\{([^\{]*)author([^\}]*)\}', re.I)
timestampExp = re.compile(r'\{([^\{]*)timestamp([^\}]*)\}', re.I)

The super weird thing is that when I do the replacement with a raw string (r'\1{}\2'), instead of "10 things on my todo list" I get "H things on my todo list":

templateString = r'{title} {- author} ({timestamp})'
self.title = "10 things on my todo list"
renamed = re.sub(titleExp, (r'\1{}\2' if self.title else '').format(self.title or ''), renamed)
# H things on my todo list ...

Of course, I also tried it without the raw string:

templateString = r'{title} {- author} ({timestamp})'
self.title = "10 things on my todo list"
renamed = re.sub(titleExp, ('\\1{}\\2' if self.title else '').format(self.title or ''), renamed)
# H things on my todo list ...

But the same thing happens.
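Dropping the raw prefix cannot change the result, because both spellings describe the same six-character string. A minimal check (the variable names here are illustrative only):

```python
raw = r'\1{}\2'
escaped = '\\1{}\\2'

# Both literals produce backslash, 1, {, }, backslash, 2.
print(raw == escaped)   # True
print(len(raw))         # 6
```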

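What is going on? Why does the raw string mess this up? I can tell it is specifically related to the digits, and the capture groups may not be behaving properly.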

Full repro:

import re

templateString = r'{title} {- author} ({timestamp})'

titleExp = re.compile(r'\{([^\{]*)title([^\}]*)\}', re.I)
authorExp = re.compile(r'\{([^\{]*)author([^\}]*)\}', re.I)
timestampExp = re.compile(r'\{([^\{]*)timestamp([^\}]*)\}', re.I)

title = "10 things on my todo list"
author = "Brandon"
timestamp = "2018-01-01 00:00:01"

templateString = re.sub(titleExp, r'\1{}\2'.format(title), templateString)
templateString = re.sub(authorExp, r'\1{}\2'.format(author), templateString)
templateString = re.sub(timestampExp, r'\1{}\2'.format(timestamp), templateString)

print(templateString)

# output:
# H things on my todo list - Brandon (P18-01-01 00:00:01)
# ^ ??                                ^ ??

# expected:
# 10 things on my todo list - Brandon (2018-01-01 00:00:01)
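One way to see what is happening is to print the replacement string that re.sub actually receives. The sketch below is only a diagnostic, not part of the original code. After .format(), the group reference \1 sits directly in front of the digits of the value, so the template starts with \110 (or \120 for the timestamp). The re module reads a backslash followed by three octal digits as a character escape, not as group 1 followed by more digits: octal 110 is 'H' and octal 120 is 'P'.

```python
import re

title = "10 things on my todo list"
timestamp = "2018-01-01 00:00:01"

# The replacement templates after .format(), exactly as re.sub sees them.
title_repl = r'\1{}\2'.format(title)
timestamp_repl = r'\1{}\2'.format(timestamp)
print(title_repl)        # \110 things on my todo list\2
print(timestamp_repl)    # \12018-01-01 00:00:01\2

# \110 and \120 are three-digit octal escapes.
print(chr(0o110), chr(0o120))    # H P

# Same pattern as titleExp in the repro.
titleExp = re.compile(r'\{([^\{]*)title([^\}]*)\}', re.I)
print(titleExp.sub(title_repl, '{title}'))    # H things on my todo list
```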

More research:

It seems to be related to the first character of the replacement string:

title = " 10 things on my todo list"
#.       ^ space
author = "Brandon"
timestamp = " 2018-01-01 00:00:01"
#.       ^ space

fixes it... sort of...
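The leading space helps only because it separates \1 from the digits that follow. The same separation is available without touching the data through the \g<1> group syntax, which re.sub accepts in replacement templates and which cannot merge with following digits. A minimal sketch, reusing the names from the repro:

```python
import re

templateString = r'{title} {- author} ({timestamp})'

titleExp = re.compile(r'\{([^\{]*)title([^\}]*)\}', re.I)
authorExp = re.compile(r'\{([^\{]*)author([^\}]*)\}', re.I)
timestampExp = re.compile(r'\{([^\{]*)timestamp([^\}]*)\}', re.I)

title = "10 things on my todo list"
author = "Brandon"
timestamp = "2018-01-01 00:00:01"

# \g<1> is an explicit group reference, so "\g<1>10" means group 1 then "10".
templateString = titleExp.sub(r'\g<1>{}\g<2>'.format(title), templateString)
templateString = authorExp.sub(r'\g<1>{}\g<2>'.format(author), templateString)
templateString = timestampExp.sub(r'\g<1>{}\g<2>'.format(timestamp), templateString)

print(templateString)
# 10 things on my todo list - Brandon (2018-01-01 00:00:01)
```

Note that the value is still interpreted as part of a replacement template, so a value containing a backslash would need escaping or a different approach.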

2 answers:

Answer 0: (score: 0)

You can place the "-" outside of the {author} placeholder and then use re.sub:

import re
templateString = r'{title} - {author} ({timestamp})'
title = "10 things on my todo list"
author = "Brandon"
timestamp = "2018-01-01 00:00:01"
new_data = re.sub(r'-\s(?=\{author)', '', templateString).format(title=title, author=author, timestamp = timestamp) if not author else templateString.format(title=title, author=author, timestamp = timestamp)
print(new_data)

Output:

10 things on my todo list - Brandon (2018-01-01 00:00:01)

When author is empty:

title = "10 things on my todo list"
author = ""
timestamp = "2018-01-01 00:00:01"
new_data = re.sub(r'-\s(?=\{author)', '', templateString).format(title=title, author=author, timestamp = timestamp) if not author else templateString.format(title=title, author=author, timestamp = timestamp)
print(new_data)

Answer 1: (score: 0)

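For what it's worth, if I break the expression apart and avoid inline capture-group references, it behaves correctly. I can use this as a workaround for now, but I would love an explanation of why...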

import re

templateString = r'{title} {- author} ({timestamp})'

titleExp = re.compile(r'\{([^\{]*)title([^\}]*)\}', re.I)
authorExp = re.compile(r'\{([^\{]*)author([^\}]*)\}', re.I)
timestampExp = re.compile(r'\{([^\{]*)timestamp([^\}]*)\}', re.I)

title = "10 things on my todo list"
author = "Brandon"
timestamp = "2018-01-01 00:00:01"

match = re.search(titleExp, templateString)
title = '{}{}{}'.format(match.groups()[0], title, match.groups()[1])
templateString = re.sub(titleExp, title, templateString)

match = re.search(authorExp, templateString)
author = '{}{}{}'.format(match.groups()[0], author, match.groups()[1])
templateString = re.sub(authorExp, author, templateString)

match = re.search(timestampExp, templateString)
timestamp = '{}{}{}'.format(match.groups()[0], timestamp, match.groups()[1])
templateString = re.sub(timestampExp, timestamp, templateString)

print(templateString)

# output:
# 10 things on my todo list - Brandon (2018-01-01 00:00:01)
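This workaround behaves correctly because the captured text is spliced in with str.format, so the value never sits next to a \1 reference. Another option along the same lines is to pass a function as the replacement: re.sub then inserts the function's return value literally, with no escape processing. The sketch below is one possible way to do that, not the original author's code; the names render, fieldExp, and values are hypothetical. It also drops a placeholder together with its decoration when the value is empty.

```python
import re

# One pattern for all fields: optional decoration, field name, optional decoration.
fieldExp = re.compile(r'\{([^{}]*?)(title|author|timestamp)([^{}]*)\}', re.I)

def render(template, values):
    def replace(match):
        prefix, name, suffix = match.groups()
        value = values.get(name.lower(), '')
        # Drop the whole placeholder, decoration included, when the value is empty.
        return prefix + value + suffix if value else ''
    return fieldExp.sub(replace, template)

values = {
    'title': "10 things on my todo list",
    'author': "Brandon",
    'timestamp': "2018-01-01 00:00:01",
}
print(render(r'{title} {- author} ({timestamp})', values))
# 10 things on my todo list - Brandon (2018-01-01 00:00:01)
```

Whitespace outside the braces is left untouched, so with an empty author this template still yields two adjacent spaces between the title and the timestamp. Moving the separating space inside the placeholder, for example {title}{ - author}, would be one way around that.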