Question

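I want to get a list of all the possible keyword arguments that a string template might use in a substitution.

Is there a way to do this other than with re?

I want to do something like this: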

text="$one is a $lonely $number."
keys = get_keys(text) 
# keys = ('one', 'lonely', 'number')

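I'm writing a simple Mad-lib-like program, and I want to perform the template substitution with either string.format or Template strings. I'd like to write the "story" and have my program produce a template file listing all the "keywords" (nouns, verbs, etc.) that a user would need to supply. I know I can do this with regular expressions, but I was wondering whether there is an alternative solution. I'm open to alternatives to string.format and string.Template.

I assumed there would be a solution to this, but I haven't come across one in a quick search. I did find this question, reverse template with python, but it's not really what I'm looking for. It just reaffirms that this can be done with re.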

Edit

I should note that $$ is an escape for '$' and is not a token I want. $$5 should render as "$5".

Answer 1

If it's OK to use string.format, consider the built-in class string.Formatter, which has a parse() method:

>>> from string import Formatter
>>> [i[1] for i in Formatter().parse('Hello {1} {foo}')  if i[1] is not None]
['1', 'foo']

See here for more details.
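As a rough sketch, the one-liner above could be wrapped into a helper like the get_keys the question asks for. The name get_format_keys and the de-duplication step are assumptions made for illustration, not part of the answer:

```python
from string import Formatter

def get_format_keys(template):
    """Return the distinct field names of a str.format template, in order."""
    keys = []
    # parse() yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for literal text that has no replacement field.
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name is not None and field_name not in keys:
            keys.append(field_name)
    return keys

print(get_format_keys("{one} is a {lonely} {number}, said {one}."))
```

Note that with str.format the escape is a doubled brace ({{ and }}) rather than $$, so the story text would have to be written in that style.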

Answer 2

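The string.Template class exposes the pattern it uses as an attribute. You can print the pattern to see the groups it matches (the exact text varies slightly between Python versions):

>>> import string
>>> print(string.Template.pattern.pattern)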

    \$(?:
      (?P<escaped>\$) |   # Escape sequence of two delimiters
      (?P<named>[_a-z][_a-z0-9]*)      |   # delimiter and a Python identifier
      {(?P<braced>[_a-z][_a-z0-9]*)}   |   # delimiter and a braced identifier
      (?P<invalid>)              # Other ill-formed delimiter exprs
    )

For your example:

>>> string.Template.pattern.findall("$one is a $lonely $number.")
[('', 'one', '', ''), ('', 'lonely', '', ''), ('', 'number', '', '')]

As you can see above, if you use braces, as in ${one}, the name goes to the third position of the resulting tuple:

>>> string.Template.pattern.findall('${one} is a $lonely $number.')
[('', '', 'one', ''), ('', 'lonely', '', ''), ('', 'number', '', '')]

So, if you want to get all the keys, you have to do something like this:

>>> [s[1] or s[2] for s in string.Template.pattern.findall('${one} is a $lonely $number.$$') if s[1] or s[2]]
['one', 'lonely', 'number']
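The same idea can be sketched with named groups instead of tuple positions, which may be easier to read. The helper name get_template_keys and the de-duplication are assumptions for illustration only:

```python
from string import Template

def get_template_keys(text):
    """Return the distinct $-placeholders of a string.Template, in order."""
    keys = []
    for match in Template.pattern.finditer(text):
        # "$$" lands in the "escaped" group, so neither "named" nor
        # "braced" is set for it and the match is skipped.
        name = match.group("named") or match.group("braced")
        if name and name not in keys:
            keys.append(name)
    return keys

print(get_template_keys("${one} is a $lonely $number. $one costs $$5"))
```

On Python 3.11 and later, Template(text).get_identifiers() should give a similar result without touching the pattern directly.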

Answer 3

You could render it once with an instrumented dictionary that records the lookups, or with a defaultdict, and then check what it asked for.

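from collections import defaultdict

text = "%(one)s is a %(lonely)s %(number)s."  # %-style placeholders, as text % d requires
d = defaultdict(lambda: "bogus")  # the default factory must be a callable
text % d                          # every missing key that is looked up gets stored in d
keys = list(d.keys())             # ['one', 'lonely', 'number']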
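A minimal sketch of the "instrumented dictionary" idea applied to a $-style string.Template, as used in the question. RecordingDict is a hypothetical helper invented for this example, and "bogus" is just a placeholder value so that rendering can finish:

```python
from string import Template

class RecordingDict(dict):
    """Hypothetical helper: remembers every missing key that is looked up."""

    def __init__(self):
        super().__init__()
        self.requested = []

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        if key not in self.requested:
            self.requested.append(key)
        return "bogus"  # placeholder so the substitution can continue

text = "$one is a $lonely $number. It costs $$5."
recorder = RecordingDict()
rendered = Template(text).substitute(recorder)
print(recorder.requested)
print(rendered)
```

Because the template engine itself does the parsing, $$ is handled as an escape and never reaches the dictionary.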

Answer 4

Try str.strip() along with str.split():

In [54]: import string

In [55]: text="$one is a $lonely $number."

In [56]: [x.strip(string.punctuation) for x in text.split() if x.startswith("$")]
Out[56]: ['one', 'lonely', 'number']

Answer 5

You could try:

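def get_keys(s):
    # Keep the whitespace-separated tokens that start with "$", minus the "$".
    # Trailing punctuation stays attached, e.g. "$number." yields "number.".
    return [token[1:] for token in s.split() if token.startswith("$")]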

Answer 6

Why do you want to avoid regular expressions? They work quite well for this:

>>> import re
>>> re.findall(r'\$[a-z]+', "$one is a $lonely $number.")
['$one', '$lonely', '$number']

For templating, check out re.sub. It can be called with a callback to do what you want.
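A hedged sketch of what the re.sub callback approach might look like. The TOKEN pattern, the render function, and the seen list are all assumptions made for this example. The pattern treats $$ as an escape, as the question requires:

```python
import re

# Assumed pattern: "$$" as an escape, or "$" followed by an identifier.
TOKEN = re.compile(r"\$(?:(?P<escaped>\$)|(?P<name>[_a-zA-Z][_a-zA-Z0-9]*))")

def render(text, values, seen=None):
    """Substitute $name tokens from values; optionally record names in seen."""
    def replace(match):
        if match.group("escaped"):
            return "$"  # "$$" renders as a literal dollar sign
        name = match.group("name")
        if seen is not None and name not in seen:
            seen.append(name)
        # Leave unknown tokens untouched instead of raising.
        return str(values.get(name, match.group(0)))
    return TOKEN.sub(replace, text)

seen = []
result = render("$one is a $lonely $number. $$5", {"one": "One", "lonely": "happy"}, seen)
print(result)
print(seen)
```

Calling render with an empty dictionary collects the keys without changing any placeholder, which covers the "generate a template file" step from the question.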

Answer 7

>>> import string
>>> get_keys = lambda s: [el.strip(string.punctuation)
...                       for el in s.split() if el.startswith('$')]
>>> get_keys("$one is a $lonely $number.")
['one', 'lonely', 'number']
