Question

Is there an elegant way to get the names of the named %s-style variables in a string object? Something like this:

string = '%(a)s and %(b)s are friends.'
names = get_names(string)  # ['a', 'b']

Known alternatives:

Parse the names with a regular expression, e.g.:

import re
names = re.findall(r'%\((\w+)\)[sdf]', string)  # ['a', 'b']

Use the .format()-compatible format and Formatter().parse(string).

How to get the variable names from the string for the format() method
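
For reference, the Formatter().parse approach mentioned above can be sketched roughly as follows. The sample string is only an illustration; parse() yields (literal_text, field_name, format_spec, conversion) tuples, and field_name is None for trailing literal text.

```python
from string import Formatter

# Illustrative .format()-style string (not from the original question).
fmt = '{a} and {b} are friends.'

# Keep only the tuples that actually carry a field name.
names = [field for _, field, _, _ in Formatter().parse(fmt) if field is not None]
print(names)  # ['a', 'b']
```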

But what about a string with %s-style variables?

PS: Python 2.7

Answer 1

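To answer this question, you need to define "elegant". Several factors may be worth considering:

Is the code short, easy to remember, easy to write, and self-explanatory?
Does it reuse the underlying logic (i.e. follow the DRY principle)?
Does it implement exactly the same parsing logic?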

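Unfortunately, "%" formatting for strings is implemented in the C routine "PyString_Format" in stringobject.c. This routine provides no API or hook that gives access to a parsed form of the format string; it simply builds the result as it parses the format string. Any solution therefore has to duplicate the parsing logic of the C routine. This means DRY is not followed, and any such solution is exposed to breakage if the format specification changes.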

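The parsing algorithm in PyString_Format includes a fair amount of complexity, including handling nested parentheses in key names, so it cannot be fully implemented with a regular expression, nor with string "split()". Short of copying the C code from PyString_Format and converting it to Python, I do not see any easy way to extract the mapping key names correctly in all cases.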

So my conclusion is that there is no "elegant" way to obtain the names of the mapping keys of a Python 2.7 "%" format string.
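
To illustrate the nested-parenthesis handling described above, here is a minimal hand-written scanner. It is only a sketch, not a port of PyString_Format: the function name mapping_keys is hypothetical, and the treatment of flags, width, precision and length modifiers is a simplifying assumption (they are skipped rather than validated). It is written to run on Python 2.7 as well as Python 3.

```python
def mapping_keys(fmt):
    """Collect mapping keys from a %-format string, counting nested parentheses.

    Hypothetical helper; a simplified sketch, not the CPython algorithm itself.
    """
    keys = []
    i, n = 0, len(fmt)
    while i < n:
        if fmt[i] != '%':
            i += 1
            continue
        i += 1  # step past the '%'
        if i < n and fmt[i] == '(':
            depth, start = 1, i + 1
            i += 1
            # Walk forward until the opening parenthesis is balanced.
            while i < n and depth > 0:
                if fmt[i] == '(':
                    depth += 1
                elif fmt[i] == ')':
                    depth -= 1
                i += 1
            if depth > 0:
                raise ValueError('incomplete format key')
            keys.append(fmt[start:i - 1])
        # Assumption: skip flags, width, precision and length modifiers unchecked.
        while i < n and fmt[i] in '-+ #0123456789.*hlL':
            i += 1
        i += 1  # the conversion character, or the second '%' of '%%'
    return keys
```

With this sketch, a key containing balanced parentheses such as '%(should_match(but does not))s' is returned whole, and '%%' is treated as a literal percent sign.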

The following code uses a regular expression to provide a partial solution that covers the most common usage:

import re
class StringFormattingParser(object):
    __matcher = re.compile(r'(?<!%)%\(([^)]+)\)[-# +0-9.hlL]*[diouxXeEfFgGcrs]')
    @classmethod
    def getKeyNames(klass, formatString):
        return klass.__matcher.findall(formatString)

# Demonstration of use with some sample format strings
for value in [
    '%(a)s and %(b)s are friends.',
    '%%(nomatch)i',
    '%%',
    'Another %(matched)+4.5f%d%% example',
    '(%(should_match(but does not))s',
    ]:
    print StringFormattingParser.getKeyNames(value)

# Note the following prints out "really does match"!
print '%(should_match(but does not))s' % {'should_match(but does not)': 'really does match'}

P.S. DRY = Don't Repeat Yourself (https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)

Answer 2

You could also do this:

[y[0] for y in [x.split(')') for x in s.split('%(')] if len(y)>1]
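
A quick way to see what this one-liner does is to wrap it in a function and try a couple of inputs. The name names_by_split and the second sample string are made up for this illustration. Note that the split-based approach does not look at what precedes '%(', so an escaped '%%(...)' is reported as well.

```python
def names_by_split(s):
    # The one-liner from this answer, wrapped in a function (hypothetical name).
    return [y[0] for y in [x.split(')') for x in s.split('%(')] if len(y) > 1]

print(names_by_split('%(a)s and %(b)s are friends.'))  # ['a', 'b']
# '%%(literal)s' is an escaped percent sign, yet the name is still picked up.
print(names_by_split('100%% of %%(literal)s'))          # ['literal']
```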

Answer 3

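Not sure whether this qualifies as elegant, but here is a short function that parses out the names. There is no error checking, so it will fail on malformed strings.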

def get_names(s):
    i = s.find('%')
    while 0 <= i < len(s) - 3:
        if s[i+1] == '(':
            yield(s[i+2:s.find(')', i)])
        i = s.find('%', i+2)

string = 'abd %(one) %%(two) 99 %%%(three)'
list(get_names(string))  #=> ['one', 'three']

Answer 4

Alternatively, you can reduce this %-formatting task to the Formatter solution.

>>> import re
>>> from string import Formatter
>>> 
>>> string = '%(a)s and %(b)s are friends.'
>>> 
>>> string = re.sub(r'((?<!%)%(\((\w+)\)s))', r'{\g<3>}', string)
>>> 
>>> tuple(fn[1] for fn in Formatter().parse(string) if fn[1] is not None)
('a', 'b')
>>>

In this case, I think you can use both formatting styles.

The regular expression to use depends on what you want.

>>> re.sub(r'((?<!%)%(\((\w+)\)s))', r'{\g<3>}', '%(a)s and %(b)s are %(c)s friends.')
'{a} and {b} are {c} friends.'
>>> re.sub(r'((?<!%)%(\((\w+)\)s))', r'{\g<3>}', '%(a)s and %(b)s are %%(c)s friends.')
'{a} and {b} are %%(c)s friends.'
>>> re.sub(r'((?<!%)%(\((\w+)\)s))', r'{\g<3>}', '%(a)s and %(b)s are %%%(c)s friends.')
'{a} and {b} are %%%(c)s friends.'
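
Putting the two steps of this answer together, a small helper might look like the sketch below. The function name names_via_formatter is hypothetical, and the choices to accept only the s, d and f conversions and to escape literal braces before conversion are assumptions made for this example, not part of the answer above.

```python
import re
from string import Formatter

def names_via_formatter(fmt):
    """Convert %(name)s-style fields to {name} and let Formatter list them.

    Hypothetical helper; only plain %(name)s, %(name)d and %(name)f are handled.
    """
    # Assumption: escape literal braces so Formatter does not read them as fields.
    escaped = fmt.replace('{', '{{').replace('}', '}}')
    converted = re.sub(r'(?<!%)%\((\w+)\)[sdf]', r'{\1}', escaped)
    return [name for _, name, _, _ in Formatter().parse(converted)
            if name is not None]

print(names_via_formatter('%(first)s and %(second)d are friends.'))
```

As with the regular expression above, the negative lookbehind means that a field directly preceded by any '%' is left alone, including the '%%%(c)s' case.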

How to get the names of named variables from a Python string
