Question

len(dir(__builtins__)) outputs 151.

I categorized them in two ways, by count and by name string:

# count the builtins by the type of their value
all_builtins = dir(__builtins__)
number_dict = {}
for i in all_builtins:
    if type(eval(i)) not in number_dict:
        number_dict[type(eval(i))] = 1
    else:
        number_dict[type(eval(i))] += 1
# resulting number_dict
{<class 'type'>: 92, <class 'ellipsis'>: 1, ....}

# collect the builtin names per type as a comma-separated string
string_dict = {}
for i in all_builtins:
    if type(eval(i)) not in string_dict:
        string_dict[type(eval(i))] = i
    else:
        string_dict[type(eval(i))] += "," + i
# resulting string_dict
string_dict = {...<class 'str'>: '__name__', <class '_sitebuiltins._Printer'>: 'copyright,credits,license', <class '_sitebuiltins.Quitter'>: 'exit,quit', <class '_sitebuiltins._Helper'>: 'help'}

How can I categorize the builtins using lists, or in some more advanced, Pythonic way?

Answer 1

For counting, use Counter with getattr instead of eval, so that the same code also works for other module objects:

>>> import collections
>>> collections.Counter(type(getattr(__builtins__, name)) for name in dir(__builtins__))
Counter({<type 'type'>: 76, <type 'builtin_function_or_method'>: 52, <class 'site._Printer'>: 3, <type 'bool'>: 3, <type 'str'>: 2, <class 'site.Quitter'>: 2, <type 'NoneType'>: 2, <class 'site._Helper'>: 1, <type 'NotImplementedType'>: 1, <type 'ellipsis'>: 1})
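
The run above is from Python 2. On Python 3 the same count can be sketched with the builtins module, which avoids relying on __builtins__ (in CPython that name is a module in __main__ but a plain dict inside imported modules). The helper name count_types is hypothetical and not part of the original answer, and the exact counts depend on the interpreter and environment.

```python
import builtins
import collections


def count_types(module):
    """Count the attributes of `module` by the type of each attribute's value."""
    return collections.Counter(type(getattr(module, name)) for name in dir(module))


counts = count_types(builtins)

# Show the three most common value types; the numbers vary between interpreters.
for value_type, n in counts.most_common(3):
    print(value_type.__name__, n)
```

Because count_types takes the module as a parameter, it can be pointed at any other module (for example math) without changes.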

For the second dictionary, use a function that accumulates a sequence of (k, v) pairs into a dict of k: [v]:

def accumulate(kv):
    d = {}
    for k,v in kv:
        d.setdefault(k,[]).append(v)
    return d

accumulate((type(getattr(__builtins__, name)), name) for name in dir(__builtins__))

Here is a sample run:

>>> accumulate((type(getattr(__builtins__, name)), name) for name in dir(__builtins__))
{<class 'site._Helper'>: ['help'], <type 'str'>: ['__doc__', '__name__'], <class 'site.Quitter'>: ['exit', 'quit'], <type 'type'>: ['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException', 'BufferError', 'BytesWarning', 'DeprecationWarning', 'EOFError', 'EnvironmentError', 'Exception', 'FloatingPointError', 'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError', 'ImportWarning', 'IndentationError', 'IndexError', 'KeyError', 'KeyboardInterrupt', 'LookupError', 'MemoryError', 'NameError', 'NotImplementedError', 'OSError', 'OverflowError', 'PendingDeprecationWarning', 'ReferenceError', 'RuntimeError', 'RuntimeWarning', 'StandardError', 'StopIteration', 'SyntaxError', 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'TypeError', 'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError', 'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning', 'ValueError', 'Warning', 'ZeroDivisionError', 'basestring', 'bool', 'buffer', 'bytearray', 'bytes', 'classmethod', 'complex', 'dict', 'enumerate', 'file', 'float', 'frozenset', 'int', 'list', 'long', 'memoryview', 'object', 'property', 'reversed', 'set', 'slice', 'staticmethod', 'str', 'super', 'tuple', 'type', 'unicode', 'xrange'], <type 'NotImplementedType'>: ['NotImplemented'], <class 'site._Printer'>: ['copyright', 'credits', 'license'], <type 'bool'>: ['False', 'True', '__debug__'], <type 'NoneType'>: ['None', '__package__'], <type 'ellipsis'>: ['Ellipsis'], <type 'builtin_function_or_method'>: ['__import__', 'abs', 'all', 'any', 'apply', 'bin', 'callable', 'chr', 'cmp', 'coerce', 'compile', 'delattr', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'format', 'getattr', 'globals', 'hasattr', 'hash', 'hex', 'id', 'input', 'intern', 'isinstance', 'issubclass', 'iter', 'len', 'locals', 'map', 'max', 'min', 'next', 'oct', 'open', 'ord', 'pow', 'print', 'range', 'raw_input', 'reduce', 'reload', 'repr', 'round', 'setattr', 'sorted', 'sum', 'unichr', 'vars', 'zip'], <class 'collections.Counter'>: ['_']}

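The issue is not which object eval is used with, but how the name gets looked up. getattr treats its argument as nothing more than a name, whereas eval would process 1+evil() as an expression containing a call, not as the name of an attribute.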

Consider what eval would do after setattr(__builtins__, '1+evil()', '') has been called:

>>> setattr(__builtins__, '1+evil()', '')
>>> getattr(__builtins__, '1+evil()')
''
>>> eval('1+evil()')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'evil' is not defined

If an evil function existed, it would be called. Since no such function is defined, we get a NameError instead.
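
To make the difference concrete, here is a minimal sketch in which an evil function does exist. Everything in it is illustrative: evil, calls, and the throwaway namespace ns are hypothetical names, and a SimpleNamespace stands in for __builtins__ so the real builtins are not modified.

```python
import types

calls = []


def evil():
    """Hypothetical stand-in for code you would not want to run."""
    calls.append("called")
    return 41


# A throwaway namespace object, so the real builtins are left untouched.
ns = types.SimpleNamespace()
setattr(ns, "1+evil()", "just an attribute")

# getattr treats the string purely as an attribute name; nothing is executed.
looked_up = getattr(ns, "1+evil()")
calls_after_getattr = list(calls)

# eval parses the same string as an expression and therefore calls evil().
# The globals dict is passed explicitly so the lookup of `evil` is unambiguous.
result = eval("1+evil()", {"evil": evil})

print(looked_up, calls_after_getattr, result, calls)
```

After the getattr call the calls list is still empty; only the eval call runs the function.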

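eval works here only because every name in __builtins__ can be resolved from the global scope. Replace __builtins__ with any other module and eval will fail to resolve the names, unless you pass vars(that_module) to eval as its globals argument.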
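
A short sketch of that last point, using math as an arbitrary example module (the choice of module and the variable names are assumptions made for illustration). The namespace is copied with dict(...) because eval inserts a __builtins__ key into the globals dict it is given.

```python
import math

# With an empty namespace, eval cannot resolve a name that lives only in math.
try:
    eval("pi", {})
    resolved_without_namespace = True
except NameError:
    resolved_without_namespace = False

# Passing a copy of vars(math) as the globals argument makes the name visible.
value_via_eval = eval("pi", dict(vars(math)))

# getattr needs no such workaround and never evaluates the string.
value_via_getattr = getattr(math, "pi")

print(resolved_without_namespace, value_via_eval == value_via_getattr)
```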

Answer 2

Use Counter.

In [2]: from collections import Counter

In [3]: Counter([type(eval(x)) for x in (dir(__builtins__))])
Out[3]: 
Counter({type: 92,
         ellipsis: 1,
         bool: 4,
         NoneType: 4,
         NotImplementedType: 1,
         builtin_function_or_method: 42,
         str: 2,
         _sitebuiltins._Printer: 3,
         function: 1,
         method: 1,
         _sitebuiltins._Helper: 1})

Answer 3

You can use defaultdict to initialize the dictionaries.

from collections import defaultdict

number_dict = defaultdict(int)
string_dict = defaultdict(list)
for foo in dir(__builtins__):
    foo_type = type(eval(foo))  # or type(getattr(__builtins__, foo)) per @DanD (+1 to Dan)
    number_dict[foo_type] += 1
    string_dict[foo_type].append(foo)

>>> dict(number_dict)
{type: 92,
 ellipsis: 1,
 bool: 4,
 NoneType: 4,
 NotImplementedType: 1,
 builtin_function_or_method: 41,
 str: 2,
 _sitebuiltins._Printer: 3,
 function: 1,
 method: 2,
 _sitebuiltins._Helper: 1}

>>> dict(string_dict)
{type: ['ArithmeticError',
  'AssertionError',
  'AttributeError',
  'BaseException',
  'BlockingIOError',
  ...}

If you want string_dict to hold a single string of all the names for each type instead of a list, just add this at the end:

string_dict = {foo_type: ", ".join(string_dict[foo_type]) for foo_type in string_dict}

Answer 4

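You can use groupby together with sorted to get the same effect as the defaultdict-and-append loop. The examples in this answer use Python 2 syntax (string.letters, tuple-unpacking lambda parameters, iteritems).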

import itertools

import string, pprint  # For example

pprint.pprint({k: list(v)
               for k, v in itertools.groupby(
                   sorted(string.letters, key=string.lower),
                   key=string.lower)})

gives

{'a': ['a', 'A'],
 'b': ['b', 'B'],
 'c': ['c', 'C'],
 'd': ['d', 'D'],
 'e': ['e', 'E'],
 'f': ['f', 'F'],
 'g': ['g', 'G'],
 'h': ['h', 'H'],
 'i': ['i', 'I'],
 'j': ['j', 'J'],
 'k': ['k', 'K'],
 'l': ['l', 'L'],
 'm': ['m', 'M'],
 'n': ['n', 'N'],
 'o': ['o', 'O'],
 'p': ['p', 'P'],
 'q': ['q', 'Q'],
 'r': ['r', 'R'],
 's': ['s', 'S'],
 't': ['t', 'T'],
 'u': ['u', 'U'],
 'v': ['v', 'V'],
 'w': ['w', 'W'],
 'x': ['x', 'X'],
 'y': ['y', 'Y'],
 'z': ['z', 'Z']}

For your specific example:

get_type = lambda (k, v): type(v)
builtins = sorted(vars(__builtins__).iteritems(), key=get_type)
string_dict = {k: list(v) for k, v in itertools.groupby(builtins, key=get_type)}
pprint.pprint({k: len(v) for k, v in string_dict.iteritems()})

gives

{<type 'bool'>: 3,
 <type 'list'>: 1,
 <type 'builtin_function_or_method'>: 52,
 <type 'NoneType'>: 2,
 <type 'NotImplementedType'>: 1,
 <type 'ellipsis'>: 1,
 <type 'str'>: 2,
 <type 'type'>: 76,
 <class 'site._Printer'>: 3,
 <class 'site._Helper'>: 1,
 <class 'site.Quitter'>: 2}
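
The snippet above does not run on Python 3: tuple parameters in lambdas and iteritems were removed, and type objects can no longer be ordered, so sorting by type(v) raises TypeError. A possible Python 3 adaptation is sketched below. It assumes that grouping by the type's qualified name is acceptable (two distinct types sharing a module and name would be merged), and the names type_name, names_by_type, and counts are hypothetical.

```python
import builtins
import itertools


def type_name(item):
    """Sort and group key: 'module.qualname' of the value's type (a sortable string)."""
    _, value = item
    value_type = type(value)
    return f"{value_type.__module__}.{value_type.__qualname__}"


# groupby only merges adjacent items, so the pairs must be sorted by the same key first.
pairs = sorted(vars(builtins).items(), key=type_name)
names_by_type = {
    key: [name for name, _ in group]
    for key, group in itertools.groupby(pairs, key=type_name)
}
counts = {key: len(names) for key, names in names_by_type.items()}

for key in sorted(counts):
    print(key, counts[key])
```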

Python 3: How can I categorize all 151 builtins in a single line of code?
